Multi-Modal Hate Speech Detection in Hinglish

A multi-modal hate speech detection system for Hinglish (Hindi-English code-mixed language). This Streamlit-based application allows users to detect hate speech in text, audio, video, and images using deep learning, NLP, OCR, and speech-to-text.

What is Hate Speech Detection?

Hate speech detection is the task of classifying content (text, audio, video, or images) as hate speech or non-hate speech. This project focuses on Hinglish, a code-mixed blend of Hindi and English, and covers all four modalities using a fine-tuned BERT classifier together with OCR and speech-to-text.

Key Features

- Multi-Modal Detection: Supports text, audio, video, and image hate speech detection in Hinglish.
- Real-Time Inference: Fast, interactive predictions via a modern Streamlit web interface.
- Robust Preprocessing: Advanced text cleaning, OCR for images, and speech-to-text for audio/video.
- Transfer Learning: Utilizes a fine-tuned BERT model for high-accuracy classification.
- User-Friendly UI: Intuitive navigation and clear results for all input types.
- Modular Codebase: Easily extendable for new modalities or languages.
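
To give a feel for the text cleaning mentioned under Robust Preprocessing, here is a minimal sketch of a typical social-media cleanup step. The function name `clean_hinglish_text` and the specific rules (lowercasing, removing URLs and @mentions, dropping the `#` from hashtags, collapsing whitespace) are assumptions for illustration, not the project's actual pipeline.

```python
import re

# Hypothetical cleanup rules; the project's real preprocessing may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")


def clean_hinglish_text(text: str) -> str:
    """Normalize a raw post before it is passed to a tokenizer."""
    text = text.lower()               # fold case (Devanagari is unaffected)
    text = URL_RE.sub(" ", text)      # drop links
    text = MENTION_RE.sub(" ", text)  # drop @mentions
    text = text.replace("#", " ")     # keep the hashtag word, drop the symbol
    return " ".join(text.split())     # collapse runs of whitespace


print(clean_hinglish_text("@Rahul Yeh   VIDEO dekho https://example.com/x #Funny"))
```

Punctuation and Devanagari characters are deliberately left untouched here, since a BERT tokenizer handles them on its own.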

💻 Prerequisites

Python 3.8+
pip
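
Beyond Python itself, some libraries in the Technology Stack usually depend on external tools: pytesseract wraps the Tesseract OCR binary, and pydub relies on ffmpeg to decode most non-WAV audio. The following sketch checks for them before you run the app. It assumes the app uses these libraries in the usual way; the function name `check_prerequisites` is hypothetical.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)
# Assumed external tools: Tesseract for pytesseract, ffmpeg for pydub.
EXTERNAL_TOOLS = ("tesseract", "ffmpeg")


def check_prerequisites() -> dict:
    """Return a name -> bool report of what is available on this machine."""
    report = {"python": sys.version_info[:2] >= MIN_PYTHON}
    for tool in EXTERNAL_TOOLS:
        # shutil.which returns None when the executable is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


for name, ok in check_prerequisites().items():
    print(f"{name}: {'ok' if ok else 'MISSING'}")
```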

🚀 Installation

Clone the repository:

git clone https://github.com/rahul-jaiswar-git/Hate-Shield-AI.git
cd Hate-Shield-AI

Install dependencies:
```
pip install -r requirements.txt
```
Download the model files:
- Download all model files from the Hugging Face repository rahuljaiswarofficial/Hinglish-based-Hate-Speech-detection-model-v1.
- Place all downloaded files (e.g., tf_model.h5, config.json, tokenizer_config.json, vocab.txt, special_tokens_map.json, label_encoder.pkl) into the hate_speech_model/ directory.

⬇️ Download the Model

You can download the pre-trained model and all required files from Hugging Face:

https://huggingface.co/rahuljaiswarofficial/Hinglish-based-Hate-Speech-detection-model-v1

After downloading, place all files in the hate_speech_model/ directory before running the app.
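
A quick way to confirm the files landed in the right place is a check like the one below. This is a sketch only: the file list is copied from the examples in the Installation step and may not be exhaustive, and the helper name `missing_model_files` is hypothetical.

```python
from pathlib import Path

MODEL_DIR = Path("hate_speech_model")
# File names taken from the examples in the Installation step;
# the actual set in the Hugging Face repository may differ.
EXPECTED_FILES = [
    "tf_model.h5",
    "config.json",
    "tokenizer_config.json",
    "vocab.txt",
    "special_tokens_map.json",
    "label_encoder.pkl",
]


def missing_model_files(model_dir=MODEL_DIR, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in expected if not (model_dir / name).is_file()]


missing = missing_model_files()
if missing:
    print("Missing from hate_speech_model/:", ", ".join(missing))
else:
    print("All expected model files found.")
```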

☕ Usage

Run the Streamlit app:

streamlit run app.py

- Use the sidebar to navigate between Home, Model Check, and About Us.
- In Model Check, select the desired classification task (Text, Audio, Video, or Image).
- Upload a file or enter text as prompted.

🛠️ Technology Stack

User Interface:
- Streamlit (web app framework)
- Pillow (image handling)
Machine Learning & NLP:
- TensorFlow (deep learning backend)
- HuggingFace Transformers (BERT model)
- joblib (model serialization)
- numpy (numerical operations)
Audio & Speech Processing:
- SpeechRecognition (speech-to-text)
- pydub, librosa, soundfile (audio file handling)
Image & Video Processing:
- OpenCV (image/video processing)
- pytesseract (OCR)
Utilities:
- requests (API calls)
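
To make the classification step more concrete: a BERT sequence classifier produces one raw score (logit) per class, and those scores are turned into probabilities and then into a readable label. The sketch below shows that last step in plain Python. The label names are hypothetical; in this project the mapping presumably comes from label_encoder.pkl, and the logits would come from the TensorFlow model rather than being hard-coded.

```python
import math

# Hypothetical class names; the real ones are stored in label_encoder.pkl.
LABELS = ["non-hate", "hate"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_prediction(logits, labels=LABELS):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


label, confidence = decode_prediction([-1.2, 2.3])  # made-up logits
print(f"{label} ({confidence:.2f})")
```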

🏗️ File Structure

project/
├── app.py                  # Main Streamlit app
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
├── styles.css              # Custom styles for Streamlit
├── hate_speech_model/      # Model files and label encoder
├── Classifier/             # Classification modules for each modality
│   ├── text_classification.py
│   ├── audio_classification.py
│   ├── video_classification.py
│   └── image_classification.py
├── Frontend/               # Images and GIFs for UI
│   ├── models.gif
│   ├── about us.gif
│   └── Hate.jpg
├── Dataset/                # (Optional) Data for training/testing
├── Eg Data/                # (Optional) Example data for demo
└── ...
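
Because each modality has its own module under Classifier/, the task selected in the UI can be mapped to a module by name. The sketch below illustrates that idea; the helper `classifier_module_name` is hypothetical, and the functions each module actually exposes are not documented here.

```python
# Modalities listed in the Usage section; module names follow the tree above.
MODALITIES = ("text", "audio", "video", "image")


def classifier_module_name(modality: str) -> str:
    """Map a UI choice such as 'Audio' to its module path under Classifier/."""
    key = modality.strip().lower()
    if key not in MODALITIES:
        raise ValueError(f"Unsupported modality: {modality!r}")
    return f"Classifier.{key}_classification"


print(classifier_module_name("Audio"))
```

When run from the project root, the returned name could be passed to `importlib.import_module` to load the matching module. Supporting a new modality would then mean adding one module and one entry to `MODALITIES`.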

🤝 Contributors

Rahul Jaiswar

Special Thanks: Open-source community, HuggingFace, and all referenced libraries

📝 License

This project is licensed under the MIT License.
