IntelliHR - RAG-Enabled AI Assistant with MCP Architecture
An intelligent HR assistant that orchestrates multiple data sources using Model Context Protocol (MCP) and Retrieval-Augmented Generation (RAG) to provide accurate, context-aware responses.
Built as a capstone project to demonstrate practical AI/ML system design and full-stack development.
Table of Contents:
- Highlights
- Architecture
- Demo Screenshots
- Tech Stack
- Features
- Installation
- Project Structure
- Future Roadmap
- Contribute
Highlights:
- Smart Orchestration - The LLM routes each query to the appropriate data source
- RAG Integration - Semantic search over policy documents keeps answers grounded in retrieved text
- Beautiful UI - Modern glassmorphism design with a dark theme
- Fast Responses - Sub-2-second query processing with Groq's llama-3.3-70b
- Context-Aware - Reads and respects document confidentiality markers
- Real-time Analytics - Tracks queries, tools used, and system performance
Architecture:
┌─────────────────────────────────────────────┐
│ User Interface (Streamlit) │
│ Glassmorphism UI • Dark Theme • Analytics │
└────────────────┬────────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ Orchestrator (Groq LLM) │
│ llama-3.3-70b-versatile │
│ • Function Calling │
│ • Tool Selection │
│ • Response Generation │
└────────────────┬────────────────────────────┘
│
┌────────┴─────────┐
│ MCP Framework │
└────────┬─────────┘
│
┌────────────┼────────────┐
│ │ │
▼ ▼ ▼
┌─────────┐ ┌──────────┐ ┌─────────┐
│Database │ │Filesystem│ │ RAG │
│ Server │ │ Server │ │ Server │
├─────────┤ ├──────────┤ ├─────────┤
│ SQLite │ │Text Files│ │ChromaDB │
│Employee │ │Announce- │ │Policies │
│Records │ │ments │ │(PDFs) │
└─────────┘ └──────────┘ └─────────┘
How it works:
- User Query → Enters question in natural language
- Orchestrator → LLM analyzes and selects appropriate tools
- MCP Servers → Execute queries on respective data sources
- Response → LLM combines results into coherent answer
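The routing step above can be sketched as a small dispatch loop. This is illustrative only: in the real system the Groq LLM chooses a tool via function calling, while here a keyword heuristic stands in for that decision, and the tool and server names are assumptions rather than the project's actual API.

```python
# Minimal sketch of the orchestrator's tool-dispatch step.
# A keyword heuristic stands in for the LLM's function-call decision.

def search_employees(query: str) -> str:        # hypothetical tool names
    return f"[database_server] results for {query!r}"

def read_announcements(query: str) -> str:
    return f"[filesystem_server] announcements matching {query!r}"

def search_policies(query: str) -> str:
    return f"[rag_server] policy chunks relevant to {query!r}"

TOOLS = {
    "employee": search_employees,
    "announcement": read_announcements,
    "policy": search_policies,
}

def route(query: str) -> str:
    """Pick the first tool whose keyword appears in the query."""
    for keyword, tool in TOOLS.items():
        if keyword in query.lower():
            return tool(query)
    return "No tool matched; answer from the LLM directly."

print(route("What is the leave policy?"))
```

In the actual orchestrator, the tool result would then be fed back to the LLM to generate the final natural-language answer.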
Demo Screenshots:
Main Page:
You may select questions from quick actions or ask your own:
Querying:
More Querying:
Analytics Dashboard:
Tech Stack:
Core Technologies:
- Python 3.11+ - Primary language
- Streamlit - Web UI framework
- Groq API - LLM inference (llama-3.3-70b-versatile)
- SQLite - Employee database
- ChromaDB - Vector database for RAG
- LangChain - RAG pipeline orchestration
Key Libraries:
- langchain - Document processing and RAG
- langchain-groq - Groq LLM integration
- langchain-huggingface - Embeddings (all-MiniLM-L6-v2)
- chromadb - Vector storage
- pypdf - PDF document loading
- asyncio - Asynchronous operations
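The retrieval half of the RAG pipeline works roughly as follows. This standalone sketch substitutes a simple token-overlap score for the real all-MiniLM-L6-v2 embeddings and ChromaDB store, and the policy snippets are made up for illustration:

```python
# Illustrative retrieval step of a RAG pipeline: embed the query and each
# chunk, score by cosine similarity, return the top-k chunks. Token counts
# stand in for the project's real sentence-transformer embeddings.
from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [  # stand-ins for chunked policy PDFs
    "Employees accrue 18 days of paid leave per year.",
    "Salary is credited on the last working day of each month.",
    "POSH complaints are handled by the internal committee.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    scored = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return scored[:k]

print(retrieve("How many paid leave days do I get?"))
```

The retrieved chunks are then passed to the LLM as context, which is what keeps answers tied to the source documents.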
Design:
- Custom CSS with glassmorphism
- Responsive design
- Dark theme with purple gradient
- SVG architecture diagrams
Current Features (v1.0):
✅ Multi-Source Querying
- Employee database search (name, department, ID)
- Announcement retrieval (holidays, events, updates)
- Policy document search with RAG
✅ Intelligent Routing
- LLM automatically selects correct data source
- Multi-tool queries supported
- Context-aware responses
✅ Professional UI
- Real-time chat interface
- Session statistics tracking
- Tool usage visualization
- Query history with expandable details
✅ Data Sources
- 10 employee records
- 4 announcement files
- 3 policy documents (Leave, POSH, Salary)
System Capabilities:
- 9 tools across 3 MCP servers
- Sub-2s average response time
- Answers grounded in retrieved documents via RAG
- Conversation memory for follow-up questions
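The conversation memory can be as simple as a rolling window of recent turns that is prepended to each new prompt. This sketch assumes a fixed window size and hypothetical dialogue; it is not the project's actual implementation:

```python
# Rolling-window conversation memory: keep the last N (role, text) turns
# and render them as context for the next LLM call.
from collections import deque

class ConversationMemory:
    def __init__(self, max_turns: int = 5):
        self.turns = deque(maxlen=max_turns)  # oldest turns drop off automatically

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def as_context(self) -> str:
        return "\n".join(f"{role}: {text}" for role, text in self.turns)

memory = ConversationMemory(max_turns=2)
memory.add("user", "Who heads Engineering?")
memory.add("assistant", "Priya Sharma heads Engineering.")
memory.add("user", "What is her employee ID?")  # first turn is evicted
print(memory.as_context())
```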
Installation:
Prerequisites:
- Python 3.11 or higher
- Groq API key (a free key is available from the Groq console)
Step 1: Clone the repository
git clone https://github.com/yourusername/intellihr-rag-mcp.git
cd intellihr-rag-mcp
Step 2: Create a virtual environment
python -m venv .venv
# Activate
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate
Step 3: Install dependencies
pip install -r requirements.txt
Step 4: Set up environment variables
# Windows PowerShell
$env:GROQ_API_KEY="your-groq-api-key-here"
# Linux/Mac
export GROQ_API_KEY="your-groq-api-key-here"
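As an optional sanity check (not part of the project's scripts), you can confirm the key is visible to Python before launching:

```shell
# Optional: verify the key is visible to Python in this shell
python -c "import os; print('GROQ_API_KEY set:', bool(os.environ.get('GROQ_API_KEY')))"
```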
Step 5: Initialize the database
python setup_database.py
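A minimal version of this script might look like the following; the actual schema in setup_database.py may differ (the columns and sample row here are assumptions for illustration):

```python
# Illustrative database setup: create the employees table and seed one row.
# The real setup_database.py seeds 10 records and may use other columns.
import sqlite3

def init_db(path: str = "employees.db") -> None:
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS employees (
               id INTEGER PRIMARY KEY,
               name TEXT NOT NULL,
               department TEXT NOT NULL
           )"""
    )
    conn.execute(
        "INSERT INTO employees (name, department) VALUES (?, ?)",
        ("Sample Employee", "HR"),
    )
    conn.commit()
    conn.close()

init_db(":memory:")  # in-memory for demonstration; pass a file path in practice
```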
Step 6: Run the MCP servers to initialize them
python mcp_servers/database_server.py
python mcp_servers/rag_server.py
python mcp_servers/filesystem_server.py
Step 7: Run the orchestrator
python orchestrator.py
Step 8: Run the application
streamlit run app.py
The app will open in your browser at http://localhost:8501
Project Structure:
intellihr-rag-mcp/
├── app.py # Main Streamlit application
├── orchestrator.py # LLM orchestrator for tool routing
├── setup_database.py # Database initialization script
├── requirements.txt # Python dependencies
│
├── mcp_servers/ # MCP Server implementations
│ ├── database_server.py # Employee database server
│ ├── filesystem_server.py # Announcements server
│ └── rag_server.py # Policy documents RAG server
│
├── ui/ # UI components
│ ├── __init__.py
│ └── styles.py # Custom CSS styles
│
├── data/ # Data storage
│ ├── announcements/ # Text files for announcements
│ ├── policies/ # PDF policy documents
│ └── chroma_store/ # ChromaDB vector database
│
├── employees.db # SQLite database (created after setup)
└── README.md # This file
Future Roadmap:
- Enhanced RAG - Vector search for announcements (to reduce hallucination risk)
- Voice Interface - Speech-to-text and text-to-speech
- Multi-tenancy - Department-specific access control
- Advanced Analytics - Query insights and usage patterns
- API Endpoints - REST API for programmatic access
- Mobile Responsive - Optimized mobile experience
- Export Features - Download chat history and reports
- Admin Dashboard - Manage documents and users
- Fine-tuned Models - Custom model for domain-specific queries
- Docker Deployment - Containerized deployment
Under Consideration:
- Integration with Slack/Teams
- Email notification system
- Document auto-ingestion pipeline
- Multi-language support
- Real-time collaboration features
Contribute:
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit changes (git commit -m 'Add AmazingFeature')
- Push to branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Areas for Contribution:
- UI/UX improvements
- Additional data source integrations
- Performance optimizations
- Bug fixes
- Documentation improvements
- Test coverage