
IntelliHR - RAG-Enabled AI Assistant with MCP Architecture

An intelligent HR assistant that orchestrates multiple data sources using Model Context Protocol (MCP) and Retrieval-Augmented Generation (RAG) to provide accurate, context-aware responses.

Built as a capstone project to demonstrate practical AI/ML system design and full-stack development.

Highlights

  • Smart Orchestration - LLM intelligently routes queries to appropriate data sources
  • RAG Integration - Semantic search over policy documents, with answers grounded in the retrieved text to minimize hallucination
  • Beautiful UI - Modern glassmorphism design with dark theme
  • Fast Responses - Sub-2 second query processing with Groq's llama-3.3-70b
  • Context-Aware - Actually reads and respects document confidentiality markers
  • Real-time Analytics - Track queries, tools used, and system performance

Architecture

┌─────────────────────────────────────────────┐
│          User Interface (Streamlit)         │
│  Glassmorphism UI • Dark Theme • Analytics  │
└────────────────┬────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────┐
│     Orchestrator (Groq LLM)                 │
│  llama-3.3-70b-versatile                    │
│  • Function Calling                         │
│  • Tool Selection                           │
│  • Response Generation                      │
└────────────────┬────────────────────────────┘
                 │
        ┌────────┴─────────┐
        │  MCP Framework   │
        └────────┬─────────┘
                 │
    ┌────────────┼────────────┐
    │            │            │
    ▼            ▼            ▼
┌─────────┐ ┌──────────┐ ┌─────────┐
│Database │ │Filesystem│ │   RAG   │
│ Server  │ │  Server  │ │ Server  │
├─────────┤ ├──────────┤ ├─────────┤
│ SQLite  │ │Text Files│ │ChromaDB │
│Employee │ │Announce- │ │Policies │
│Records  │ │ments     │ │(PDFs)   │
└─────────┘ └──────────┘ └─────────┘

How it works:

  1. User query → the user asks a question in natural language
  2. Orchestrator → the LLM analyzes the query and selects the appropriate tools
  3. MCP servers → the selected servers execute queries against their data sources
  4. Response → the LLM combines the results into a coherent answer
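The routing step above can be sketched as a tool registry plus a selection function. In the real project the selection is made by the Groq LLM via function calling; the tool names and the keyword heuristic below are illustrative stand-ins, not the project's actual code:

```python
# Hypothetical tool implementations (the real servers query SQLite, text
# files, and ChromaDB respectively).
def query_database(q: str) -> str:
    return f"db result for {q!r}"

def read_announcements(q: str) -> str:
    return f"announcement for {q!r}"

def search_policies(q: str) -> str:
    return f"policy passage for {q!r}"

TOOLS = {
    "query_database": query_database,
    "read_announcements": read_announcements,
    "search_policies": search_policies,
}

def route(query: str):
    """Stand-in for the LLM's tool-selection step: a keyword heuristic here,
    whereas the project lets the model pick a tool via function calling."""
    q = query.lower()
    if "leave" in q or "policy" in q:
        name = "search_policies"
    elif "holiday" in q or "announcement" in q:
        name = "read_announcements"
    else:
        name = "query_database"
    return name, TOOLS[name](query)

print(route("What is the leave policy?"))
```

The orchestrator's job is then to feed the tool's result back to the LLM so it can compose the final answer.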

Demo Screenshots

  • Main page
  • Quick actions (select a suggested question or ask your own)
  • Querying
  • More querying
  • Analytics dashboard

Tech Stack

Core Technologies:

  • Python 3.11+ - Primary language
  • Streamlit - Web UI framework
  • Groq API - LLM inference (llama-3.3-70b-versatile)
  • SQLite - Employee database
  • ChromaDB - Vector database for RAG
  • LangChain - RAG pipeline orchestration

Key Libraries:

  • langchain - Document processing and RAG
  • langchain-groq - Groq LLM integration
  • langchain-huggingface - Embeddings (all-MiniLM-L6-v2)
  • chromadb - Vector storage
  • pypdf - PDF document loading
  • asyncio - Asynchronous operations
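To illustrate what the embeddings-plus-vector-store combination does, here is a deliberately tiny retrieval sketch: bag-of-words counts stand in for all-MiniLM-L6-v2 vectors, a plain list stands in for ChromaDB, and the sample policy sentences are invented:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": word counts. The project uses dense MiniLM vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Invented policy chunks; in the project these come from the PDF loader
# and text splitter, and are stored in ChromaDB.
chunks = [
    "Employees accrue 18 days of paid leave per calendar year.",
    "Salary revisions take effect in April after the annual review.",
    "POSH complaints are handled by the internal committee.",
]
index = [(c, embed(c)) for c in chunks]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

print(retrieve("how many paid leave days do employees get?"))
```

The retrieved chunk is then passed to the LLM as context, which is what keeps RAG answers anchored to the source documents.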

Design:

  • Custom CSS with glassmorphism
  • Responsive design
  • Dark theme with purple gradient
  • SVG architecture diagrams

Features

Current Features (v1.0)

✅ Multi-Source Querying

  • Employee database search (name, department, ID)
  • Announcement retrieval (holidays, events, updates)
  • Policy document search with RAG
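A minimal version of the employee lookup, using an in-memory SQLite database and invented sample records (the project's actual schema in employees.db may differ):

```python
import sqlite3

# In-memory database for illustration; the project uses employees.db on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [("Asha Rao", "Engineering"), ("Ravi Kumar", "HR")],  # made-up records
)

def search_by_department(dept: str) -> list[str]:
    # Parameterized query, so user input never reaches the SQL string itself.
    rows = conn.execute(
        "SELECT name FROM employees WHERE department = ?", (dept,)
    ).fetchall()
    return [r[0] for r in rows]

print(search_by_department("HR"))  # → ['Ravi Kumar']
```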

✅ Intelligent Routing

  • LLM automatically selects correct data source
  • Multi-tool queries supported
  • Context-aware responses

✅ Professional UI

  • Real-time chat interface
  • Session statistics tracking
  • Tool usage visualization
  • Query history with expandable details

✅ Data Sources

  • 10 employee records
  • 4 announcement files
  • 3 policy documents (Leave, POSH, Salary)

System Capabilities

  • 9 tools across 3 MCP servers
  • Sub-2s average response time
  • Responses grounded in retrieved policy text via RAG
  • Conversation memory for follow-up questions

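Conversation memory can be as simple as a sliding window over the chat history. This is an illustrative sketch (the window size and message shape are assumptions, not the project's exact implementation):

```python
MAX_TURNS = 6  # assumed window size

def add_turn(history: list[dict], role: str, content: str) -> list[dict]:
    """Append a message and keep only the most recent MAX_TURNS entries,
    so the prompt stays within the model's context budget."""
    history = history + [{"role": role, "content": content}]
    return history[-MAX_TURNS:]

history: list[dict] = []
history = add_turn(history, "user", "Who works in HR?")
history = add_turn(history, "assistant", "Ravi Kumar works in HR.")
# The follow-up below is only resolvable because the prior turns stay in context.
history = add_turn(history, "user", "What is his employee ID?")
print(len(history))  # → 3
```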


Installation

Prerequisites:

  • Python 3.11 or higher
  • Groq API key (a free key can be obtained from Groq)

Step 1: Clone the repository

git clone https://github.com/yourusername/intellihr-rag-mcp.git
cd intellihr-rag-mcp

Step 2: Create a virtual environment

python -m venv .venv

# Activate
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate

Step 3: Install dependencies

pip install -r requirements.txt

Step 4: Set up environment variables

# Windows PowerShell
$env:GROQ_API_KEY="your-groq-api-key-here"

# Linux/Mac
export GROQ_API_KEY="your-groq-api-key-here"

Step 5: Initialize the database

python setup_database.py

Step 6: Run the MCP servers to initialize them

python mcp_servers/database_server.py
python mcp_servers/rag_server.py
python mcp_servers/filesystem_server.py

Step 7: Run the orchestrator

python orchestrator.py

Step 8: Run the application

streamlit run app.py

The app will open in your browser at http://localhost:8501


Project Structure

intellihr-rag-mcp/
├── app.py                      # Main Streamlit application
├── orchestrator.py             # LLM orchestrator for tool routing
├── setup_database.py           # Database initialization script
├── requirements.txt            # Python dependencies
│
├── mcp_servers/               # MCP Server implementations
│   ├── database_server.py     # Employee database server
│   ├── filesystem_server.py   # Announcements server
│   └── rag_server.py          # Policy documents RAG server
│
├── ui/                        # UI components
│   ├── __init__.py
│   └── styles.py              # Custom CSS styles
│
├── data/                      # Data storage
│   ├── announcements/         # Text files for announcements
│   ├── policies/              # PDF policy documents
│   └── chroma_store/          # ChromaDB vector database
│
├── employees.db               # SQLite database (created after setup)
└── README.md                  # This file

Future Roadmap

  • Enhanced RAG - Vector search for announcements (reduce hallucination)
  • Voice Interface - Speech-to-text and text-to-speech
  • Multi-tenancy - Department-specific access control
  • Advanced Analytics - Query insights and usage patterns
  • API Endpoints - REST API for programmatic access
  • Mobile Responsive - Optimized mobile experience
  • Export Features - Download chat history and reports
  • Admin Dashboard - Manage documents and users
  • Fine-tuned Models - Custom model for domain-specific queries
  • Docker Deployment - Containerized deployment

Under Consideration:

  • Integration with Slack/Teams
  • Email notification system
  • Document auto-ingestion pipeline
  • Multi-language support
  • Real-time collaboration features

Contribution

Contributions are welcome! Here's how you can help:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Areas for Contribution

  • UI/UX improvements
  • Additional data source integrations
  • Performance optimizations
  • Bug fixes
  • Documentation improvements
  • Test coverage

