# Docu-Scout: Professional RAG-Based AI Agent

Architected by Pancham Singh | Senior Software Engineer

Docu-Scout is a full-stack Retrieval-Augmented Generation (RAG) application designed to demonstrate high-performance AI orchestration. It allows users to "scout" through complex documentation using a local vector store and low-latency LLM inference.

## 🏗️ The Architecture

The system follows a modular, decoupled architecture:

- **Frontend:** React 18, TypeScript, Vite, Tailwind CSS, Lucide.
- **Backend:** FastAPI (Python), Uvicorn, LangChain/LlamaIndex.
- **Vector Engine:** ChromaDB (local persistent storage).
- **Inference:** Groq Cloud (Llama 3 70B) for sub-500 ms response times.
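At request time the pieces above compose into a retrieve-then-generate loop: embed the user query, pull the nearest chunks from the vector store, and pass them to the LLM as context. A minimal sketch of that loop in plain Python — the `embed` stub and the in-memory index stand in for the real embedding model and ChromaDB, and the final prompt is what would be sent to Groq:

```python
import math

# Stub embedding: bag-of-words over a tiny fixed vocabulary.
# A real deployment would call a sentence-embedding model instead.
VOCAB = ["fastapi", "chromadb", "embeddings", "react", "frontend",
         "backend", "stored", "persists", "locally", "chat"]

def embed(text: str) -> list[float]:
    lowered = text.lower()
    return [float(word in lowered) for word in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# In-memory "vector store" standing in for ChromaDB.
CHUNKS = [
    "FastAPI serves the REST API on the backend.",
    "ChromaDB persists document embeddings locally.",
    "The React frontend renders chat responses.",
]
INDEX = [(embed(chunk), chunk) for chunk in CHUNKS]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[0]), reverse=True)
    return [chunk for _, chunk in ranked[:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"  # handed to the LLM

print(retrieve("Where are the embeddings stored?", k=1))
# → ['ChromaDB persists document embeddings locally.']
```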

## 🧠 Design Decisions & Senior Considerations

- **Local Vectorization:** Used ChromaDB to ensure data remains within the infrastructure boundary, addressing GDPR/privacy concerns.
- **TypeScript-First:** Ensured strict typing across the frontend to prevent runtime errors in complex AI state management.
- **Async Ingestion:** Implemented FastAPI background tasks for document processing to keep the UI responsive during "scouting" phases.
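The ingestion side of that background task typically splits each document into overlapping windows before vectorization, so that context isn't lost at chunk boundaries. A minimal sketch (the chunk size and overlap values are illustrative, not the project's actual settings):

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for embedding."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        piece = text[start:start + size]
        if piece:
            chunks.append(piece)
        if start + size >= len(text):
            break
    return chunks

doc = "x" * 500
print(len(chunk_text(doc)))  # → 3  (windows 0-200, 150-350, 300-500)
```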

## 🚀 Quick Start (Production-Ready)

### 1. Backend Setup

```bash
cd backend
python -m venv venv
source venv/bin/activate  # Windows: .\venv\Scripts\activate
pip install -r requirements.txt
# Ensure your .env contains GROQ_API_KEY
python api.py
```

### 2. Frontend Setup

```bash
cd frontend
npm install --legacy-peer-deps
npm run dev
```

## Future Roadmap (Scale & Production)

### 1. Advanced Retrieval (The "R" in RAG)

- **Hybrid Search:** Implement BM25 + Vector Search to improve accuracy for specific technical terms and code snippets.
- **Re-ranking:** Integrate a Cross-Encoder Re-ranker to sort retrieved chunks, ensuring the LLM only processes the most contextually relevant data.
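One common way to merge the keyword (BM25) and vector rankings is reciprocal rank fusion (RRF), which combines the two orderings without needing to normalize their scores. A minimal sketch — the document IDs and the `k = 60` smoothing constant are illustrative:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several rankings of the same corpus."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # A document gains 1/(k + rank) from each ranking it appears in.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["doc_a", "doc_c", "doc_b"]    # keyword-based order
vector_ranking = ["doc_b", "doc_a", "doc_c"]  # embedding-based order
print(rrf([bm25_ranking, vector_ranking]))
# → ['doc_a', 'doc_b', 'doc_c']
```

`doc_a` wins because it ranks high in both lists, even though neither ranking put it unambiguously first.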

### 2. Performance & Cost Optimization

- **Semantic Caching:** Integrate RedisVL to cache common user queries. This reduces LLM API costs by ~40% and drops latency to <50 ms for repeated questions.
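A semantic cache returns a stored answer whenever a new query embeds close enough to a previously answered one, rather than requiring an exact string match. A minimal in-memory sketch — the `embed` stub, vocabulary, and 0.9 threshold are illustrative, and RedisVL would replace the Python list:

```python
import math

def embed(text: str) -> list[float]:
    # Stub: bag-of-words over a tiny fixed vocabulary (a real model replaces this).
    vocab = ["reset", "password", "billing", "invoice", "login"]
    return [float(word in text.lower()) for word in vocab]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

cache: list[tuple[list[float], str]] = []  # (query embedding, cached answer)

def cached_answer(query: str, threshold: float = 0.9):
    q = embed(query)
    for vec, answer in cache:
        if cosine(q, vec) >= threshold:
            return answer          # cache hit: skip the LLM call entirely
    return None                    # cache miss: fall through to the LLM

def remember(query: str, answer: str) -> None:
    cache.append((embed(query), answer))

remember("How do I reset my password?", "Use the 'Forgot password' link.")
print(cached_answer("how to reset a password"))  # → Use the 'Forgot password' link.
print(cached_answer("Where is my invoice?"))     # → None
```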

- **Streaming Responses:** Transition from standard REST to Server-Sent Events (SSE) for real-time word-by-word streaming in the UI.
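On the wire, SSE is just a long-lived `text/event-stream` response in which each event is a `data:` line followed by a blank line. A minimal sketch of the framing — the token list stands in for the LLM's streamed output, and `[DONE]` is a common (not mandatory) end-of-stream sentinel:

```python
def sse_frames(tokens):
    """Yield Server-Sent Events frames, one per streamed token."""
    for token in tokens:
        yield f"data: {token}\n\n"   # each event is terminated by a blank line
    yield "data: [DONE]\n\n"         # conventional end-of-stream sentinel

frames = list(sse_frames(["Hello", "world"]))
print(frames[0])  # prints "data: Hello" followed by a blank line
```

In FastAPI this generator would typically be wrapped in a streaming response with the `text/event-stream` media type, while the React frontend consumes it with `EventSource` or `fetch`.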

### 3. Observability & Evaluation (The "Senior" Layer)

- **RAGAS Framework:** Implement automated evaluation to measure Faithfulness (no hallucinations) and Answer Relevance.

- **Tracing:** Integrate LangSmith or Arize Phoenix for deep-dive debugging of the retrieval chain.
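The intuition behind faithfulness scoring is checking whether the answer's claims are actually supported by the retrieved context. A crude token-overlap proxy of that idea (this is *not* the actual RAGAS metric, which uses an LLM to extract and verify individual claims):

```python
def support_ratio(answer: str, context: str) -> float:
    """Fraction of answer words that also appear in the retrieved context."""
    answer_words = {w.strip(".,").lower() for w in answer.split()}
    context_words = {w.strip(".,").lower() for w in context.split()}
    if not answer_words:
        return 1.0
    return len(answer_words & context_words) / len(answer_words)

context = "ChromaDB stores embeddings locally on disk."
grounded = "ChromaDB stores embeddings locally."
hallucinated = "ChromaDB uploads embeddings to the cloud."
print(support_ratio(grounded, context))               # → 1.0
print(round(support_ratio(hallucinated, context), 2)) # → 0.33
```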

### 4. Security & Compliance (EU/GDPR Focus)

- **PII Redaction:** Add a middleware layer to scrub Personally Identifiable Information before data is sent to the LLM provider.
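A minimal sketch of such a scrubber using regular expressions — the patterns below cover only emails and simple phone numbers; a production redactor would use a dedicated library (e.g. Microsoft Presidio) with far broader entity coverage:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s-]{7,}\d")

def redact(text: str) -> str:
    """Replace emails and phone-like sequences with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

msg = "Contact jane.doe@example.com or +49 170 1234567."
print(redact(msg))  # → Contact [EMAIL] or [PHONE].
```

Running this as middleware means the LLM provider only ever sees the placeholders, keeping raw PII inside the infrastructure boundary.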

- **Auth Integration:** Secure the /chat endpoint using OAuth2/JWT for multi-tenant user support.
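A JWT is just three Base64URL segments (header, payload, signature) joined by dots. A minimal sketch of issuing and verifying an HS256 token with only the standard library — a real deployment would use a vetted library such as PyJWT and also validate expiry and audience claims:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign(payload: dict, secret: bytes) -> str:
    """Produce a JWT-style HS256 token: header.payload.signature."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url(json.dumps(payload).encode())
    sig = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(sig)}"

def verify(token: str, secret: bytes) -> bool:
    """Recompute the HMAC and compare in constant time."""
    header, body, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), sig)

secret = b"demo-secret"
token = sign({"sub": "user-42", "tenant": "acme"}, secret)
print(verify(token, secret))               # → True
print(verify(token + "tampered", secret))  # → False
```

In the FastAPI layer, a dependency would extract the bearer token from the Authorization header, run `verify`, and use the `tenant` claim to scope vector-store queries per tenant.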
