A two-stage generative AI system that converts natural-language questions into SQL queries using schema linking and fine-tuned Large Language Models (LLMs). Built as my final-year project at AxeFinance.
- Two-stage architecture:
  - Schema Linking Model: identifies the tables and columns relevant to the question.
  - SQL Generation Model: generates the SQL query.
- Fine-tuned open-source models (Mistral, LLaMA).
- API endpoint built using FastAPI.
- Dataset: Spider benchmark.
- Achieved 83.46% execution accuracy on Spider.
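
The two-stage flow listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the prompt wording, and the `table: col1, col2` answer format for the linker are all assumptions, and the two models are passed in as plain callables so that any fine-tuned LLM wrapper could be plugged in.

```python
from typing import Callable, Dict, List

# Hypothetical types: a "model" is any function mapping a prompt to text.
Generate = Callable[[str], str]
Schema = Dict[str, List[str]]  # table name -> column names


def serialize_schema(schema: Schema) -> str:
    """Render a schema as one 'table(col, col)' line per table."""
    return "\n".join(f"{t}({', '.join(cols)})" for t, cols in schema.items())


def link_schema(question: str, schema: Schema, linker: Generate) -> Schema:
    """Stage 1: ask the linker which tables and columns the question needs.

    Assumes (for illustration only) that the linker answers with lines of
    the form 'table: col1, col2'. Names not in the schema are discarded.
    """
    prompt = f"Schema:\n{serialize_schema(schema)}\nQuestion: {question}\nRelevant:"
    linked: Schema = {}
    for line in linker(prompt).splitlines():
        table, sep, cols = line.partition(":")
        table = table.strip()
        if not sep or table not in schema:
            continue
        keep = [c.strip() for c in cols.split(",") if c.strip() in schema[table]]
        if keep:
            linked[table] = keep
    return linked


def generate_sql(question: str, linked: Schema, generator: Generate) -> str:
    """Stage 2: generate SQL from the question and the reduced schema."""
    prompt = f"Schema:\n{serialize_schema(linked)}\nQuestion: {question}\nSQL:"
    return generator(prompt).strip()


def text_to_sql(question: str, schema: Schema,
                linker: Generate, generator: Generate) -> str:
    """Run both stages; fall back to the full schema if linking finds nothing."""
    linked = link_schema(question, schema, linker)
    return generate_sql(question, linked or schema, generator)
```

Filtering the linker's answer against the real schema keeps hallucinated table or column names out of the second prompt, and the fallback to the full schema is one possible way to avoid an empty prompt when linking fails.
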
- Python
- Hugging Face Transformers
- PEFT / LoRA
- PyTorch
- FastAPI
- SQL Server (for testing)
- Google Colab (for training)
- Execution Accuracy: 83.46%
- Schema Linking Accuracy: 91.66%
- Ranked 3rd on the Spider benchmark
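
For readers unfamiliar with the metric, execution accuracy counts a prediction as correct when running it returns the same rows as the reference query. The sketch below shows the idea with Python's built-in `sqlite3`. It is a simplified illustration that ignores row order and is not the official Spider evaluation script; the function names and the toy `singer` table are assumptions made for the example.

```python
import sqlite3
from collections import Counter
from typing import Iterable, Tuple


def same_result(conn: sqlite3.Connection, predicted: str, gold: str) -> bool:
    """True if both queries return the same multiset of rows."""
    try:
        pred_rows = conn.execute(predicted).fetchall()
    except sqlite3.Error:
        return False  # a predicted query that fails to run counts as wrong
    gold_rows = conn.execute(gold).fetchall()
    return Counter(pred_rows) == Counter(gold_rows)


def execution_accuracy(conn: sqlite3.Connection,
                       pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (predicted, gold) pairs whose results match."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return sum(same_result(conn, p, g) for p, g in pairs) / len(pairs)


# Toy usage: one semantically equivalent prediction, one invalid prediction.
conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE singer (name TEXT, age INTEGER);"
    "INSERT INTO singer VALUES ('Ann', 25), ('Bob', 41);"
)
pairs = [
    ("SELECT name FROM singer WHERE age > 30",
     "SELECT name FROM singer WHERE age >= 31"),
    ("SELECT nme FROM singer", "SELECT name FROM singer"),  # typo in column
]
print(execution_accuracy(conn, pairs))  # 0.5
```

Because the comparison is on results rather than on query text, a prediction that is written differently from the reference can still count as correct.
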
pip install -r requirements.txt
uvicorn src.api:app --reload
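
Once the server is running, the API can be queried over HTTP. The snippet below is only a sketch of what a client might look like: the `/generate-sql` path and the `question` and `db_id` request fields are hypothetical placeholders, so check `src/api.py` for the actual route and payload. The base URL is uvicorn's default host and port.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://127.0.0.1:8000"  # uvicorn's default host and port
ENDPOINT = "/generate-sql"          # hypothetical path, not taken from the project


def build_request(question: str, db_id: str, base_url: str = BASE_URL) -> Request:
    """Build a JSON POST request; the field names are assumed for illustration."""
    payload = json.dumps({"question": question, "db_id": db_id}).encode("utf-8")
    return Request(
        base_url + ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str, db_id: str) -> dict:
    """Send the question to the running API and return the decoded JSON reply."""
    with urlopen(build_request(question, db_id), timeout=30) as resp:
        return json.load(resp)
```

With the server up, a call such as `ask("How many singers are there?", "concert_singer")` would return whatever JSON the real endpoint produces.
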
