End-to-end fraud detection system for credit card transactions.
Maintained by Bhanuja Karumuru.
This repository contains:
- An ensemble fraud model (logistic regression, random forest, gradient boosting)
- MLflow model tracking/registry for model versioning
- A FastAPI inference service for real-time scoring
- A Kafka-based streaming scorer
- Observability with Prometheus + Grafana + Alertmanager
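The README does not say how the three models are combined. One common approach is soft voting, where each model's fraud probability is averaged (optionally with weights) and the result is compared to a threshold. The sketch below illustrates that idea only; the function name, the model keys, the sample scores and the 0.5 threshold are all assumptions made for illustration, not the repository's actual code.

```python
from typing import Mapping, Optional


def soft_vote(
    probabilities: Mapping[str, float],
    weights: Optional[Mapping[str, float]] = None,
) -> float:
    """Return the weighted mean of per-model fraud probabilities."""
    if not probabilities:
        raise ValueError("at least one model probability is required")
    if weights is None:
        # Default to equal weighting across all models.
        weights = {name: 1.0 for name in probabilities}
    total_weight = sum(weights[name] for name in probabilities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(p * weights[name] for name, p in probabilities.items())
    return weighted_sum / total_weight


# Hypothetical per-model outputs for a single transaction.
scores = {
    "logistic_regression": 0.10,
    "random_forest": 0.30,
    "gradient_boosting": 0.20,
}
fraud_probability = soft_vote(scores)
# 0.5 is an assumed decision threshold, not a value taken from this repository.
is_fraud = fraud_probability >= 0.5
```

Passing a `weights` mapping lets a stronger model count for more, for example `soft_vote(scores, {"logistic_regression": 1, "random_forest": 2, "gradient_boosting": 2})`.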
Dataset source: Kaggle credit card fraud dataset.
- Docker Desktop (or Docker Engine)
- Docker Compose
docker compose up --build
- API (Swagger): http://localhost:8000/docs
- API health: http://localhost:8000/health
- MLflow UI: http://localhost:5001
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000
- Alertmanager: http://localhost:9093
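The services can take a while to start, so it may help to wait for the API health endpoint listed above before sending requests. The following is a minimal sketch using only the standard library. It assumes `/health` answers with HTTP 200 once the API is ready, which the README does not state explicitly; the function names, attempt count and delay are illustrative.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://localhost:8000/health"


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx response: not ready yet.
        return False


def wait_until_healthy(
    url: str = HEALTH_URL,
    attempts: int = 30,
    delay_seconds: float = 2.0,
    probe: Callable[[str], bool] = http_ok,
) -> bool:
    """Poll `url` until `probe` succeeds or `attempts` are exhausted."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False
```

Calling `wait_until_healthy()` after `docker compose up --build` returns `True` once the API responds, or `False` after roughly a minute with the default settings. The `probe` parameter exists so the polling loop can be exercised without a running stack.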
curl -sS http://localhost:8000/predict \
-H 'content-type: application/json' \
-d '{"features":{"Time":0,"Amount":0,"V1":-1.359807,"V2":-0.072781,"V3":2.536347,"V4":1.378155,"V5":-0.338321,"V6":0.462388,"V7":0.239599,"V8":0.098698,"V9":0.363787,"V10":0.090794,"V11":-0.5516,"V12":-0.617801,"V13":-0.99139,"V14":-0.311169,"V15":1.468177,"V16":-0.470401,"V17":0.207971,"V18":0.025791,"V19":0.403993,"V20":0.251412,"V21":-0.018307,"V22":0.277838,"V23":-0.110474,"V24":0.066928,"V25":0.128539,"V26":-0.189115,"V27":0.133558,"V28":-0.021053}}'
- services/trainer: trains the ensemble model and registers it to MLflow; promotes the model when validation criteria pass
- services/retrainer: scheduled retraining loop
- services/batch-scorer: scheduled batch evaluation and metric logging
- services/api: real-time inference API + Prometheus metrics
- services/simulator: publishes transactions to Kafka
- services/scorer: consumes Kafka messages, scores them using the Production model from MLflow, exports Prometheus metrics
- infra/prometheus, infra/grafana, infra/alertmanager: monitoring and alerting configuration
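The `curl` call to `/predict` shown earlier can also be made from Python. This is a minimal standard-library sketch that mirrors the same request shape: a JSON body with a `features` object containing `Time`, `Amount` and `V1` through `V28`. The helper names are hypothetical, the all-zero feature values are placeholders, and the response is returned as parsed JSON without assuming its schema, since the README does not document it.

```python
import json
import urllib.request
from typing import Any, Dict

PREDICT_URL = "http://localhost:8000/predict"


def build_predict_request(
    features: Dict[str, float], url: str = PREDICT_URL
) -> urllib.request.Request:
    """Build a POST request with the same JSON body shape as the curl example."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(
    features: Dict[str, float], url: str = PREDICT_URL, timeout: float = 5.0
) -> Any:
    """Send the request and return the decoded JSON response as-is."""
    request = build_predict_request(features, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def example_features() -> Dict[str, float]:
    """Placeholder input: Time, Amount and V1..V28 all set to 0.0."""
    features = {"Time": 0.0, "Amount": 0.0}
    features.update({f"V{i}": 0.0 for i in range(1, 29)})
    return features
```

With the stack running, `predict(example_features())` sends one request; replace the placeholder values with a real row from the dataset to get a meaningful score.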
creditcard.csv.zip is a large file. If you want a lighter repository, download the dataset locally and adjust the compose volume mount accordingly.