pazs10ve/Waste-Management
Waste Classification - MLOps Ready

A production-ready MLOps structure for the waste classification project with:

  • Modular pipelines in src/
  • Experiment tracking and artifacts via Weights & Biases (W&B)
  • Type-safe configs via Pydantic
  • W&B Hyperparameter Sweeps
  • DVC pipeline scaffolding
  • CI (GitHub Actions) with linting and optional training
  • Reproducible environments with pinned requirements.txt
  • Containerization via Docker

Project Structure

  • src/data/data_loader.py — dataset creation, class mapping, and augmentations
  • src/model/model_builder.py — DenseNet121 model builder
  • src/pipelines/train_pipeline.py — training with W&B logging and artifact export
  • src/pipelines/evaluation_pipeline.py — evaluation for Keras and TFLite models with W&B logging
  • src/pipelines/inference_pipeline.py — webcam inference using TFLite or Keras model
  • requirements.txt — pinned dependencies
  • Dockerfile — container for training
  • .dockerignore — exclude local artifacts and data by default

Prerequisites

  • Python 3.10+ recommended
  • For GPU (optional): Proper NVIDIA drivers + CUDA/cuDNN compatible with your TF version
  • A W&B account: https://wandb.ai/

Setup (Local)

  1. Create a virtual environment and install dependencies:
python -m venv .venv
source .venv/bin/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements.txt
  2. Log in to W&B:
wandb login
  3. Prepare the data directory (expected default):
RealWaste/
  ├─ ClassA/
  │    ├─ img1.jpg
  │    └─ ...
  ├─ ClassB/
  └─ ...

During training, original classes are mapped to 5 target classes: ['Organic','Inorganic','Metal','Electronics','Others'] (see DataHandler.class_mapping).
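The authoritative mapping lives in DataHandler.class_mapping; a minimal sketch of what it plausibly looks like is below. The original RealWaste folder names used as keys are assumptions for illustration, not copied from the code:

```python
# Hypothetical mapping from original RealWaste folder names to the 5 target
# classes. The real mapping is DataHandler.class_mapping in
# src/data/data_loader.py; the keys below are illustrative.
CLASS_MAPPING = {
    "Food Organics": "Organic",
    "Vegetation": "Organic",
    "Glass": "Inorganic",
    "Plastic": "Inorganic",
    "Paper": "Inorganic",
    "Cardboard": "Inorganic",
    "Metal": "Metal",
    "Miscellaneous Trash": "Others",
    "Textile Trash": "Others",
}

TARGET_CLASSES = ["Organic", "Inorganic", "Metal", "Electronics", "Others"]

def map_label(original: str) -> str:
    """Map an original folder name to one of the 5 target classes,
    falling back to 'Others' for unmapped folders."""
    return CLASS_MAPPING.get(original, "Others")
```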

Train

python -m src.pipelines.train_pipeline

Configuration parameters (read via the W&B config) that you can change directly in train_pipeline.py's main:

  • data_dir (default RealWaste)
  • img_size (default (224,224))
  • batch_size (default 32)
  • epochs (default 10)
  • learning_rate (default 1e-3)
  • with_augmentation (default True)
  • model_path (default waste_classifier.h5)
  • tflite_path (default waste_classifier.tflite)
  • quantize (default True)

The trained models (H5 and TFLite) are automatically logged to W&B as Artifacts.

Config is type-safe via pydantic in src/config/schemas.py:

  • Override parameters by editing code, via environment, or W&B overrides (e.g., when using sweeps).
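The actual schema lives in src/config/schemas.py; a hedged sketch of what a Pydantic model covering the parameters listed above might look like (field names and defaults taken from this README, everything else assumed):

```python
from pydantic import BaseModel

class TrainConfig(BaseModel):
    """Illustrative type-safe config; the authoritative schema is
    src/config/schemas.py. Defaults mirror the README."""
    data_dir: str = "RealWaste"
    img_size: tuple[int, int] = (224, 224)
    batch_size: int = 32
    epochs: int = 10
    learning_rate: float = 1e-3
    with_augmentation: bool = True
    model_path: str = "waste_classifier.h5"
    tflite_path: str = "waste_classifier.tflite"
    quantize: bool = True
    model_name: str = "densenet121"
```

Because fields are typed, string overrides coming from sweeps or the environment (e.g. "64" for batch_size) are validated and coerced rather than silently passed through.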

Model selection/custom models:

  • Select backbone by setting model_name in config (densenet121, resnet50, mobilenetv2).

Evaluate

python -m src.pipelines.evaluation_pipeline

Logs a classification report and confusion matrices to W&B for the Keras model and, if present, the TFLite model.
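The core computation behind those logged confusion matrices can be sketched with plain NumPy (the pipeline itself may use sklearn or W&B plotting helpers; this is a dependency-light illustration):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes: int) -> np.ndarray:
    """Confusion matrix with rows = true class, columns = predicted class."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def per_class_accuracy(cm: np.ndarray) -> np.ndarray:
    """Diagonal over row sums; classes with no samples get 0."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm).astype(float), totals,
                     out=np.zeros(len(cm)), where=totals > 0)
```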

Inference (Webcam)

python -m src.pipelines.inference_pipeline

Notes:

  • Default expects waste_classifier.tflite in project root.
  • Preprocessing matches training: exported model includes preprocess_input, so raw RGB is fed to the interpreter.
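Since preprocess_input is baked into the exported model, frame preparation reduces to resize, cast, and batch. A dependency-free sketch (the actual pipeline likely resizes with OpenCV; nearest-neighbour indexing here is just to keep the example self-contained):

```python
import numpy as np

def prepare_frame(frame: np.ndarray, size=(224, 224)) -> np.ndarray:
    """Resize an HxWx3 uint8 RGB frame (nearest-neighbour) and add a batch
    dimension. No preprocess_input here: the exported model already applies
    it, so the interpreter receives raw RGB values as float32."""
    h, w = frame.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = frame[rows][:, cols]
    return resized.astype(np.float32)[None, ...]
```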

Hyperparameter Sweeps (W&B)

  1. Create the sweep:
wandb sweep sweeps/wandb_sweep.yaml
  2. Run an agent (repeat for parallelism):
wandb agent <SWEEP_ID>

The sweep can vary learning_rate, batch_size, epochs, img_size_h/w, and model_name.
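sweeps/wandb_sweep.yaml is the source of truth; an equivalent configuration expressed as a Python dict (the search method and parameter ranges below are illustrative assumptions, only the parameter names come from this README) might look like:

```python
# Illustrative W&B sweep configuration; the actual ranges live in
# sweeps/wandb_sweep.yaml. Could be registered with wandb.sweep(sweep_config).
sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_accuracy", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"distribution": "log_uniform_values",
                          "min": 1e-5, "max": 1e-2},
        "batch_size": {"values": [16, 32, 64]},
        "epochs": {"values": [5, 10, 20]},
        "img_size_h": {"values": [224]},
        "img_size_w": {"values": [224]},
        "model_name": {"values": ["densenet121", "resnet50", "mobilenetv2"]},
    },
}
```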

DVC Pipeline

Initialize DVC in the repo (one-time):

dvc init

Run stages:

dvc repro train
dvc repro evaluate
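Assuming the scaffolding wires the two stages to the pipeline modules, a dvc.yaml along these lines would make the commands above work (the deps/outs listed here are assumptions; check the actual dvc.yaml in the repo):

```yaml
stages:
  train:
    cmd: python -m src.pipelines.train_pipeline
    deps:
      - src/pipelines/train_pipeline.py
      - src/data/data_loader.py
      - src/model/model_builder.py
    outs:
      - waste_classifier.h5
      - waste_classifier.tflite
  evaluate:
    cmd: python -m src.pipelines.evaluation_pipeline
    deps:
      - src/pipelines/evaluation_pipeline.py
      - waste_classifier.h5
```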

By default we do not version the raw RealWaste/ folder (see .dvcignore). If you want to track snapshots, add it as a DVC-tracked directory.

CI/CD (GitHub Actions)

Workflow at .github/workflows/ci.yml runs on push/PR:

  • Lints with flake8
  • Optionally runs training if manually dispatched with run-training: true and WANDB_API_KEY secret set.

To run training from the workflow_dispatch UI, add repository secret WANDB_API_KEY and trigger the workflow with run-training=true.

Docker

Build the image:

docker build -t waste-classifier:latest .

Run training inside Docker (mount your data and pass W&B API key):

docker run --rm \
  -e WANDB_API_KEY=YOUR_WANDB_API_KEY \
  -v "$(pwd)/RealWaste":/data/RealWaste \
  -v "$(pwd)":/app \
  waste-classifier:latest

(On Windows cmd, substitute %cd% for $(pwd) and use ^ instead of \ for line continuation.)

The default ENTRYPOINT runs src.pipelines.train_pipeline. Data is expected at /data/RealWaste; either change config.data_dir in the pipeline or mount accordingly.
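A rough sketch of the kind of Dockerfile that produces this behaviour (the repo's Dockerfile is the source of truth; the base image and layer order here are assumptions):

```dockerfile
# Illustrative training image; see the actual Dockerfile in the repo.
FROM python:3.10-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Data is expected at /data/RealWaste (mounted at run time).
ENTRYPOINT ["python", "-m", "src.pipelines.train_pipeline"]
```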

If you want to run evaluation instead:

docker run --rm \
  -e WANDB_API_KEY=YOUR_WANDB_API_KEY \
  -v "$(pwd)":/app \
  waste-classifier:latest \
  python -m src.pipelines.evaluation_pipeline

W&B Notes

  • Replace YOUR_ENTITY in train_pipeline.py and evaluation_pipeline.py with your W&B entity/org.
  • All key hyperparameters are recorded and artifacts (H5/TFLite) are versioned as W&B Artifacts.

Next Steps

  • Add CI/CD (GitHub Actions) to run lint/tests and optionally kick off training jobs
  • Add dataset versioning via W&B Artifacts or DVC for raw data snapshots
  • Add unit tests for DataHandler and create_model
  • Add model registry/promote best models by validation metric
