A production-ready MLOps structure for the waste classification project with:
- Modular pipelines in `src/`
- Experiment tracking and artifacts via Weights & Biases (W&B)
- Type-safe configs via Pydantic
- W&B Hyperparameter Sweeps
- DVC pipeline scaffolding
- CI (GitHub Actions) with linting and optional training
- Reproducible environments with a pinned `requirements.txt`
- Containerization via Docker

Key files:
- `src/data/data_loader.py` — dataset creation, class mapping, and augmentations
- `src/model/model_builder.py` — DenseNet121 model builder
- `src/pipelines/train_pipeline.py` — training with W&B logging and artifact export
- `src/pipelines/evaluation_pipeline.py` — evaluation for Keras and TFLite models with W&B logging
- `src/pipelines/inference_pipeline.py` — webcam inference using a TFLite or Keras model
- `requirements.txt` — pinned dependencies
- `Dockerfile` — container for training
- `.dockerignore` — excludes local artifacts and data by default
- Python 3.10+ recommended
- For GPU (optional): Proper NVIDIA drivers + CUDA/cuDNN compatible with your TF version
- A W&B account: https://wandb.ai/
- Create a virtual environment and install dependencies:

```
python -m venv .venv
. .venv/Scripts/activate  # Windows PowerShell: .venv\Scripts\Activate.ps1
pip install --upgrade pip
pip install -r requirements.txt
```

- Log in to W&B:

```
wandb login
```

- Prepare the data directory (expected default):

```
RealWaste/
├─ ClassA/
│  ├─ img1.jpg
│  └─ ...
├─ ClassB/
└─ ...
```
During training, the original classes are mapped to 5 target classes: `['Organic', 'Inorganic', 'Metal', 'Electronics', 'Others']` (see `DataHandler.class_mapping`).
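As a sketch of how such a mapping can be expressed (the source class names below are illustrative placeholders, not the actual RealWaste folder names — check `data_loader.py` for the real mapping):

```python
# Illustrative sketch of DataHandler.class_mapping: original dataset folder
# names are remapped onto the 5 target classes.
CLASS_MAPPING = {
    "FoodScraps": "Organic",       # source names here are hypothetical
    "Vegetation": "Organic",
    "Plastic": "Inorganic",
    "Glass": "Inorganic",
    "Metal": "Metal",
    "EWaste": "Electronics",
    "MiscTrash": "Others",
}

TARGET_CLASSES = ["Organic", "Inorganic", "Metal", "Electronics", "Others"]

def remap_label(original: str) -> int:
    """Map an original folder name to its target-class index."""
    return TARGET_CLASSES.index(CLASS_MAPPING[original])
```

Several source classes collapsing onto one target label is expected here; the model is trained on the 5-way target space only.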
Run training:

```
python -m src.pipelines.train_pipeline
```

Parameters (via the W&B config) you can change directly in `train_pipeline.py`'s `main`:

- `data_dir` (default `RealWaste`)
- `img_size` (default `(224, 224)`)
- `batch_size` (default `32`)
- `epochs` (default `10`)
- `learning_rate` (default `1e-3`)
- `with_augmentation` (default `True`)
- `model_path` (default `waste_classifier.h5`)
- `tflite_path` (default `waste_classifier.tflite`)
- `quantize` (default `True`)
Model files (H5 and TFLite) are automatically logged to W&B as Artifacts.
Config is type-safe via Pydantic in `src/config/schemas.py`:
- Override parameters by editing code, via the environment, or through W&B overrides (e.g., when using sweeps).

Model selection / custom models:
- Select the backbone by setting `model_name` in the config (`densenet121`, `resnet50`, `mobilenetv2`).
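A minimal sketch of what such a schema could look like, assuming the field names and defaults listed above (the class name `TrainConfig` is illustrative, not necessarily what `schemas.py` uses):

```python
from pydantic import BaseModel

class TrainConfig(BaseModel):
    """Illustrative config schema mirroring the documented defaults."""
    data_dir: str = "RealWaste"
    img_size: tuple = (224, 224)
    batch_size: int = 32
    epochs: int = 10
    learning_rate: float = 1e-3
    with_augmentation: bool = True
    model_name: str = "densenet121"
    model_path: str = "waste_classifier.h5"
    tflite_path: str = "waste_classifier.tflite"
    quantize: bool = True

# Pydantic coerces and validates on construction, so a string override
# from the environment or a sweep still yields a typed value:
cfg = TrainConfig(batch_size="64")  # batch_size becomes int 64
```

The payoff is that a typo'd or out-of-type override fails loudly at startup instead of deep inside training.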
Run evaluation:

```
python -m src.pipelines.evaluation_pipeline
```

This logs a classification report and confusion matrices to W&B for the Keras model and, if present, the TFLite model.
Run webcam inference:

```
python -m src.pipelines.inference_pipeline
```

Notes:
- By default, expects `waste_classifier.tflite` in the project root.
- Preprocessing matches training: the exported model includes `preprocess_input`, so raw RGB frames are fed to the interpreter.
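To illustrate what "raw RGB in" means in practice, here is a hypothetical frame-preparation helper (`prepare_frame` is not part of the codebase): it only resizes and adds a batch dimension, deliberately doing no mean/std normalization, because `preprocess_input` is baked into the exported model.

```python
import numpy as np

def prepare_frame(frame: np.ndarray, size=(224, 224)) -> np.ndarray:
    """Resize an HxWx3 RGB frame (nearest neighbour) and add a batch dim.

    No normalization on purpose: preprocess_input was folded into the
    exported model, so the interpreter expects raw pixel values.
    """
    h, w = frame.shape[:2]
    rows = np.arange(size[0]) * h // size[0]   # source row index per output row
    cols = np.arange(size[1]) * w // size[1]   # source col index per output col
    resized = frame[rows][:, cols]             # index-based nearest-neighbour resize
    return resized[np.newaxis].astype(np.float32)  # shape (1, H, W, 3)
```

The output would then be passed to the TFLite interpreter's input tensor as-is.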
- Create a sweep:

```
wandb sweep sweeps/wandb_sweep.yaml
```

- Run an agent (repeat for parallelism):

```
wandb agent <SWEEP_ID>
```

The sweep can vary `learning_rate`, `batch_size`, `epochs`, `img_size_h`/`img_size_w`, and `model_name`.
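A sketch of what `sweeps/wandb_sweep.yaml` might contain, assuming a random-search sweep over the parameters above (the value ranges and metric name are illustrative):

```yaml
program: src/pipelines/train_pipeline.py
method: random
metric:
  name: val_accuracy
  goal: maximize
parameters:
  learning_rate:
    distribution: log_uniform_values
    min: 0.0001
    max: 0.01
  batch_size:
    values: [16, 32, 64]
  epochs:
    values: [5, 10, 20]
  img_size_h:
    values: [224, 256]
  img_size_w:
    values: [224, 256]
  model_name:
    values: [densenet121, resnet50, mobilenetv2]
```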
Initialize DVC in the repo (one-time):

```
dvc init
```

Run the stages:

```
dvc repro train
dvc repro evaluate
```

By default, the raw `RealWaste/` folder is not versioned (see `.dvcignore`). If you want to track snapshots, add it as a DVC-tracked directory.
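The two stages above imply a `dvc.yaml` roughly along these lines (a sketch; the exact `deps` and `outs` are assumptions, not the project's actual pipeline file):

```yaml
stages:
  train:
    cmd: python -m src.pipelines.train_pipeline
    deps:
      - src/
    outs:
      - waste_classifier.h5
      - waste_classifier.tflite
  evaluate:
    cmd: python -m src.pipelines.evaluation_pipeline
    deps:
      - src/
      - waste_classifier.h5
```

Declaring the model files as `outs` of `train` and `deps` of `evaluate` is what lets `dvc repro evaluate` re-run training automatically when it is stale.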
Workflow at `.github/workflows/ci.yml` runs on push/PR:
- Lints with flake8
- Optionally runs training when manually dispatched with `run-training: true` and the `WANDB_API_KEY` secret set

To run training from the workflow_dispatch UI, add the repository secret `WANDB_API_KEY` and trigger the workflow with `run-training=true`.
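For orientation, a workflow with that shape could look like the following (a sketch, not the repository's actual `ci.yml`; action versions and the lint target are assumptions):

```yaml
name: CI
on:
  push:
  pull_request:
  workflow_dispatch:
    inputs:
      run-training:
        type: boolean
        default: false

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install flake8
      - run: flake8 src

  train:
    if: github.event_name == 'workflow_dispatch' && inputs['run-training']
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.10"
      - run: pip install -r requirements.txt
      - run: python -m src.pipelines.train_pipeline
        env:
          WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
```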
Build the image:

```
docker build -t waste-classifier:latest .
```

Run training inside Docker, mounting your data and passing your W&B API key (`%cd%` is for the Windows command prompt; use `$(pwd)` on Linux/macOS):

```
docker run --rm \
  -e WANDB_API_KEY=YOUR_WANDB_API_KEY \
  -v %cd%/RealWaste:/data/RealWaste \
  -v %cd%:/app \
  waste-classifier:latest
```

The default ENTRYPOINT runs `src.pipelines.train_pipeline`. Data is expected at `/data/RealWaste`; either change `config.data_dir` in the pipeline or mount accordingly.
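For reference, a minimal `Dockerfile` consistent with the above might look like this (a sketch, not the project's actual Dockerfile; the base image is an assumption):

```dockerfile
FROM python:3.10-slim

WORKDIR /app

# Install pinned dependencies first to take advantage of layer caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the source tree; data is mounted at runtime (see .dockerignore)
COPY . .

ENTRYPOINT ["python", "-m", "src.pipelines.train_pipeline"]
```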
If you want to run evaluation instead, override the entrypoint (since the image's default ENTRYPOINT runs training):

```
docker run --rm \
  -e WANDB_API_KEY=YOUR_WANDB_API_KEY \
  -v %cd%:/app \
  --entrypoint python \
  waste-classifier:latest \
  -m src.pipelines.evaluation_pipeline
```

Notes:
- Replace `YOUR_ENTITY` in `train_pipeline.py` and `evaluation_pipeline.py` with your W&B entity/org.
- All key hyperparameters are recorded, and artifacts (H5/TFLite) are versioned as W&B Artifacts.
- Add CI/CD (GitHub Actions) to run lint/tests and optionally kick off training jobs
- Add dataset versioning via W&B Artifacts or DVC for raw data snapshots
- Add unit tests for `DataHandler` and `create_model`
- Add a model registry and promote the best models by validation metric