A custom transformer model training and deployment toolkit built for rapid experimentation, efficient fine-tuning, and seamless deployment to Hugging Face Hub.
This repository is centered around one principle:
Experiment → Fail → Optimize → Repeat
It provides a streamlined wrapper layer on top of modern LLM tooling to accelerate training speed, simplify inference, and reduce engineering overhead during iteration.
You can find models trained from this playground below, along with their Hugging Face links:
| Model | README | HuggingFace |
|---|---|---|
| my-model | readme | link |
| mini-stories | readme | link |
This project was built to:
- Maximize LLM training throughput
- Reduce friction in experimentation cycles
- Standardize model training and evaluation workflows
- Simplify inference and deployment pipelines
- Improve reproducibility across experiments
Deploy trained models to the Hugging Face Hub:

```bash
python src/scripts/hf_deploy.py \
    --model_path <model-path> \
    --config_path <config-path> \
    --repo_id username/my-model
```
Convert a `.pt` checkpoint to `.safetensors`, then upload:

```bash
python src/scripts/hf_deploy.py \
    --model_path <model-path> \
    --config_path <config-path> \
    --repo_id username/my-model \
    --tokenizer_path <tokenizer-path>
```
Upload `.safetensors` weights directly with the CLI:

```bash
huggingface-cli upload prasannaJagadesh/my-model \
    ./Prasanna-SmolLM-360M-3.1 \
    --repo-type model
```

Deploy datasets to the Hugging Face Hub:
```bash
python src/scripts/dataset-deploy.py \
    --dataset-path <dataset-path> \
    --repo-id username/my-dataset
```
Add metadata such as a description, license, and tags:

```bash
python src/scripts/dataset-deploy.py \
    --dataset-path <dataset-path> \
    --repo-id username/csv-dataset \
    --description "My dataset description" \
    --license "mit" \
    --tags nlp classification
```
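The `--description`, `--license`, and `--tags` options presumably map onto the dataset card's YAML front matter on the Hub. A rough sketch of the resulting card, not the exact template this repo emits:

```yaml
---
license: mit
tags:
  - nlp
  - classification
---

My dataset description
```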
Keep an existing dataset card intact with `--preserve-card`:

```bash
python src/scripts/dataset-deploy.py \
    --dataset-path <dataset-path> \
    --repo-id username/my-dataset \
    --preserve-card
```

Convert old checkpoint formats to Hugging Face:
```bash
python src/scripts/hf-old-conversion.py \
    --model_path <model-path> \
    --repo_id username/converted-model \
    --config_path <config-path>
```
Or pass architecture parameters on the command line:

```bash
python src/scripts/hf-old-conversion.py \
    --model_path <model-path> \
    --repo_id username/converted-model \
    --n_layer 12 \
    --n_embd 768 \
    --attention MQA
```

Repository layout:

```
src/
├── config/              # Centralized configuration (TrainingConfig, DeployConfig)
├── templates/           # Model card templates for HuggingFace
├── scripts/             # CLI tools (hf_deploy.py, dataset-deploy.py)
├── pretrain/            # Training and inference engines
├── customTransformers/  # Custom transformer architectures
├── attention/           # Attention mechanisms (MHA, GQA, MQA)
├── FFN/                 # Feed-forward network variants
├── models/              # Model-specific training scripts
└── services/            # Cloud storage services
```
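The difference between the MHA and MQA variants under `attention/` is that MQA shares a single key/value head across all query heads, shrinking the KV cache. A minimal NumPy sketch for illustration; the modules in `src/attention/` are the authoritative implementations:

```python
# Multi-query attention (MQA) sketch: n_head query heads
# attend over one shared key/value head.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def mqa(q, k, v):
    """q: (n_head, T, d); k, v: (T, d) — a single shared head."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)      # (n_head, T, T)
    weights = softmax(scores, axis=-1)  # rows sum to 1
    return weights @ v                  # (n_head, T, d)

rng = np.random.default_rng(0)
n_head, T, d = 4, 8, 16
out = mqa(rng.normal(size=(n_head, T, d)),
          rng.normal(size=(T, d)),
          rng.normal(size=(T, d)))
print(out.shape)  # (4, 8, 16)
```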