This is a bare-bones language model trainer. Feed it whatever text you want and it’ll train a tiny GPT-like model from scratch.
There’s a GPU test, a training script that spits out samples as it learns, and a CLI to poke the trained model. Everything’s plain Python, easy to tweak, and meant as a simple playground for small models.
- Prepare data: put your plain-text corpus (UTF-8) in `trainingdata_corpus.txt`.
- Train:

  ```
  venv\Scripts\activate
  pip install -r requirements.txt
  rem optional GPU check
  python step0-testcuda.py
  python train.py
  ```

- Chat (command line):

  ```
  python CLI.py
  ```

- CPU only? After training, edit `config.json` to include `"providers": ["CPUExecutionProvider"]`, then run the CLI.
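For a quick smoke test of the data step, any UTF-8 text will do. A minimal sketch (the sample text is arbitrary, not part of this repo):

```python
# Write a tiny throwaway corpus just to exercise the pipeline.
# Real training wants a much larger file.
sample = "To be, or not to be, that is the question.\n" * 200

with open("trainingdata_corpus.txt", "w", encoding="utf-8") as f:
    f.write(sample)

print(f"wrote {len(sample)} characters")
```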
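The contents of `step0-testcuda.py` aren't shown here, but a GPU sanity check along these lines (assuming PyTorch, which this kind of trainer typically uses) is what such a script usually boils down to:

```python
# Hedged sketch of a CUDA sanity check -- not necessarily what
# step0-testcuda.py actually contains.
try:
    import torch
    available = torch.cuda.is_available()
    name = torch.cuda.get_device_name(0) if available else "none"
except ImportError:
    # PyTorch missing entirely; training would fall back to CPU anyway.
    available, name = False, "torch not installed"

print(f"CUDA available: {available} (device: {name})")
```

If this prints `False`, training still works but runs on CPU, which is much slower.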
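The CPU-only edit can also be scripted. A minimal sketch, assuming `config.json` is a flat JSON object (the provider name is ONNX Runtime's `CPUExecutionProvider`, as the README's snippet suggests the CLI uses ONNX Runtime):

```python
import json

# Patch config.json so the CLI runs on CPU only.
try:
    with open("config.json", "r", encoding="utf-8") as f:
        cfg = json.load(f)
except FileNotFoundError:
    cfg = {}  # no config yet; start from an empty object

cfg["providers"] = ["CPUExecutionProvider"]  # CPU-only inference

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(cfg, f, indent=2)
```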