Bachelor thesis project on teacher-guided QR-DQN using distillation, preference-based reward relabeling, and behavior cloning.
python machine-learning reinforcement-learning deep-reinforcement-learning pytorch knowledge-distillation preference-learning behavior-cloning qr-dqn distributional-rl human-ai-interaction teacher-guided-learning reward-relabeling
Updated Apr 16, 2026 - Python