Add GLM5 SFT support by samaritan1998 · Pull Request #1844 · THUDM/slime

samaritan1998 · 2026-04-20T04:50:15Z

Summary

Add a GLM5-specific SFT loss mask type that follows GLM-style stop markers.
Add a GLM5 SFT launch script using sft_loss and the existing GLM5 Megatron model config.
Add unit coverage for multi-turn GLM5 masking, tool calls, and step_loss_mask handling.

Ran GLM5 mask tests and adjacent Qwen3.5 mask tests via direct Python invocation with a lightweight transformers stub, because this local environment does not have pytest or transformers installed.
bash -n scripts/run-glm5-744B-A40B-sft.sh
git diff --check

samaritan1998

Local validation passed for GLM5 loss mask behavior and script syntax.

Add GLM5 SFT support

58e6e7c

samaritan1998 commented Apr 20, 2026

View reviewed changes

samaritan1998 marked this pull request as ready for review April 20, 2026 04:52