In this story we will fine-tune qwen3-coder-next, the most powerful MoE codegen model capable of local inference as of February 2026, for use with defuss, and document how to use it with both popular IDE families, VS Code and IntelliJ, as well as with OpenCode (an open-source alternative to Claude Code). We will publish the dataset generation pipeline and the full fine-tuning pipeline, train the model on a decent GPU cluster, and benchmark the result against the base qwen3-coder-next. defuss-qwen3-next-inference will be implemented as a custom inference server backed by vLLM, MLX, or llama.cpp, running cross-platform at the highest speed possible with the lowest GPU footprint we can achieve.