Retool: from SFT to RL

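This example shows how to use retool to train a tool-enabled language model, from supervised fine-tuning (SFT) through reinforcement learning (RL). During generation, the model can call a code interpreter tool.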

Overview

The retool example provides:

Safe Python code execution in a sandbox environment
Tool registry for managing available tools
Integration with language model generation
Reward calculation for tool usage
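
As a rough illustration of how these pieces could fit together, the sketch below alternates between model output and tool execution until the model stops calling tools. It is not the code in generate_with_retool.py: `generate_with_tools`, `scripted_model`, and `fake_tool` are hypothetical names, the model is a scripted stub, and wrapping tool output in `<tool_response>` tags is an assumption made for this example.

```python
import json
import re

# Payload between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def generate_with_tools(model_step, run_tool, prompt, max_turns=4):
    """Alternate model output and tool execution; return (transcript, tool call count)."""
    transcript = prompt
    tool_calls = 0
    for _ in range(max_turns):
        completion = model_step(transcript)
        transcript += completion
        match = TOOL_CALL_RE.search(completion)
        if match is None:
            break  # no tool call, so treat this as the final answer
        try:
            call = json.loads(match.group(1))
            result = run_tool(call["name"], call["arguments"])
        except (json.JSONDecodeError, KeyError, TypeError) as exc:
            result = f"Error: malformed tool call ({exc})"
        # Assumption: tool output is fed back inside <tool_response> tags.
        transcript += f"\n<tool_response>\n{result}\n</tool_response>\n"
        tool_calls += 1
    return transcript, tool_calls


def scripted_model(transcript):
    """Hypothetical stand-in for a language model: one tool call, then an answer."""
    if "<tool_response>" not in transcript:
        return (
            "<tool_call>\n"
            '{"name": "code_interpreter", "arguments": {"code": "print(2 + 2)"}}\n'
            "</tool_call>"
        )
    return "The answer is 4."


def fake_tool(name, arguments):
    """Hypothetical stand-in for the sandbox; always reports the same output."""
    return "4"


transcript, n_calls = generate_with_tools(scripted_model, fake_tool, "What is 2 + 2?\n")
print(n_calls)
print(transcript)
```

A reward function for tool usage could then take the transcript and the tool call count as inputs; how the example actually scores them is not specified here.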

Files

generate_with_retool.py: Main generation function with tool support
tool_sandbox.py: Tool execution and safety management
sft_data_processing.py: Processes the SFT dataset

Usage

Setup and download datasets:

cd slime
pip install -e . --no-deps
# For the SFT part. To skip SFT, download the SFT model below and go straight to RL.
hf download --repo-type dataset JoeYing/ReTool-SFT --local-dir /root/JoeYing/ReTool-SFT
hf download Qwen/Qwen3-4B-Instruct-2507 --local-dir /root/Qwen/Qwen3-4B-Instruct-2507

# For the RL part
hf download --repo-type dataset zhuzilin/dapo-math-17k --local-dir /root/dapo-math-17k
hf download --repo-type dataset zhuzilin/aime-2024 --local-dir /root/aime-2024
# Download our SFT model if you want to skip SFT
hf download font-info/qwen3-4b-sft-SGLang-RL --local-dir /root/font-info/qwen3-4b-sft

Convert the base model to a torch dist checkpoint for SFT:

source scripts/models/qwen3-4B.sh
PYTHONPATH=/root/Megatron-LM python tools/convert_hf_to_torch_dist.py \
    ${MODEL_ARGS[@]} \
    --hf-checkpoint /root/Qwen/Qwen3-4B-Instruct-2507 \
    --rotary-base 5000000 \
    --save /root/Qwen/Qwen3-4B-Instruct-2507_torch_dist

Or, for RL only, convert the SFT model instead:

source scripts/models/qwen3-4B.sh
PYTHONPATH=/root/Megatron-LM python tools/convert_hf_to_torch_dist.py \
    ${MODEL_ARGS[@]} \
    --hf-checkpoint /root/font-info/qwen3-4b-sft \
    --rotary-base 5000000 \
    --save /root/font-info/qwen3-4b-sft_torch_dist

SFT:

python examples/retool/sft_data_processing.py
bash examples/retool/retool_qwen3_4b_sft.sh

RL:

bash examples/retool/retool_qwen3_4b_rl.sh

To use it in your own training scripts, import the generate and reward functions:

from generate_with_retool import generate, reward_func

Tool Format

The system uses the following tool format:

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{"type": "function", "function": {"name": "code_interpreter", "description": "A tool for executing code.", "parameters": {"type": "object", "properties": {"code": {"type": "string", "description": "The code to execute."}}, "required": ["code"]}}}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
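
To make the format concrete, here is a minimal sketch of extracting tool calls from a model response. It assumes only the `<tool_call>` format shown above; `parse_tool_calls` is a hypothetical helper written for illustration, not a function exported by this example.

```python
import json
import re

# Payload between <tool_call> and </tool_call>, across newlines.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return (name, arguments) pairs for every well-formed tool call in text."""
    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            obj = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        if not isinstance(obj, dict):
            continue
        name = obj.get("name")
        arguments = obj.get("arguments")
        # Keep only calls with a string name and a JSON-object argument set.
        if isinstance(name, str) and isinstance(arguments, dict):
            calls.append((name, arguments))
    return calls


sample = (
    "Let me compute this.\n"
    "<tool_call>\n"
    '{"name": "code_interpreter", "arguments": {"code": "print(2 + 2)"}}\n'
    "</tool_call>"
)
print(parse_tool_calls(sample))
```

Malformed payloads are skipped silently in this sketch; a training setup might instead want to count them, for example to penalize badly formed calls.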

Safety Features

Code execution in isolated sandbox
Memory and time limits
Dangerous operation detection
Allowed module restrictions
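
The sketch below illustrates two of these ideas, an allowed-module check and a time limit, using only the Python standard library. It is not the implementation in tool_sandbox.py: `ALLOWED_MODULES`, `BANNED_CALLS`, `check_code`, and `run_code` are hypothetical names chosen for this example, and memory limits are left out.

```python
import ast
import subprocess
import sys

ALLOWED_MODULES = {"math", "itertools", "fractions"}  # hypothetical allowlist
BANNED_CALLS = {"eval", "exec", "open", "__import__"}  # hypothetical denylist


def check_code(code):
    """Return the reasons the code is rejected; an empty list means it passed."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    problems = []
    for node in ast.walk(tree):
        # Collect the top-level module names that this node imports, if any.
        if isinstance(node, ast.Import):
            modules = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [(node.module or "").split(".")[0]]
        else:
            modules = []
        for module in modules:
            if module not in ALLOWED_MODULES:
                problems.append(f"module not allowed: {module}")
        # Flag direct calls to banned built-ins such as eval(...) or open(...).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in BANNED_CALLS
        ):
            problems.append(f"call not allowed: {node.func.id}")
    return problems


def run_code(code, timeout_s=5.0):
    """Run code in a separate Python process and return its output or an error string."""
    problems = check_code(code)
    if problems:
        return "Rejected: " + "; ".join(problems)
    try:
        proc = subprocess.run(
            [sys.executable, "-I", "-c", code],  # -I runs Python in isolated mode
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "Error: execution timed out"
    return proc.stdout if proc.returncode == 0 else "Error: " + proc.stderr


print(run_code("import math\nprint(math.factorial(5))"))
print(run_code("import os\nprint(os.getcwd())"))
```

A static check like this is easy to bypass and is not a security boundary on its own, which is why it would normally be paired with process-level isolation and resource limits.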
