This is the run script:
run_qwen3_4b_openclaw_opd_topk_lora.sh
Running it directly fails to start the main inference server:
(SGLangEngine pid=3080582) [2026-04-23 11:02:56] INFO: Started server process [3080920]
(SGLangEngine pid=3080582) [2026-04-23 11:02:56] INFO: Waiting for application startup.
(SGLangEngine pid=3080582) [2026-04-23 11:02:56] Using default chat sampling params from model generation config: {'repetition_penalty': 1.0, 'temperature': 0.6, 'top_k': 20, 'top_p': 0.95}
(SGLangEngine pid=3080582) thread '' (3080920) panicked at /root/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/rayon-core-1.13.0/src/registry.rs:171:10:
(SGLangEngine pid=3080582) The global thread pool has not been initialized. ThreadPoolBuildError { kind: IOError(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }) }
(SGLangEngine pid=3080582) note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
(SGLangEngine pid=3080582) [2026-04-23 11:02:56] ERROR: Traceback (most recent call last):
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/qwenpaw-rl-dev/lib/python3.12/site-packages/starlette/routing.py", line 694, in lifespan
(SGLangEngine pid=3080582) async with self.lifespan_context(app) as maybe_state:
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/python3.12/lib/python3.12/contextlib.py", line 210, in __aenter__
(SGLangEngine pid=3080582) return await anext(self.gen)
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/qwenpaw-rl-dev/lib/python3.12/site-packages/fastapi/routing.py", line 201, in merged_lifespan
(SGLangEngine pid=3080582) async with original_context(app) as maybe_original_state:
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/python3.12/lib/python3.12/contextlib.py", line 210, in __aenter__
(SGLangEngine pid=3080582) return await anext(self.gen)
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/zhongkaipeng/qwenpaw_train/own_train_packages/sglang-d566816d838ce92d3ae044209f7d67eaa58ce74a/python/sglang/srt/entrypoints/http_server.py", line 308, in lifespan
(SGLangEngine pid=3080582) fast_api_app.state.openai_serving_rerank = OpenAIServingRerank(
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/zhongkaipeng/qwenpaw_train/own_train_packages/sglang-d566816d838ce92d3ae044209f7d67eaa58ce74a/python/sglang/srt/entrypoints/openai/serving_rerank.py", line 212, in __init__
(SGLangEngine pid=3080582) self._yes_token_id, self._no_token_id = _get_yes_no_token_ids(
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/zhongkaipeng/qwenpaw_train/own_train_packages/sglang-d566816d838ce92d3ae044209f7d67eaa58ce74a/python/sglang/srt/entrypoints/openai/serving_rerank.py", line 30, in _get_yes_no_token_ids
(SGLangEngine pid=3080582) yes_tokens = tokenizer.encode("yes", add_special_tokens=False)
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/qwenpaw-rl-dev/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2732, in encode
(SGLangEngine pid=3080582) encoded_inputs = self.encode_plus(
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/qwenpaw-rl-dev/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 3123, in encode_plus
(SGLangEngine pid=3080582) return self._encode_plus(
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/qwenpaw-rl-dev/lib/python3.12/site-packages/transformers/tokenization_utils_fast.py", line 627, in _encode_plus
(SGLangEngine pid=3080582) batched_output = self._batch_encode_plus(
(SGLangEngine pid=3080582) File "/var/ai-cloud/project/qwenpaw-rl-dev/lib/python3.12/site-packages/transformers/tokenization_utils_fast.py", line 553, in _batch_encode_plus
(SGLangEngine pid=3080582) encodings = self._tokenizer.encode_batch(
(SGLangEngine pid=3080582) pyo3_runtime.PanicException: The global thread pool has not been initialized. ThreadPoolBuildError { kind: IOError(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" }) }
(SGLangEngine pid=3080582) [2026-04-23 11:02:56] ERROR: Application startup failed. Exiting.
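The panic above is the Rust rayon thread pool inside the HF tokenizers library failing with EAGAIN ("Resource temporarily unavailable") while spawning worker threads. A quick diagnostic (my own sketch, not part of the original report) is to check whether the container's process/thread limit is low, since on Linux the max-user-processes rlimit also counts threads:

```shell
# Diagnostic sketch: EAGAIN on thread creation usually means the
# max-user-processes limit (which counts threads on Linux) is exhausted
# or set very low inside the container. Inspect the per-process limit:
ulimit -u
# And the system-wide thread ceiling:
cat /proc/sys/kernel/threads-max
```

If `ulimit -u` is small relative to the number of engine processes times CPU cores, rayon's attempt to spawn one worker per core will hit EAGAIN exactly as shown in the traceback.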
If one line is added:
export TOKENIZERS_PARALLELISM=false
then the script starts up normally, but every time qwenpaw calls a tool, the PRM reports a CUDA OOM error and the PRM inference service on one of the GPUs crashes.
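For reference, the workaround can be sketched as follows (assumption: these lines go near the top of run_qwen3_4b_openclaw_opd_topk_lora.sh, before the SGLang engine is launched; the RAYON_NUM_THREADS line is an additional, optional knob not mentioned in the original report):

```shell
# TOKENIZERS_PARALLELISM=false makes HF tokenizers skip building its
# rayon thread pool, avoiding the EAGAIN panic seen at startup.
export TOKENIZERS_PARALLELISM=false
# Optional: cap rayon's worker count in case other Rust components still
# try to spawn one thread per core under a tight process/thread limit.
export RAYON_NUM_THREADS=4
```

Note this only sidesteps the thread-pool panic; the subsequent PRM CUDA OOM on tool calls looks like a separate memory-budgeting issue and likely needs its own fix (e.g. reserving less KV cache for the PRM engine).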