Qwen3 inference: how to disable thinking mode

### Checklist

- [x] I have searched existing issues, and this is a new question or discussion topic.

### Question Description

How do I disable thinking mode when running Qwen3 inference from Python code? I am using version 3.12 (documentation linked below).
https://swift.readthedocs.io/en/v3.12/Instruction/Inference-and-deployment.html
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.infer_engine import TransformersEngine, RequestConfig, InferRequest
model = 'Qwen/Qwen2.5-0.5B-Instruct'

# Load the inference engine
engine = TransformersEngine(model, max_batch_size=2)
request_config = RequestConfig(max_tokens=512, temperature=0)

# Using 2 infer_requests to demonstrate batch inference
infer_requests = [
    InferRequest(messages=[{'role': 'user', 'content': 'Who are you?'}]),
    InferRequest(messages=[{'role': 'user', 'content': 'Where is the capital of Zhejiang?'},
                           {'role': 'assistant', 'content': 'The capital of Zhejiang Province, China, is Hangzhou.'},
                           {'role': 'user', 'content': 'What are some fun places here?'}]),
]
resp_list = engine.infer(infer_requests, request_config)
query0 = infer_requests[0].messages[0]['content']
print(f'response0: {resp_list[0].choices[0].message.content}')
print(f'response1: {resp_list[1].choices[0].message.content}')
```
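
One possible direction, offered as a rough sketch rather than a confirmed answer: the Qwen3 model card describes a `/no_think` soft switch that can be placed in a user message on the hybrid-thinking checkpoints. The helpers below (`add_no_think` and `strip_think` are hypothetical names, not part of swift) only rewrite the `messages` list and clean the returned text. Whether the engine passes the message content through to the chat template unchanged is an assumption to verify. Note also that the snippet above loads `Qwen/Qwen2.5-0.5B-Instruct`, which has no thinking mode, so a Qwen3 checkpoint would be needed to see any effect.

```python
import copy
import re

# Soft switch described in the Qwen3 model card (assumption: the loaded
# checkpoint is a hybrid-thinking Qwen3 model that honors it).
NO_THINK_TAG = '/no_think'


def add_no_think(messages):
    """Return a copy of `messages` with the soft switch appended to the last user turn."""
    patched = copy.deepcopy(messages)
    for msg in reversed(patched):
        if msg['role'] == 'user':
            if NO_THINK_TAG not in msg['content']:
                msg['content'] = f"{msg['content']} {NO_THINK_TAG}"
            break
    return patched


def strip_think(text):
    """Remove a leading <think>...</think> block (possibly empty) from a response."""
    return re.sub(r'^\s*<think>.*?</think>\s*', '', text, flags=re.DOTALL)


# Hypothetical usage with the message format from the snippet above.
messages = [{'role': 'user', 'content': 'Who are you?'}]
patched_messages = add_no_think(messages)
print(patched_messages[0]['content'])
```

The patched list could then be passed as `InferRequest(messages=patched_messages)`, and `strip_think` applied to `resp_list[0].choices[0].message.content` in case an empty think block is still emitted.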
