Open
Labels: Story (Next iteration summary and TODO list)
Description
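Using this model's full feature set is somewhat cumbersome. The problem is that there is no single unified model (voice editing has not been open-sourced), so a different model has to be loaded for each capability.
I don't quite understand why Table 1 in the paper shows that the VD (VoiceDesign) and CV (CustomVoice) models cannot clone voices. In principle, even without support for a cloning instruction, cloning should still be possible purely through in-context learning (ICL), by concatenating the reference sample into the context.
Plan: implement an ICL clone mode that supports cloning a voice with the VoiceDesign and CustomVoice models. Quality may be poor, but it would still be very practical when there is not enough VRAM to load another model.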
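To make the context-concatenation idea concrete, here is a minimal sketch of how an ICL clone prompt could be assembled. It is not the Qwen3-TTS API: `ICLPrompt`, `build_icl_prompt`, and `strip_prefix` are hypothetical names, and the token layout (reference transcript followed by target text, with the reference audio codec tokens as a prefix the model continues from) is an assumption about how such a mode might work.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ICLPrompt:
    """Hypothetical container for an in-context cloning prompt."""
    text_ids: List[int]      # reference transcript tokens + target text tokens
    audio_prefix: List[int]  # codec tokens of the reference audio


def build_icl_prompt(ref_text_ids: List[int],
                     ref_audio_codes: List[int],
                     target_text_ids: List[int]) -> ICLPrompt:
    """Concatenate the reference sample and the target text into one context.

    Assumption: the model continues the audio stream after the reference
    codes, so the continuation should carry over the reference voice.
    """
    if not ref_audio_codes:
        raise ValueError("ICL cloning needs a non-empty reference audio prefix")
    return ICLPrompt(
        text_ids=list(ref_text_ids) + list(target_text_ids),
        audio_prefix=list(ref_audio_codes),
    )


def strip_prefix(generated_codes: List[int], prompt: ICLPrompt) -> List[int]:
    """Drop the reference audio tokens so only the newly generated speech remains."""
    n = len(prompt.audio_prefix)
    if generated_codes[:n] != prompt.audio_prefix:
        raise ValueError("generated sequence does not start with the reference prefix")
    return generated_codes[n:]
```

With this layout, the same routine could in principle be pointed at a VoiceDesign or CustomVoice checkpoint instead of loading a separate Base model, which is the VRAM saving the plan is after. Whether those checkpoints actually follow the reference voice is the open question.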
Table 1: Overview of the Qwen3-TTS model family.
| Model Name | Streaming | Multilinguality | Voice Clone | Instruction Following |
|---|---|---|---|---|
| Qwen3-TTS-12Hz-1.7B-Base | ✓ | ✓ | ✓ | |
| Qwen3-TTS-12Hz-1.7B-VoiceDesign | ✓ | ✓ | ✓ | |
| Qwen3-TTS-12Hz-1.7B-CustomVoice | ✓ | ✓ | ✓ | |
| Qwen3-TTS-12Hz-0.6B-Base | ✓ | ✓ | ✓ | |
| Qwen3-TTS-12Hz-0.6B-CustomVoice | ✓ | ✓ | | |
| Qwen3-TTS-25Hz-1.7B-Base | ✓ | ✓ | ✓ | |
| Qwen3-TTS-25Hz-1.7B-VoiceEditing | ✓ | ✓ | ✓ | ✓ |
| Qwen3-TTS-25Hz-1.7B-CustomVoice | ✓ | ✓ | ✓ | |
| Qwen3-TTS-25Hz-0.6B-Base | ✓ | ✓ | ✓ | |
| Qwen3-TTS-25Hz-0.6B-CustomVoice | ✓ | ✓ | | |