[feature request] enhance consistency of TTS #2132

Currently, when given a long text input, Qwen3 TTS simply processes all of it in one pass, and the output becomes empty or noisy after 2:30 (two and a half minutes). If we instead split the text into multiple short pieces and synthesize each one separately, the voice consistency between pieces is broken.


I suggest adding a function that automatically splits the text so that each piece fits inside the context (ctx), and that uses the last sentence or last few words of each piece as the voice-clone reference when starting the next piece. That way voice consistency is maintained, computation speeds up, and we can easily create long audio.
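To make the request concrete, here is a rough sketch of the idea in Python. It is only an illustration under stated assumptions, not an existing API: `split_text`, `synthesize_long`, and the `synth(text, ref_text, ref_audio)` callable are hypothetical names, and `synth` stands in for whatever voice-clone call the backend actually exposes. For simplicity it reuses the whole previous piece (text plus audio) as the reference, since trimming the audio to only the last sentence or few words would need text/audio alignment.

```python
import re
from typing import Any, Callable, List, Optional

# Hypothetical signature for a voice-clone TTS call:
# (text, reference_text, reference_audio) -> audio
Synth = Callable[[str, Optional[str], Optional[Any]], Any]


def split_text(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into pieces of at most max_chars.

    A single sentence longer than max_chars is kept as its own piece.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    pieces: List[str] = []
    current = ""
    for sentence in sentences:
        # +1 accounts for the space that joins the sentence to the piece
        if current and len(current) + 1 + len(sentence) > max_chars:
            pieces.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}" if current else sentence
    if current:
        pieces.append(current)
    return pieces


def synthesize_long(text: str, synth: Synth, max_chars: int = 200) -> List[Any]:
    """Synthesize piece by piece, passing the previous piece as the clone reference."""
    audio_parts: List[Any] = []
    ref_text: Optional[str] = None
    ref_audio: Optional[Any] = None
    for piece in split_text(text, max_chars):
        audio = synth(piece, ref_text, ref_audio)
        audio_parts.append(audio)
        # Carry the piece just generated forward as the next reference
        ref_text, ref_audio = piece, audio
    return audio_parts
```

The first piece gets no reference (`None`), so a real implementation would probably let the user pass an initial reference voice there. The returned list of audio parts would then be concatenated by the caller.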
