Currently, when given a long text input, qwen3 tts simply processes all of it at once, and the output becomes empty or noisy after about 2 minutes 30 seconds. If we instead split the text into several short pieces ourselves, voice consistency between the pieces is broken.
I suggest adding a function that automatically splits the text inside ctx and uses the last sentence or last few words of each piece as the voice clone reference when starting the next piece. That way voice consistency is maintained, computation is faster, and long audio becomes easy to create.
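To make the request concrete, here is a rough sketch of the idea in Python. It is not qwen3 tts code: `split_text`, `synthesize_long`, the `synth` callable, and the 200-character budget are all hypothetical names and values chosen for illustration, and the real model call would have to be plugged in behind `synth`. For simplicity it passes the whole previous piece as the clone reference; trimming that down to the last sentence or few words would need some way to cut the audio at the matching position.

```python
import re
from typing import Callable, List, Optional

# Hypothetical signature for the TTS call: (text, ref_audio, ref_text) -> audio bytes.
# It stands in for whatever the real voice-clone entry point turns out to be.
Synth = Callable[[str, Optional[bytes], Optional[str]], bytes]


def split_text(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into pieces of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own piece.
    The sentence splitting is naive (punctuation based) and only for illustration.
    """
    # Split after ., !, ? followed by whitespace, or directly after CJK punctuation.
    parts = re.split(r"(?<=[.!?])\s+|(?<=[。！？])", text.strip())
    sentences = [p.strip() for p in parts if p and p.strip()]

    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # current piece is full, start a new one
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize_long(text: str, synth: Synth, max_chars: int = 200) -> List[bytes]:
    """Synthesize piece by piece, feeding each piece's output to the next as reference."""
    ref_audio: Optional[bytes] = None
    ref_text: Optional[str] = None
    outputs: List[bytes] = []
    for chunk in split_text(text, max_chars):
        audio = synth(chunk, ref_audio, ref_text)
        outputs.append(audio)
        # Assumption: the previous piece (audio + transcript) serves as the clone reference.
        ref_audio, ref_text = audio, chunk
    return outputs
```

The first piece has no reference, so it would use whatever voice or prompt the user originally supplied; every later piece is conditioned on the one before it, which is what should keep the voice from drifting between pieces.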