vavi-speech2

Text to Speech and Speech to Text (JSAPI2) engines for Java

| Type | Description | Synthesizer | Recognizer | Quality | Comment |
|------|-------------|-------------|------------|---------|---------|
| AquesTalk10 | AquesTalk, JNA | ✅ | - | 😐 | ゆっくり |
| Google Cloud Text To Speech | Google Cloud Text To Speech, Library | ✅ | 🚧 | 👑 | |
| Cocoa | Rococoa, JNA | ✅ | 🚫 | 😐 | |
| Open JTalk | jtalkdll, JNA | ✅ | - | 💩 | |
| VoiceVox | VOICEVOX, REST | ✅ | - | 😃 | ずんだもん |
| CoeiroInk | CoeiroInk, REST | ✅ | - | 😃 | つくよみちゃん |
| Gyutan (Open JTalk in Java) | Gyutan, Library | ✅ | - | 💩 | |
| AivisSpeech | Aivis Project, REST | ✅ | - | 👑 | |
| Google AI Studio | Google Gemini API, Library | ✅ | - | 🚀 | |
| Qwen3-TTS | OpenAI API, Library | ✅ | - | 💡 | voice cloning! |

Install

maven

https://jitpack.io/#umjammer/vavi-speech2

AquesTalk10

place AquesTalk10.framework in ~/Library/Frameworks
create a symbolic link named AquesTalk10.framework/AquesTalk10 that points to AquesTalk10.framework/AquesTalk
write aquesTalk10DevKey to local.properties
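
The three steps above can be sanity-checked with a small standalone program. This is only a sketch: `AquesTalkSetupCheck` is a hypothetical helper that is not part of vavi-speech2, and it assumes `local.properties` sits in the current working directory.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical setup check; not part of vavi-speech2.
class AquesTalkSetupCheck {

    // Returns the aquesTalk10DevKey value from properties text, or null if the key is absent.
    static String readDevKey(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("aquesTalk10DevKey");
    }

    public static void main(String[] args) throws IOException {
        Path framework = Path.of(System.getProperty("user.home"),
                "Library", "Frameworks", "AquesTalk10.framework");
        System.out.println("framework directory found: " + Files.isDirectory(framework));
        // Files.exists follows symbolic links, so this is true only if the link target exists.
        System.out.println("AquesTalk10 link resolves: "
                + Files.exists(framework.resolve("AquesTalk10")));

        // Assumption: local.properties is in the current working directory.
        Path local = Path.of("local.properties");
        if (Files.isReadable(local)) {
            try (Reader reader = Files.newBufferedReader(local)) {
                System.out.println("aquesTalk10DevKey set: " + (readDevKey(reader) != null));
            }
        } else {
            System.out.println("local.properties not found in " + Path.of("").toAbsolutePath());
        }
    }
}
```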

Google Cloud Text To Speech

get the token as a JSON file
set the system property "vavi.speech.googlecloud.credential" to your_json_path
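
A minimal sketch of setting that property from code, as an alternative to passing `-Dvavi.speech.googlecloud.credential=your_json_path` on the command line. `GoogleCloudCredentialSetup` is a hypothetical helper, and the readability check is an addition for illustration. It assumes the property is set before the engine is created.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of vavi-speech2.
class GoogleCloudCredentialSetup {

    static final String KEY = "vavi.speech.googlecloud.credential";

    // Points the system property at the JSON file, failing early if the file is unreadable.
    static void configure(Path jsonPath) {
        if (!Files.isReadable(jsonPath)) {
            throw new IllegalArgumentException("credential file not readable: " + jsonPath);
        }
        System.setProperty(KEY, jsonPath.toAbsolutePath().toString());
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: GoogleCloudCredentialSetup your_json_path");
            return;
        }
        configure(Path.of(args[0]));
        System.out.println(KEY + "=" + System.getProperty(KEY));
    }
}
```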

Open JTalk

build libjtalk.dylib from https://github.com/rosmarinus/jtalkdll
place libjtalk.dylib on the Java classpath or in a directory listed in the jna.library.path system property
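
If the `jna.library.path` route is used, the directory can be appended from code before any native library is loaded. The sketch below uses only the JDK. `JnaLibraryPath` and the `/usr/local/lib` fallback are hypothetical and used purely for illustration.

```java
import java.io.File;

// Hypothetical helper; not part of vavi-speech2 or JNA.
class JnaLibraryPath {

    // Appends dir to an existing search path, using the platform path separator.
    static String append(String current, String dir) {
        return (current == null || current.isEmpty()) ? dir : current + File.pathSeparator + dir;
    }

    public static void main(String[] args) {
        // Hypothetical location of libjtalk.dylib; pass the real directory as the first argument.
        String dir = args.length > 0 ? args[0] : "/usr/local/lib";
        String updated = append(System.getProperty("jna.library.path"), dir);
        // Must run before the first JNA library load to have any effect.
        System.setProperty("jna.library.path", updated);
        System.out.println("jna.library.path=" + updated);
    }
}
```

The same result is available without code by starting the JVM with `-Djna.library.path=...`.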

VOICEVOX

download the application
run the application before using this library
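
Because the REST-based engines only work while their application is running, a quick connection probe can save some debugging. The following is a sketch using plain sockets. `EngineProbe` is a hypothetical helper, and the URL falls back to the VOICEVOX default listed under Usage.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URI;

// Hypothetical helper; not part of vavi-speech2.
class EngineProbe {

    // Port named in the URL, or 80 when the URL has none (assumes plain http).
    static int portOf(String url) {
        int port = URI.create(url).getPort();
        return port == -1 ? 80 : port;
    }

    // True if a TCP connection to the URL's host and port succeeds within the timeout.
    static boolean isListening(String url, int timeoutMillis) {
        String host = URI.create(url).getHost();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, portOf(url)), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String url = System.getProperty("vavi.speech.voicevox.url", "http://localhost:50021");
        System.out.println(url + (isListening(url, 1000)
                ? " is accepting connections"
                : " is not reachable; start the application first"));
    }
}
```

The probe only confirms that something is listening on the port, not that it is the expected application.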

COEIROINK

download the application
run the application before using this library

DoCoMo AI Agent API (wip)

https://agentcraft.sebastien.ai/

AivisSpeech

download the application
run the application before using this library

Google Gemini API (Google AI Studio)

get an API key
set the environment variable "GOOGLE_API_KEY" to the API key
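
Environment variables cannot be set from inside a running JVM, so the key has to be exported by the shell or the launcher. A small fail-fast check, sketched below with the hypothetical helper `GeminiApiKeyCheck`, makes a missing key obvious before any request is attempted.

```java
import java.util.Map;

// Hypothetical helper; not part of vavi-speech2.
class GeminiApiKeyCheck {

    // Returns the key from the given environment, or throws if it is missing or blank.
    static String requireKey(Map<String, String> env) {
        String key = env.get("GOOGLE_API_KEY");
        if (key == null || key.isBlank()) {
            throw new IllegalStateException("GOOGLE_API_KEY is not set");
        }
        return key;
    }

    public static void main(String[] args) {
        try {
            requireKey(System.getenv());
            // The key itself is deliberately not printed.
            System.out.println("GOOGLE_API_KEY is set");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```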

Qwen3-TTS

install https://github.com/umjammer/Qwen3-TTS-Openai-Fastapi (any server works as long as it exposes an OpenAI-compatible API)
run the server before using this library; don't forget to adjust the port number
the default URL is http://localhost:50090. it can be changed with the system property vavi.speech.qwen3tts.url

Usage

system property

vavi.speech.voicevox.url ... the VOICEVOX api server url, default is http://localhost:50021.
vavi.speech.coeiroink.url ... the COEIROINK api server url, default is http://localhost:50032.
vavi.speech.aivis.url ... the AivisSpeech api server url, default is http://localhost:10101.
vavi.speech.qwen3tts.url ... the Qwen3-TTS api server url, default is http://localhost:50090.
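
The lookup-with-default behaviour described above can be illustrated with `System.getProperty(key, default)`. This is a sketch of the idea rather than the library's actual code. `SpeechEndpoints` is a hypothetical class, while the keys and default URLs are the ones listed above.

```java
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration; not the library's own lookup code.
class SpeechEndpoints {

    // Property keys and default URLs as documented in this README.
    static final Map<String, String> DEFAULTS = Map.of(
            "vavi.speech.voicevox.url", "http://localhost:50021",
            "vavi.speech.coeiroink.url", "http://localhost:50032",
            "vavi.speech.aivis.url", "http://localhost:10101",
            "vavi.speech.qwen3tts.url", "http://localhost:50090");

    // Returns the system property if set, otherwise the documented default.
    static String resolve(String key) {
        return System.getProperty(key, DEFAULTS.get(key));
    }

    public static void main(String[] args) {
        for (String key : new TreeSet<>(DEFAULTS.keySet())) {
            System.out.println(key + " = " + resolve(key));
        }
    }
}
```

Running it with, for example, `-Dvavi.speech.voicevox.url=http://localhost:12345` shows the override taking effect for that one engine.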

system property (qwen3-tts specific)

vavi.speech.qwen3tts.clone ... whether to use a cloned voice, default is false.
vavi.speech.qwen3tts.refAudio ... path to the reference audio file, used when cloning a voice. (only WAV files are tested)
vavi.speech.qwen3tts.refText ... reference text, used when cloning a voice. (transcription of the audio above)
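
A minimal sketch of switching voice cloning on from code. `Qwen3CloneSettings` is a hypothetical helper. It assumes the flag is enabled with the string `true` (the README only states that the default is false) and that the properties are set before the engine is created.

```java
import java.nio.file.Path;

// Hypothetical helper; not part of vavi-speech2.
class Qwen3CloneSettings {

    // Sets the three clone-related properties documented above.
    static void enableClone(Path refAudio, String refText) {
        // Assumption: "true" is the value that enables cloning.
        System.setProperty("vavi.speech.qwen3tts.clone", "true");
        System.setProperty("vavi.speech.qwen3tts.refAudio", refAudio.toString());
        System.setProperty("vavi.speech.qwen3tts.refText", refText);
    }

    public static void main(String[] args) {
        // Hypothetical reference file and transcription; replace with your own.
        enableClone(Path.of("reference.wav"), "transcription of reference.wav");
        System.out.println("clone=" + System.getProperty("vavi.speech.qwen3tts.clone"));
        System.out.println("refAudio=" + System.getProperty("vavi.speech.qwen3tts.refAudio"));
    }
}
```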

user

Reference

jsr113
- vavi patched (volume enabled)

TODO

~~speech.properties~~
engine
- watson
- ~~open jtalk~~
  - ~~https://github.com/icn-lab/Gyutan~~ (done)
- festival
- amazon polly
- microsoft cognitive services text to speech
- ~~https://github.com/julius-speech/julius~~ -> Gyutan
- ~~VoiceVox~~
  - ~~search レキシカ voice and parameter~~ (wip)
    - vavi.speech.voicevox.VoiceVoxTest#test5
    - RekishikaTest
- https://github.com/espeak-ng/espeak-ng
- https://github.com/festvox/flite
text analytics + nicotalk character emotion (nicotalk branch)
- wave lipsync
  - https://github.com/hecomi/MMD4Mecanim-LipSync-Plugin/blob/master/Assets/LipSync/Core/LipSyncCore.cs
VoiceVox editor compatible
- ~~CoeiroInk~~ ... ~~api doesn't work~~ ~~api is different from VoiceVox?~~ yes
  - https://github.com/sevenc-nanashi/coeiroink-v2-bridge 🎯
  - ~~https://github.com/sinsen9000/MultiSpeech~~ api is old
- LMROID
- SHAREVOX
- http://itvoice.starfree.jp/
~~AVSpeechSynthesizer needs obj-c block~~
~~rpc client/server (wip)~~ -> vavi-speech-rpc
~~[googlecloud] setting by system property instead of env~~

images by 霊夢, 魔理沙, ずんだもん
