SpeakSwiftly

A Swift package providing local, multi-speaker text-to-speech via a typed Swift API and a JSONL worker surface. Includes voice creation by design or by clone, as well as custom text normalization via TextForSpeech.

Overview

SpeakSwiftly is a TTS-in-a-box solution for Swift app devs. It ships both an importable library product and a worker executable. The library gives Swift callers a typed runtime surface, while the executable gives non-Swift hosts (Python, Rust, etc.) a newline-delimited JSON protocol over stdio.

Motivation

This project was born from my desire for a simple, "plug-and-play" TTS option for other things I'm building. It's rapidly turned into something I think others will find useful as well.

SpeakSwiftly currently supports:

Typed Swift runtime APIs through SpeakSwiftlyCore
A long-lived JSONL worker executable for non-Swift callers
Stored voice profiles and text-normalization profiles
Resident backend switching between qwen3 and marvis
Resident model unload and reload controls
Managed generated-file and generated-batch artifacts

For deeper contributor-facing architecture notes, runtime behavior details, development guidance, and full verification workflows, see CONTRIBUTING.md.

Setup

SpeakSwiftly is a standard Swift package. Its dependencies are declared in Package.swift.

Library consumers can add the package directly from GitHub:

.package(url: "https://github.com/gaelic-ghost/SpeakSwiftly.git", from: "0.9.2")

Then add SpeakSwiftlyCore to the target that will own the runtime.

SpeakSwiftlyCore also carries a vendored mlx-swift_Cmlx.bundle resource so linked consumers can resolve the packaged MLX shader bundle and bundled default.metallib without spelunking through DerivedData.

For package-local validation:

swift build

For real MLX-backed local worker runs, publish the Xcode-backed runtime first:

sh scripts/repo-maintenance/publish-runtime.sh --configuration Debug

That produces stable local runtime launchers under .local/xcode/current-debug and .local/xcode/current-release.

Usage

Typed Swift Runtime

import SpeakSwiftlyCore
import TextForSpeech

let runtime = await SpeakSwiftly.liftoff()
await runtime.start()

let handle = await runtime.generate.speech(
    text: "Hello there.",
    with: "default-femme"
)

for try await event in handle.events {
    print(event)
}

When the whole input is source code rather than prose with embedded code, use sourceFormat:

let sourceHandle = await runtime.generate.speech(
    text: "struct WorkerRuntime { let sampleRate: Int }",
    with: "default-femme",
    sourceFormat: .swift
)

The typed runtime is organized around stored concern handles that callers can keep and reuse:

runtime.generate
runtime.player
runtime.voices
runtime.normalizer
runtime.jobs
runtime.artifacts

When callers need to construct a standalone text normalizer, SpeakSwiftly.Normalizer(...) now throws if the persisted text-profile archive cannot be loaded or decoded. The worker runtime still uses a best-effort recovery path for unreadable archives so SpeakSwiftly.liftoff() can continue starting in operator-facing environments.

Runtime preferences have a matching typed surface:

import SpeakSwiftlyCore

let configuration = SpeakSwiftly.Configuration(speechBackend: .marvis)
try configuration.save(to: URL(fileURLWithPath: "/tmp/speakswiftly-configuration.json"))

let runtime = await SpeakSwiftly.liftoff(configuration: configuration)

If a host needs the packaged MLX bundle or the exact metallib path, use the support-resource surface:

let mlxBundleURL = try SpeakSwiftly.SupportResources.mlxBundleURL()
let defaultMetallibURL = try SpeakSwiftly.SupportResources.defaultMetallibURL()

Worker Executable

Launch the published runtime through the stable launcher:

sh scripts/repo-maintenance/publish-runtime.sh --configuration Debug
"$PWD/.local/xcode/current-debug/run-speakswiftly"

At startup the worker begins preloading the resident model and emits JSONL status events on stdout.
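A host written in Swift could decode those startup status lines with Codable. The following is a minimal sketch, not the package's own type: `WorkerStatusEvent` and `decodeWorkerStatus(fromLine:)` are hypothetical names, and the fields are assumed from the representative `worker_status` shapes shown in the Command Reference.

```swift
import Foundation

// Hypothetical mirror of the `worker_status` event. Field names are assumed
// from the representative JSONL shapes in this README.
struct WorkerStatusEvent: Decodable, Equatable {
    let event: String
    let stage: String
    let residentState: String
    let speechBackend: String
}

/// Decodes one stdout line, returning nil for lines that are not status events.
func decodeWorkerStatus(fromLine line: String) -> WorkerStatusEvent? {
    let decoder = JSONDecoder()
    // Maps `resident_state` to `residentState`, `speech_backend` to `speechBackend`.
    decoder.keyDecodingStrategy = .convertFromSnakeCase
    guard let status = try? decoder.decode(WorkerStatusEvent.self, from: Data(line.utf8)),
          status.event == "worker_status" else {
        return nil
    }
    return status
}
```

A caller waiting for readiness could, for example, keep reading lines until `residentState` equals "ready".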

Consumer Test Harness

The package also ships a small executable consumer harness, SpeakSwiftlyTesting, for package-level smoke tests:

swift run SpeakSwiftlyTesting resources
swift run SpeakSwiftlyTesting status
swift run SpeakSwiftlyTesting smoke

resources prints the packaged bundle and metallib paths, status constructs the typed runtime and prints the first terminal status payload it sees, and smoke runs both checks in sequence.

API Notes

The package currently publishes:

SpeakSwiftlyCore as the typed Swift runtime library
SpeakSwiftly as the worker executable

Key typed runtime entry points include:

runtime.generate.speech(text:with:textProfileName:textContext:sourceFormat:)
runtime.generate.audio(text:with:textProfileName:textContext:sourceFormat:)
runtime.generate.batch(_:with:)
runtime.voices.create(design named:from:vibe:voice:outputPath:)
runtime.voices.create(clone named:from:vibe:transcript:)
runtime.voices.list()
runtime.voices.delete(named:)
runtime.player.list()
runtime.player.pause()
runtime.player.resume()
runtime.player.state()
runtime.player.clearQueue()
runtime.player.cancelRequest(_:)
runtime.jobs.expire(id:)
runtime.jobs.generationQueue()
runtime.jobs.job(id:)
runtime.jobs.list()
runtime.artifacts.file(id:)
runtime.artifacts.files()
runtime.artifacts.batch(id:)
runtime.artifacts.batches()
SpeakSwiftly.SupportResources.bundle
SpeakSwiftly.SupportResources.mlxBundleURL()
SpeakSwiftly.SupportResources.defaultMetallibURL()
runtime.status()
runtime.switchSpeechBackend(to:)
runtime.reloadModels()
runtime.unloadModels()

The typed Swift library and the JSONL worker surface intentionally use different naming styles:

Swift keeps Cocoa-style method names that read naturally at the call site.
JSONL keeps snake_case, verb-first operation names.
JSONL read-one operations use get_*.
JSONL collection and queue reads use list_*.
JSONL CRUD-style writes use create_*, replace_*, and delete_*.
JSONL lifecycle and control operations keep literal verbs like generate_*, set_*, reload_*, unload_*, pause, resume, clear_*, cancel_*, load_*, save_*, and reset_* when the operation is not best modeled as CRUD.

Resident runtime controls currently map like this:

| Typed Swift API | JSONL `op` | Notes |
| --- | --- | --- |
| `status(id:)` | `"get_status"` | Returns the current `stage`, `resident_state`, and `speech_backend`. |
| `switchSpeechBackend(to:id:)` | `"set_speech_backend"` | Requires a `"speech_backend"` field on the JSONL request. |
| `reloadModels(id:)` | `"reload_models"` | Re-warms the currently selected resident backend. |
| `unloadModels(id:)` | `"unload_models"` | Drops resident models from memory and parks later resident-dependent generation until residency returns. |
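For a host that builds raw requests itself, the mapping above can be captured in a small enum. This is only a sketch under stated assumptions: `ResidentControlOp` and `requestFields(id:op:speechBackend:)` are hypothetical helpers, not part of SpeakSwiftlyCore, and the only rule encoded is the one noted above, that `set_speech_backend` requires a `speech_backend` field.

```swift
// Hypothetical enum; raw values are the JSONL `op` strings from the mapping above.
enum ResidentControlOp: String, CaseIterable {
    case getStatus = "get_status"
    case setSpeechBackend = "set_speech_backend"
    case reloadModels = "reload_models"
    case unloadModels = "unload_models"
}

/// Builds the string fields for a resident-control request.
/// Returns nil when `set_speech_backend` is requested without a backend name.
func requestFields(
    id: String,
    op: ResidentControlOp,
    speechBackend: String? = nil
) -> [String: String]? {
    var fields = ["id": id, "op": op.rawValue]
    if op == .setSpeechBackend {
        guard let backend = speechBackend else { return nil }
        fields["speech_backend"] = backend
    }
    return fields
}
```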

Command Reference

The worker protocol is newline-delimited JSON over standard input and output.
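Because pipe reads can split or merge lines arbitrarily, a host usually needs a small framing buffer before it can parse anything. The sketch below shows one way to do that in Swift with Foundation only. `JSONLLineBuffer` is a hypothetical helper, not something the package is known to ship.

```swift
import Foundation

/// Accumulates raw stdout bytes and yields complete newline-terminated lines.
struct JSONLLineBuffer {
    private var pending: [UInt8] = []

    /// Appends a chunk and returns every complete, non-empty line it finished.
    /// Bytes after the last newline stay buffered for the next call.
    mutating func append(_ chunk: Data) -> [String] {
        pending.append(contentsOf: chunk)
        var lines: [String] = []
        // 0x0A never occurs inside a multi-byte UTF-8 sequence, so splitting
        // on it cannot cut a character in half.
        while let newlineIndex = pending.firstIndex(of: 0x0A) {
            let line = String(decoding: pending[..<newlineIndex], as: UTF8.self)
            if !line.isEmpty {
                lines.append(line)
            }
            pending.removeSubrange(...newlineIndex)
        }
        return lines
    }
}
```

Each returned string is then one JSON document that can be handed to a decoder.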

Representative request shapes:

{"id":"req-1","op":"generate_speech","text":"Hello there","profile_name":"default-femme"}
{"id":"req-1f","op":"generate_audio_file","text":"Save this one for later playback.","profile_name":"default-femme"}
{"id":"req-batch","op":"generate_batch","profile_name":"default-femme","items":[{"text":"First saved file."},{"artifact_id":"custom-batch-artifact","text":"Second saved file.","text_profile_name":"logs"}]}
{"id":"req-text-style","op":"get_text_profile_style"}
{"id":"req-set-text-style","op":"set_text_profile_style","text_profile_style":"compact"}
{"id":"req-status","op":"get_status"}
{"id":"req-generated-file","op":"get_generated_file","artifact_id":"req-1f-artifact-1"}
{"id":"req-generated-files","op":"list_generated_files"}
{"id":"req-switch","op":"set_speech_backend","speech_backend":"marvis"}
{"id":"req-reload","op":"reload_models"}
{"id":"req-unload","op":"unload_models"}
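A Swift host that talks to the worker directly could produce these lines with JSONEncoder. The following is a minimal sketch: `GenerateSpeechRequest` and `encodeJSONLLine(_:)` are hypothetical names, and the keys are assumed from the `generate_speech` shape above. Keys are sorted for stable output, so they appear in a different order than in the example, which JSON permits.

```swift
import Foundation

// Hypothetical request type; keys mirror the representative `generate_speech` shape.
struct GenerateSpeechRequest: Encodable {
    let id: String
    let op = "generate_speech"
    let text: String
    let profileName: String
}

/// Encodes a request as exactly one JSONL line, including the trailing newline.
func encodeJSONLLine<Request: Encodable>(_ request: Request) throws -> String {
    let encoder = JSONEncoder()
    // `profileName` becomes `profile_name` on the wire.
    encoder.keyEncodingStrategy = .convertToSnakeCase
    // No pretty-printing: the default output is compact and single-line, and
    // newlines inside string values are escaped by the encoder.
    encoder.outputFormatting = [.sortedKeys]
    let data = try encoder.encode(request)
    return String(decoding: data, as: UTF8.self) + "\n"
}
```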

Representative response and event shapes:

{"event":"worker_status","stage":"warming_resident_model","resident_state":"warming","speech_backend":"qwen3"}
{"event":"worker_status","stage":"resident_model_ready","resident_state":"ready","speech_backend":"qwen3"}
{"id":"req-unload","ok":true,"status":{"event":"worker_status","stage":"resident_models_unloaded","resident_state":"unloaded","speech_backend":"qwen3"},"speech_backend":"qwen3"}
{"id":"req-after-unload","event":"queued","reason":"waiting_for_resident_models","queue_position":1}
{"id":"req-reload","ok":true,"status":{"event":"worker_status","stage":"resident_model_ready","resident_state":"ready","speech_backend":"qwen3"},"speech_backend":"qwen3"}
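Judging only from the shapes above, terminal responses carry a top-level `ok` plus the request `id`, while events carry a top-level `event` and may or may not carry an `id`. Under that assumption, a host could sort incoming lines like this. `WorkerLine` and `classify(_:)` are hypothetical names, and the full protocol may include shapes this sketch does not cover.

```swift
import Foundation

// Hypothetical classification of one worker output line.
enum WorkerLine: Equatable {
    case response(id: String, ok: Bool)
    case event(name: String, requestID: String?)
    case unrecognized
}

// Only the top-level keys needed to tell responses and events apart.
private struct Envelope: Decodable {
    let id: String?
    let ok: Bool?
    let event: String?
}

/// Classifies a line by its top-level keys. A nested `status.event` inside a
/// response is ignored because `ok` is checked first.
func classify(_ line: String) -> WorkerLine {
    guard let envelope = try? JSONDecoder().decode(Envelope.self, from: Data(line.utf8)) else {
        return .unrecognized
    }
    if let ok = envelope.ok, let id = envelope.id {
        return .response(id: id, ok: ok)
    }
    if let name = envelope.event {
        return .event(name: name, requestID: envelope.id)
    }
    return .unrecognized
}
```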

Raw JSONL callers should send absolute filesystem paths for path fields, or include cwd when using relative paths. The typed Swift helpers populate caller working-directory context automatically.
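One way for a raw caller to follow the absolute-path advice is to resolve paths before building the request. This is a small sketch using Foundation only. `absolutePath(for:relativeTo:)` is a hypothetical helper and the paths in the usage note are made up for illustration.

```swift
import Foundation

/// Returns `path` unchanged when it is already absolute; otherwise resolves it
/// against `workingDirectory`.
func absolutePath(for path: String, relativeTo workingDirectory: String) -> String {
    if path.hasPrefix("/") {
        return path
    }
    return URL(fileURLWithPath: workingDirectory, isDirectory: true)
        .appendingPathComponent(path)
        .path
}
```

For example, resolving `samples/reference.wav` against `/Users/me/project` gives `/Users/me/project/samples/reference.wav`, which can then be sent in a path field without also sending `cwd`.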

For the full wire examples, detailed event flow, and operator-facing behavior notes, see CONTRIBUTING.md.

Development

Use this repository as the primary development home for SpeakSwiftly. Keep the public README focused on product and usage information, and put contributor-facing architecture notes, repository workflow, and deep operational guidance in CONTRIBUTING.md.

For package-focused development, prefer:

swift build
swift test

For real runtime verification and published local worker workflows, use the scripts under scripts/repo-maintenance/ as described in CONTRIBUTING.md.

Verification

Baseline package verification:

swift build
swift test

Real MLX-backed runtime verification starts by publishing the Xcode-backed runtime:

sh scripts/repo-maintenance/publish-runtime.sh --configuration Debug
sh scripts/repo-maintenance/verify-runtime.sh --configuration Debug

Extended e2e, trace-capture, and deep-trace workflows are documented in CONTRIBUTING.md.

License

Apache License 2.0. See LICENSE and NOTICE.
