feat: complete on-device Moshi voice inference for iOS by LegalPrimes · Pull Request #403 · kyutai-labs/moshi

LegalPrimes · 2026-02-14T17:53:11Z

Summary

- Add HuggingFace model download script for LM (q8 GGUF), Mimi codec, and tokenizer
- Rebuild XCFramework with all 16 C functions in the header (5 were missing); symbol stripping reduces binary size from ~128MB to ~46MB per slice
- Create idiomatic Swift FFI wrapper (MoshiFFI.swift) with a thread-safe serial dispatch queue, ARC cleanup, and proper error handling
- Add macOS Rust integration test harness for end-to-end inference verification

Test plan

- bash -n rust/scripts/download-models.sh: syntax check
- ./rust/scripts/build-xcframework.sh: full XCFramework rebuild with verification
- cargo test --package moshi-ios: unit tests pass
- cargo test --package moshi: existing tests pass
- Integration tests compile and are #[ignore]d by default (they require model files to run: MOSHI_TEST_LM_PATH=... cargo test --package moshi-ios --test inference_test -- --ignored)

🤖 Generated with Claude Code

Adds moshi_mlx.bridge module for JSON-RPC communication with MoshiMLXBackend.swift:

Protocol:
- Stdin commands: initialize, start, stop, audio (base64), get_status
- Stdout events: loading, ready, user_text, model_text, audio, lag, error

Features:
- Async stdin/stdout JSON-RPC loop
- Multiprocessing for model inference (server process)
- Graceful SIGTERM handling
- Progress reporting during model loading
- Base64 audio encoding/decoding
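The command and event names above can be sketched as a small line-oriented dispatcher. This is a minimal illustration, not the module's actual code: the JSON field names ("command", "event", "data", "message") and the helper names are assumptions, and the model call is replaced by an echo.

```python
import base64
import json

# Command and event names as listed in the commit message.
COMMANDS = {"initialize", "start", "stop", "audio", "get_status"}
EVENTS = {"loading", "ready", "user_text", "model_text", "audio", "lag", "error"}


def parse_command(line):
    """Parse one stdin line into (command, audio_bytes_or_None).

    The "command" and "data" field names are assumptions for illustration.
    """
    msg = json.loads(line)
    if not isinstance(msg, dict):
        raise ValueError("expected a JSON object")
    command = msg.get("command")
    if command not in COMMANDS:
        raise ValueError(f"unknown command: {command!r}")
    audio = None
    if command == "audio":
        # Audio travels as base64 text; validate=True rejects stray characters.
        audio = base64.b64decode(msg["data"], validate=True)
    return command, audio


def make_event(event, **fields):
    """Serialize one stdout event as a single JSON line."""
    if event not in EVENTS:
        raise ValueError(f"unknown event: {event!r}")
    return json.dumps({"event": event, **fields})


def handle_line(line):
    """Turn one command line into an event line, reporting failures as errors."""
    try:
        command, audio = parse_command(line)
    except (ValueError, KeyError, TypeError) as exc:
        return make_event("error", message=str(exc))
    if command == "audio":
        # Placeholder: echo the audio back instead of running the model.
        return make_event("audio", data=base64.b64encode(audio).decode("ascii"))
    # Placeholder reply for every other command in this sketch.
    return make_event("ready")
```

Keeping parsing and serialization in pure functions like these makes the protocol testable without spawning the inference process; the async stdin/stdout loop would only need to call `handle_line` per line.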

- Downloads LM weights (model.q8.gguf) and tokenizer from kyutai/moshika-candle-q8
- Downloads Mimi codec (model.safetensors) from kyutai/mimi
- Idempotent: skips files that already exist
- Configurable output dir via CLI arg or MOSHI_MODEL_DIR env var
- Warns when HF_TOKEN is not set for gated repos

Task: fn-1-complete-moshi-on-device-voice.1
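The skip-if-present behaviour and output directory lookup described above could look roughly like the following. The real script is a shell script; this is only a sketch of the logic, assuming the CLI argument takes precedence over MOSHI_MODEL_DIR and using a hypothetical `fetch` callable and a hypothetical `models` default directory in place of the actual download step.

```python
import os
from pathlib import Path

# (repo, filename) pairs named in the commit message; the tokenizer filename
# is not given there, so it is left out of this sketch.
FILES = [
    ("kyutai/moshika-candle-q8", "model.q8.gguf"),
    ("kyutai/mimi", "model.safetensors"),
]


def resolve_model_dir(cli_arg=None, env=None, default="models"):
    """Pick the output directory: CLI argument, then MOSHI_MODEL_DIR, then default.

    The precedence order and the default name are assumptions.
    """
    env = os.environ if env is None else env
    return Path(cli_arg or env.get("MOSHI_MODEL_DIR") or default)


def download_missing(out_dir, fetch, files=FILES):
    """Fetch only the files that are not already present; return what was fetched."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    fetched = []
    for repo, name in files:
        target = out_dir / name
        if target.exists():
            continue  # idempotent: a second run does no work
        fetch(repo, name, target)  # hypothetical callable that writes `target`
        fetched.append(name)
    return fetched
```

Running `download_missing` twice against the same directory fetches nothing the second time, which is the property the commit calls idempotent.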

…ized size

- Update build script to use release-no-debug profile with strip="symbols", reducing static library size from ~128MB to ~46MB per slice
- Add CMAKE_POLICY_VERSION_MINIMUM and iOS toolchain shim to fix sentencepiece cross-compilation with CMake 4.x
- Remove cdylib from moshi-ios crate-type (only staticlib needed)
- Build script always rebuilds both targets to ensure header freshness
- Add verification steps: header function count, lipo -info, module maps
- Header now contains all 16 C functions (was 84 lines/~11 functions)

Task: fn-1-complete-moshi-on-device-voice.2
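The header function count check mentioned above can be approximated by scanning the generated header for `moshi_*` declarations. The build script's actual check is not shown in this PR, so the regular expression and helper names below are assumptions, and the sketch relies on every exported function sharing the `moshi_` prefix.

```python
import re

# Matches an exported function name directly before its parameter list.
DECL = re.compile(r"\b(moshi_\w+)\s*\(")


def exported_functions(header_text):
    """Return the sorted, de-duplicated moshi_* function names declared in a header."""
    # Drop comments first so names mentioned in documentation are not counted.
    no_block = re.sub(r"/\*.*?\*/", "", header_text, flags=re.S)
    no_comments = re.sub(r"//[^\n]*", "", no_block)
    return sorted(set(DECL.findall(no_comments)))


def check_header(header_text, expected):
    """Raise ValueError if the header does not declare the expected number of functions."""
    names = exported_functions(header_text)
    if len(names) != expected:
        raise ValueError(
            f"header declares {len(names)} functions, expected {expected}"
        )
    return names
```

Failing the build on a count mismatch catches a stale header early, which is the situation the rebuild-both-targets step is meant to prevent.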

- Add MoshiSession class wrapping C FFI opaque pointer with ARC (deinit)
- Add MoshiError enum bridging moshi_get_last_error() to Swift errors
- Enforce thread safety via private serial DispatchQueue for all FFI calls
- Wrap 14 of 16 C functions (excludes env-var moshi_load_model and internal moshi_free_string)
- Use moshi_process_audio_ex for safer variable-length output handling
- Expose sampleRate (24kHz) and frameSamples (1920) as static constants
- Manage C string memory with defer { moshi_free_string() } pattern
- No force-unwraps; all optionals handled safely

Task: fn-1-complete-moshi-on-device-voice.3

…rification

- Create integration test at rust/moshi-ios/tests/inference_test.rs exercising the full FFI lifecycle: create -> load -> process -> verify -> destroy
- Add rlib crate-type to moshi-ios so Rust integration tests can link against it
- Test processes 10 frames (19200 samples, 0.8s) of 440Hz sine wave input
- Verifies moshi_process_audio_ex returns 0 and produces non-empty output
- Verifies moshi_is_initialized, moshi_get_sample_rate, moshi_metal_available
- Model paths via env vars (MOSHI_TEST_LM_PATH, MOSHI_TEST_MIMI_PATH); skips gracefully
- All tests marked #[ignore] by default, opt-in via --ignored flag
- Includes error-path test for processing audio without loaded model

Task: fn-1-complete-moshi-on-device-voice.4
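The test input figures follow from the sample rate and frame size stated in the commit messages: 10 frames of 1920 samples is 19200 samples, and 19200 / 24000 Hz is 0.8 s. A rough sketch of generating such input is below; the amplitude and the function name are assumptions, and the actual test is written in Rust.

```python
import math

SAMPLE_RATE = 24_000   # Hz, as stated in the commit messages
FRAME_SAMPLES = 1_920  # samples per frame, i.e. 80 ms at 24 kHz


def sine_frames(num_frames, freq_hz=440.0, amplitude=0.5):
    """Return `num_frames` frames of a continuous sine wave as lists of floats."""
    total = num_frames * FRAME_SAMPLES
    samples = [
        amplitude * math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE)
        for n in range(total)
    ]
    # Slice the continuous signal so the phase carries across frame boundaries.
    return [samples[i:i + FRAME_SAMPLES] for i in range(0, total, FRAME_SAMPLES)]


frames = sine_frames(10)
total_samples = sum(len(frame) for frame in frames)
duration_s = total_samples / SAMPLE_RATE
```

Generating the whole signal first and slicing it afterwards avoids phase discontinuities between frames, which would otherwise add clicks to the test tone.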

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

LegalPrimes and others added 7 commits February 13, 2026 17:53

feat: WebSocket server for iOS clients

a8b872c

chore: update flow task state

43f6604

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

LegalPrimes wants to merge 7 commits into kyutai-labs:main from LegalPrimes:feat/fn-1-complete-moshi-on-device-voice
