Add on-device LLM post-processing for transcriptions #106
Open
userFRM wants to merge 1 commit into Starmel:master from
Adds manual per-recording LLM text processing using Apple's MLX framework for on-device inference. Users can clean up filler words, restructure transcriptions as developer prompts, or format them as structured markdown via a wand menu on each recording.

Key changes:
- New LLM/ module: LLMPostProcessor (model lifecycle with generation counter for race protection), LLMProcessingMode (clean/dev/markdown), LLMModelRegistry (curated models 0.5B-4B), DevModePrompts
- Per-recording processing with rawTranscription for revert-to-raw
- Settings: Speech/LLM sub-tabs, model install with cancel + rollback
- DB migration v3 adding rawTranscription column with backfill
- Conditional compilation (#if canImport(MLX)) for non-MLX builds
- Build system: dynamic libomp discovery, Metal Toolchain checks
- mlx-swift-lm pinned to semver 2.30.3 (was branch: main)
- CI pinned to Xcode 16.2 (workaround for ml-explore/mlx-swift-lm#94)
- Fix PermissionsManager to use AXIsProcessTrustedWithOptions
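For reviewers unfamiliar with the generation-counter pattern mentioned above, here is a minimal sketch of the idea combined with coalescing of duplicate loads. It is not the PR's LLMPostProcessor: the names ModelLoader, LoadedModel, select, and the injected load closure are hypothetical, and the real code loads an MLX model where this sketch calls a stand-in closure.

```swift
import Foundation

// Hypothetical stand-in for a loaded model; not a type from the PR.
struct LoadedModel: Sendable, Equatable {
    let id: String
}

// Illustrative actor: discards stale load results and shares in-flight loads.
actor ModelLoader {
    private var generation = 0
    private var current: LoadedModel?
    private var inFlight: [String: Task<LoadedModel, Error>] = [:]
    // Assumed loader closure; the real project would load an MLX model here.
    private let load: @Sendable (String) async throws -> LoadedModel

    init(load: @escaping @Sendable (String) async throws -> LoadedModel) {
        self.load = load
    }

    var currentModel: LoadedModel? { current }

    /// Requests a model. Returns nil when a newer request superseded this one
    /// while the load was still running.
    func select(_ id: String) async throws -> LoadedModel? {
        generation += 1
        let myGeneration = generation

        // Coalesce: reuse an in-flight load for the same id if there is one.
        let task: Task<LoadedModel, Error>
        if let existing = inFlight[id] {
            task = existing
        } else {
            let load = self.load
            // Run the heavy work off the actor's executor.
            task = Task.detached { try await load(id) }
            inFlight[id] = task
        }

        let model: LoadedModel
        do {
            model = try await task.value
        } catch {
            inFlight[id] = nil
            throw error
        }
        inFlight[id] = nil

        // Another select() ran during the await, so this result is stale.
        guard myGeneration == generation else { return nil }
        current = model
        return model
    }
}
```

Because actors are reentrant across an await, a second select call can run while the first is suspended. Comparing the captured generation with the current one after the await is what lets the older call notice it has been superseded and skip the state update.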
5392107 to 43d096d
Summary
Details
New files
- OpenSuperWhisper/LLM/LLMPostProcessor.swift: Singleton managing MLX model lifecycle with generation counter for async race protection, task coalescing for duplicate loads, and Task.detached for off-main-thread inference
- OpenSuperWhisper/LLM/LLMProcessingMode.swift: Processing mode enum (raw/clean/dev/markdown) with system prompts
- OpenSuperWhisper/LLM/LLMModelRegistry.swift: Curated registry of 8 MLX models (5 general purpose, 3 code-specialized)
- OpenSuperWhisper/LLM/DevModePrompts.swift: System prompt for converting spoken developer dictation to structured markdown

Modified files
- @MainActor on SettingsViewModel, Speech/LLM sub-tabs, model install flow with cancel button and rollback on failure/cancel (race-safe via model-id guards)
- rawTranscription field, DB migration v3 with backfill
- rawTranscription set on hotkey recordings, removed dead .processing state
- rawTranscription through on transcription completion
- AXIsProcessTrustedWithOptions instead of AXIsProcessTrusted()
- upToNextMajorVersion: 2.30.3

Test plan
- rawTranscription is set, wand menu works (clean/dev/markdown)
- rawTranscription is set, revert-to-raw works
- ./run.sh on both Intel and Apple Silicon Macs

🤖 Generated with Claude Code
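The revert-to-raw behaviour exercised by the test plan can be sketched in isolation as follows. This is an assumption-laden illustration, not the PR's model: the Recording struct here is a hypothetical in-memory type, the transcription field name and the three method names are invented for the example, and the backfill is shown as a plain function where the PR performs it in a database migration.

```swift
// Hypothetical in-memory model; only rawTranscription is named in the PR.
struct Recording: Equatable {
    var transcription: String
    var rawTranscription: String?

    /// Mirrors the idea of the v3 backfill: rows created before the column
    /// existed get their current text copied into rawTranscription.
    mutating func backfillRaw() {
        if rawTranscription == nil {
            rawTranscription = transcription
        }
    }

    /// Stores LLM output while keeping the first (raw) text untouched,
    /// so repeated processing never overwrites the original.
    mutating func applyProcessed(_ processed: String) {
        backfillRaw()
        transcription = processed
    }

    /// Restores the original Whisper output if one was recorded.
    mutating func revertToRaw() {
        guard let raw = rawTranscription else { return }
        transcription = raw
    }
}
```

The point of keeping the raw text in a separate field is that processing becomes non-destructive: any number of clean, dev, or markdown passes can be applied and the user can still get back to the original transcription.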