v0.11.0
✨ Highlights
Accurate token counts in the sidebar. The TUI sidebar's "Conversation", "System", "Memories", "Compartments", "Tool Calls", and residual "Tool Defs + Overhead" numbers now match what Anthropic actually sees on the wire — validated within 0.1% against a live 340K-input request capture. Sessions will self-heal on the next transform pass without any user action.
New "Tool Calls" sidebar slice. Tool-invocation tokens (tool_use, tool_result) are now tracked and displayed separately from the fixed tool-schema overhead. This tells you at a glance how much context is reducible via ctx_reduce vs how much is structural.
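As a rough illustration of how the slices relate, the sketch below assumes the residual is the reported input total minus the separately tracked slices. The `SidebarBreakdown` shape and the `residualTokens` name are hypothetical and are not taken from the plugin.

```typescript
// Hypothetical shape: field names mirror the sidebar labels, not the real schema.
interface SidebarBreakdown {
  system: number;
  conversation: number;
  memories: number;
  compartments: number;
  toolCalls: number; // tool_use + tool_result tokens
}

// Assumption: "Tool Defs + Overhead" is whatever remains of the reported input
// total after the tracked slices are subtracted, clamped at zero.
function residualTokens(totalInput: number, b: SidebarBreakdown): number {
  const tracked =
    b.system + b.conversation + b.memories + b.compartments + b.toolCalls;
  return Math.max(0, totalInput - tracked);
}
```

Under this assumption, any over-count in a tracked slice shrinks the residual and any under-count inflates it, which is why accurate per-slice counts matter for the residual as well.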
🐛 Fixes
Tokenizer fallback was silently active. The ai-tokenizer integration shipped in an earlier release used an eval("require") pattern that silently threw inside Bun's ESM runtime on every call — so every token count in the sidebar and status dialog was running through a chars/3.5 heuristic. On long sessions this inflated the "Tool Defs + Overhead" residual from ~22K (real) to ~90K+ (fake) and misattributed ~70K tokens across segments. Static ESM import now makes the real Claude tokenizer actually load. Sessions will self-heal on next transform pass (watermark-based re-count when drift >50 tokens).
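The failure mode described above, a loader that throws and a fallback that quietly takes over, can be sketched as follows. This is a minimal illustration and not the plugin's code: `createCounter`, `TokenCounter`, and the injected `load` function are hypothetical, and the real ai-tokenizer API is deliberately not shown.

```typescript
type Encode = (text: string) => number;

interface TokenCounter {
  count: Encode;
  usingFallback: boolean;
}

// The chars/3.5 heuristic named in the notes.
const heuristic: Encode = (text) => Math.ceil(text.length / 3.5);

// `load` stands in for whatever obtains the real tokenizer. If it throws, the
// counter still works, but it records that it is running on the heuristic so
// the condition can be surfaced instead of staying silent.
function createCounter(load: () => Encode): TokenCounter {
  try {
    return { count: load(), usingFallback: false };
  } catch {
    return { count: heuristic, usingFallback: true };
  }
}
```

Exposing a flag such as `usingFallback` is one way to make this class of bug visible in a status dialog or log line.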
Opus 4.7 thinking-block integrity. The stripReasoningFromMergedAssistants workaround for consecutive-assistant merges now handles wire-format thinking and redacted_thinking types in addition to OpenCode's internal reasoning. Previously, two consecutive assistants each carrying a thinking block could slip through unchanged and trigger Anthropic's "thinking blocks ... cannot be modified" 400 error — exactly the failure mode this function exists to prevent.
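A simplified sketch of the idea follows. It assumes that reasoning-like parts are dropped from every message in a run of adjacent assistant messages; the actual rule in stripReasoningFromMergedAssistants may differ. The `Part` and `Message` shapes and the function name below are hypothetical.

```typescript
type Part = { type: string; text?: string };
type Message = { role: "user" | "assistant"; parts: Part[] };

// Internal "reasoning" plus the two wire-format types named in the notes.
const REASONING_TYPES = new Set(["reasoning", "thinking", "redacted_thinking"]);

// Assumption: when two or more assistant messages are adjacent (and would be
// merged on the wire), reasoning-like parts are dropped from every message in
// that run. Messages outside such runs are returned untouched.
function stripReasoningFromAdjacentAssistants(messages: Message[]): Message[] {
  return messages.map((msg, i) => {
    if (msg.role !== "assistant") return msg;
    const prev = messages[i - 1];
    const next = messages[i + 1];
    const inRun = prev?.role === "assistant" || next?.role === "assistant";
    if (!inRun) return msg;
    return {
      ...msg,
      parts: msg.parts.filter((p) => !REASONING_TYPES.has(p.type)),
    };
  });
}
```

Matching on a set of type names keeps the internal and wire-format variants on one code path, so a newly added type cannot be handled in one branch and missed in another.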
Cache-bust cascade on upgraded DBs. session_meta validator now tolerates NULL for INTEGER columns added later via ensureColumn (system_prompt_tokens, conversation_tokens, tool_call_tokens, times_execute_threshold_reached, compartment_in_progress, cleared_reasoning_through_tag). Before this fix, very old DBs could fail the validator, fall back to defaults, reset lastResponseTime=0, and cause the scheduler to return "execute" on every pass — endless cache busts. Added a companion healNullIntegerColumns heal to normalize any such rows on startup.
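The two halves of this fix, a validator that tolerates NULL and a heal that normalizes it, might look roughly like the sketch below. It works on plain row objects rather than a live database; `isValidRow` and `healNullIntegers` are hypothetical names, and healing to 0 is an assumed default.

```typescript
type Row = Record<string, unknown>;

// Columns listed in the notes as added later via ensureColumn.
const LATE_INTEGER_COLUMNS = [
  "system_prompt_tokens",
  "conversation_tokens",
  "tool_call_tokens",
  "times_execute_threshold_reached",
  "compartment_in_progress",
  "cleared_reasoning_through_tag",
] as const;

// Accepts a row when each late-added column is an integer, NULL, or absent.
function isValidRow(row: Row): boolean {
  return LATE_INTEGER_COLUMNS.every((col) => {
    const v = row[col];
    return v === null || v === undefined || Number.isInteger(v);
  });
}

// Returns a copy with NULL or absent late-added columns set to 0 (assumed default).
function healNullIntegers(row: Row): Row {
  const healed: Row = { ...row };
  for (const col of LATE_INTEGER_COLUMNS) {
    if (healed[col] === null || healed[col] === undefined) healed[col] = 0;
  }
  return healed;
}
```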
Persistent transform errors are now visible. The top-level transform error handler previously swallowed every error as if it were transient, which meant persistent schema or programming bugs silently disabled magic-context for the entire session with no user-facing signal. It now distinguishes SQLITE_BUSY / SQLITE_LOCKED (log + skip, normal behavior) from persistent non-transient errors (log with full detail + persist a summary into session_meta.last_transform_error, which the sidebar already surfaces).
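The transient-versus-persistent split can be sketched as a small classifier. This assumes the SQLite driver exposes the result code as a string `code` property on the thrown error; `isTransient`, `classifyTransformError`, and the 200-character summary cap are illustrative choices, not the plugin's actual handler.

```typescript
const TRANSIENT_CODES = new Set(["SQLITE_BUSY", "SQLITE_LOCKED"]);

// Assumption: the driver attaches the SQLite result code as `err.code`.
function isTransient(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TRANSIENT_CODES.has(code);
}

type Outcome = { action: "skip" } | { action: "persist"; summary: string };

// Transient lock contention is skipped; anything else yields a short summary
// that a caller could store in session_meta.last_transform_error.
function classifyTransformError(err: unknown): Outcome {
  if (isTransient(err)) return { action: "skip" };
  const message = err instanceof Error ? err.message : String(err);
  return { action: "persist", summary: message.slice(0, 200) };
}
```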
Token cache invalidation coverage. The per-message token cache is now invalidated on message.updated (per-message, with session-wide fallback when the event lacks a message id) and on session.compacted (session-wide, since native compaction restructures messages). Previously only message.removed and session.deleted were covered, so retries or compaction could leave stale counts in memory.
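The invalidation rules above map naturally onto a two-level map. The sketch below uses hypothetical event payload shapes (`sessionID`, optional `messageID`) and is not the plugin's actual cache.

```typescript
type CacheEvent =
  | { type: "message.updated"; sessionID: string; messageID?: string }
  | { type: "message.removed"; sessionID: string; messageID: string }
  | { type: "session.compacted"; sessionID: string }
  | { type: "session.deleted"; sessionID: string };

// sessionID -> (messageID -> cached token count)
const tokenCache = new Map<string, Map<string, number>>();

function invalidate(event: CacheEvent): void {
  const session = tokenCache.get(event.sessionID);
  if (!session) return;
  switch (event.type) {
    case "message.updated":
    case "message.removed":
      // Per-message when the id is known, session-wide otherwise.
      if (event.messageID) session.delete(event.messageID);
      else tokenCache.delete(event.sessionID);
      break;
    case "session.compacted":
    case "session.deleted":
      // Compaction restructures messages, so no per-message entry can be trusted.
      tokenCache.delete(event.sessionID);
      break;
  }
}
```

Keying the outer map by session keeps invalidation in one session from touching cached counts in another.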
Memory injection budget consistency. Memory trim-to-budget switched from chars/4 heuristic to the real estimateTokens() call, matching the rest of the plugin's token math. Prevents under-packing code/JSON-heavy memories or over-packing prose-heavy ones.
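A minimal sketch of trim-to-budget with an injected estimator is shown below. It assumes a greedy, in-order policy that stops at the first memory that would exceed the budget; the plugin's actual selection policy is not described in these notes, and `trimToBudget` is a hypothetical name.

```typescript
type Estimator = (text: string) => number;

// Greedily keeps memories, in the order given, while the running token total
// stays within budget. The estimator is injected; in the plugin this role is
// played by estimateTokens().
function trimToBudget(
  memories: string[],
  budget: number,
  estimate: Estimator,
): string[] {
  const kept: string[] = [];
  let used = 0;
  for (const memory of memories) {
    const cost = estimate(memory);
    if (used + cost > budget) break;
    kept.push(memory);
    used += cost;
  }
  return kept;
}
```

Because the estimator is a parameter, swapping the chars/4 heuristic for a real tokenizer changes what fits in the budget without touching the packing logic.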
Robustness micro-fixes.
- readUint32BE now coerces to unsigned with >>> 0 so malformed PNG headers with MSB-set bytes don't bypass the < 1 fallback and produce wrong image-token counts.
- Removed dead || 0 in the WebP lossy parser (the & 0x3fff mask already produces a non-negative result).
- Write-if-changed guard on lastTransformError prevents WAL write amplification during persistent-error states.
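The unsigned coercion in the first item can be illustrated with a standalone reader. This is a sketch of the general pattern, not the plugin's source; treating out-of-range bytes as 0 is an assumption made to keep the example self-contained.

```typescript
// Reads a big-endian 32-bit value. Bitwise operators in JavaScript work on
// signed 32-bit integers, so a first byte >= 0x80 yields a negative number
// unless the result is coerced to unsigned with `>>> 0`.
function readUint32BE(bytes: Uint8Array, offset: number): number {
  const b0 = bytes[offset] ?? 0; // assumption: missing bytes read as 0
  const b1 = bytes[offset + 1] ?? 0;
  const b2 = bytes[offset + 2] ?? 0;
  const b3 = bytes[offset + 3] ?? 0;
  return ((b0 << 24) | (b1 << 16) | (b2 << 8) | b3) >>> 0;
}
```

For the bytes FF 00 00 01, the uncoerced expression evaluates to -16777215, while the coerced version returns 4278190081.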
🔧 Internal
- New persisted columns: conversation_tokens, tool_call_tokens in session_meta. Added via ensureColumn; no migration required.
- MessageUpdatedAssistantInfo gains optional messageID from info.id so per-message cache invalidation can target precisely.
- Test schemas updated across command-handler, note-nudger, heuristic-cleanup, ctx-reduce, storage-tags, and storage tests to include the two new token columns.
- New focused test file clear-message-tokens-cache.test.ts covers per-message and session-wide cache invalidation paths plus cross-session isolation (5 tests).
- stripReasoningFromMergedAssistants test coverage expanded with 3 cases exercising wire-format thinking / redacted_thinking sequences.
- All plugin token math unified on estimateTokens() from ai-tokenizer; the only remaining chars/3.5 estimate is intentional (pre-filter bucket for dreamer key-file selection where file content isn't loaded).
🔬 Process
This release was validated with two Athena solo-mode council audits:
- Post-implementation audit surfaced 12 findings. 9 were real bugs and fixed (including the thinking/redacted_thinking merge-strip gap and the NULL INTEGER column cache-bust cascade). 3 were deliberately skipped as bounded display edge cases.
- Verification audit of the fixes confirmed ship-readiness with strong agreement across 7 members. Three optional low-priority suggestions were applied on top.
540 tests passing (up from 532 at release start), typecheck clean, build clean.
Full changelog: v0.10.1...v0.11.0