
fix(tab-manager): stabilize AI chat — reactivity fixes, prompt hardening, and dead code removal#1527

Merged
braden-w merged 11 commits into main from opencode/jolly-mountain
Mar 13, 2026
Conversation

@braden-w
Member

The tab-manager's AI chat feature was built on early-alpha TanStack AI. It worked, but accumulated a layer of workarounds and indirection that made it fragile. This branch is a focused stabilization pass: fix the bugs we could reproduce, harden the prompt architecture so the LLM stops acting on tabs it can't control, and remove dead code that was just getting in the way.

The reactivity bug

TanStack AI's StreamProcessor mutates tool-call parts in place — it writes output, state, and approval directly onto existing objects. Svelte 5's fine-grained reactivity tracks object identity, and SvelteMap doesn't deep-proxy its values. So when TanStack mutated a tool-call part, Svelte never noticed. Tool-call badges, approval UI, and results just… didn't update.

The fix is a shallow clone on every onMessagesChange callback:

// Before: Svelte misses in-place mutations
onMessagesChange(messages) {
  messageStore.set(convId, messages);
}

// After: new identity → Svelte picks up changes
onMessagesChange(messages) {
  messageStore.set(convId, messages.map(msg => ({
    ...msg,
    parts: msg.parts.map(part => ({ ...part })),
  })));
}

This is the correct approach for raw ChatClient. A future migration to @tanstack/ai-svelte's useChat (which owns the $state internally) would eliminate the need for manual cloning entirely — that's tracked in a separate spec.

The cross-device problem

The AI was happily trying to close tabs from other synced devices — and succeeding at calling the Chrome API with those IDs, which returned closedCount: 0 with no error. The LLM had no way to know which tabs were local.

The fix was two-layered: inject the current device ID into a separate, immutable system message so the LLM knows the rules before it acts, and write those rules as hard constraints rather than suggestions:

export function buildDeviceConstraints(deviceId: string): string {
  return `## Current Device — Hard Constraints
- Current device ID: "${deviceId}".
- A tab is mutable only if its ID starts with "${deviceId}_".
- Never call a mutating tool for any tab ID that does not start with "${deviceId}_".
- Tabs from other devices are read-only.`;
}

The prompt architecture now sends two system messages — device constraints first (always), then the base/custom prompt second. Even if a conversation overrides its system prompt, the device constraints can't be bypassed:

┌──────────────────────────────────────────────────┐
│  System Message 1 (immutable)                    │
│  buildDeviceConstraints(deviceId)                │
│  "Never call a mutating tool for any tab ID      │
│   that does not start with abc123_"              │
├──────────────────────────────────────────────────┤
│  System Message 2 (overridable)                  │
│  conv.systemPrompt ?? TAB_MANAGER_SYSTEM_PROMPT  │
│  Role, capabilities, behavioral guidelines       │
└──────────────────────────────────────────────────┘
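Concretely, the assembly might look like the sketch below. Only buildDeviceConstraints appears in this diff (copied from above so the example runs standalone); TAB_MANAGER_SYSTEM_PROMPT's value and the conversation shape are illustrative stand-ins:

```typescript
// Stand-in base prompt; the real constant lives elsewhere in the app.
const TAB_MANAGER_SYSTEM_PROMPT = 'You are a tab-management assistant.';

// Copied from the diff above so this sketch is self-contained.
function buildDeviceConstraints(deviceId: string): string {
  return `## Current Device — Hard Constraints
- Current device ID: "${deviceId}".
- A tab is mutable only if its ID starts with "${deviceId}_".
- Never call a mutating tool for any tab ID that does not start with "${deviceId}_".
- Tabs from other devices are read-only.`;
}

type SystemMessage = { role: 'system'; content: string };

function buildSystemMessages(
  deviceId: string,
  conv: { systemPrompt?: string | null },
): SystemMessage[] {
  return [
    // Message 1: immutable safety block, always first.
    { role: 'system', content: buildDeviceConstraints(deviceId) },
    // Message 2: overridable base/custom prompt.
    { role: 'system', content: conv.systemPrompt ?? TAB_MANAGER_SYSTEM_PROMPT },
  ];
}
```

Because the constraints live in their own message and are prepended unconditionally, a per-conversation prompt override only ever replaces message 2.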

Removing the indirection layer

tab-actions.ts (345 lines) was a single-consumer indirection layer. Every execute* function was called from exactly one .withActions() handler in workspace.ts. The file existed because the actions were defined before the workspace had a .withActions() API — once that API landed, the indirection was just noise.

Deleting it and inlining the Chrome API calls directly into the mutation handlers made the code easier to follow and revealed opportunities for tryAsync, Promise.allSettled, and batch operations that weren't obvious when the logic was split across two files:

Before                              After
──────                              ─────
workspace.ts                        workspace.ts
  handler: (args) =>                  handler: async ({ tabIds }) => {
    executeCloseTabs(tabIds, id)        const nativeIds = toNativeIds(tabIds, deviceId);
         │                              await tryAsync({
         ▼                                try: () => browser.tabs.remove(nativeIds),
tab-actions.ts                            catch: () => Ok(undefined),
  executeCloseTabs(tabIds, id) {        });
    // parse IDs, call Chrome API       return { closedCount: nativeIds.length };
    // handle errors                  }
  }
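A self-contained sketch of the inlined shape. tryAsync, Ok, and browser.tabs.remove are stubbed locally so the example runs standalone (the real project pulls them from its error-handling library and the WebExtension API), and the composite-ID format "deviceId_nativeId" is inferred from the constraints above:

```typescript
// Minimal local stand-ins so the handler shape is runnable on its own.
type Result<T> = { ok: true; value: T };
const Ok = <T>(value: T): Result<T> => ({ ok: true, value });

async function tryAsync<T>(opts: {
  try: () => Promise<T>;
  catch: () => Result<undefined>;
}): Promise<Result<T | undefined>> {
  try {
    return Ok(await opts.try());
  } catch {
    return opts.catch();
  }
}

// Stub for the WebExtension browser.tabs.remove call.
const browser = {
  tabs: { remove: async (_ids: number[]): Promise<void> => {} },
};

// Composite IDs are assumed to be "deviceId_nativeId": keep only this
// device's tabs and strip the prefix down to the numeric Chrome tab ID.
function toNativeIds(tabIds: string[], deviceId: string): number[] {
  return tabIds
    .filter((id) => id.startsWith(`${deviceId}_`))
    .map((id) => Number(id.slice(deviceId.length + 1)));
}

// The inlined .withActions() handler, roughly as described above.
async function closeTabsHandler(tabIds: string[], deviceId: string) {
  const nativeIds = toNativeIds(tabIds, deviceId);
  await tryAsync({
    try: () => browser.tabs.remove(nativeIds),
    catch: () => Ok(undefined), // swallow Chrome API errors, report best-effort count
  });
  return { closedCount: nativeIds.length };
}
```

Note how the toNativeIds filter also enforces the device constraint mechanically: cross-device IDs are dropped before the Chrome API is ever called.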

Similarly, reconnectSync() was a one-liner wrapper around workspaceClient.extensions.sync.reconnect() — five callsites were using the wrapper instead of the method directly. Removed the wrapper, updated the callsites.
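The shape of that change, sketched (workspaceClient here is a stub standing in for the real instance):

```typescript
// Stub of the workspace client, for illustration only.
const workspaceClient = {
  extensions: { sync: { reconnect: async () => 'reconnected' } },
};

// Before: a one-liner wrapper every callsite went through.
// function reconnectSync() {
//   return workspaceClient.extensions.sync.reconnect();
// }

// After: callsites use the method on the single client instance directly.
const result = await workspaceClient.extensions.sync.reconnect();
```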

Everything else

TanStack AI 0.5.x → 0.6.x — all seven @tanstack/ai-* packages bumped. The new version tightens the tools parameter from object[] to Tool[], requiring a type assertion in ai-chat.ts.
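The workaround in ai-chat.ts presumably looks something like this sketch. Tool is a local stand-in type here (the real one comes from @tanstack/ai), and the options-bag shape is illustrative:

```typescript
// Local stand-in for the Tool type exported by @tanstack/ai.
type Tool = { name: string; description: string };

// Tools arrive through a loosely-typed options bag in the caller.
function buildChatArgs(options: { tools: object[]; prompt: string }) {
  // Destructure tools separately and assert the stricter 0.6.x type
  // so the rest of the options spread through unchanged.
  const { tools, ...rest } = options;
  return { ...rest, tools: tools as Tool[] };
}
```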

Default provider → OpenAI — OPENAI_CHAT_MODELS[0] is now the default model for new conversations.

Continuation timeout — after a tool executes, the ChatClient fires a continuation request for the LLM to respond to the tool result. Sometimes the server never starts streaming (API timeout, rate limiting, network). Without a timeout, the loading dots persist forever. Now auto-stops after 60 seconds and surfaces an error.
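A hedged sketch of that watchdog, assuming the ChatClient exposes a 'submitted' status while waiting for streaming to begin (getStatus, stop, and onError are illustrative stand-ins for the real wiring; the timeout is injectable for testing):

```typescript
const CONTINUATION_TIMEOUT_MS = 60_000;

// Arm a watchdog when a continuation request is submitted. If the server
// still hasn't started streaming when the timer fires, stop the client
// and surface an error instead of leaving the loading dots up forever.
function armContinuationWatchdog(
  getStatus: () => string,
  stop: () => void,
  onError: (err: Error) => void,
  timeoutMs: number = CONTINUATION_TIMEOUT_MS,
): () => void {
  const timer = setTimeout(() => {
    if (getStatus() === 'submitted') {
      stop();
      onError(new Error('AI continuation timed out before streaming started'));
    }
  }, timeoutMs);
  // The caller cancels the watchdog as soon as streaming begins.
  return () => clearTimeout(timer);
}
```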

Responsive chat drawer — replaced fixed h-[400px] with clamp(300px, 50vh, 600px) so the panel scales with viewport.

Terminology — 'mutation' → 'action' across tool-bridge JSDoc and the progressive tool trust spec. Matches the naming convention used everywhere else.

Changelog

  • Fix AI chat tool results not updating in real-time during streaming
  • Fix AI attempting to close tabs from other synced devices
  • Add 60-second timeout for hung AI continuation requests
  • Switch default AI provider to OpenAI

braden-w added 11 commits March 13, 2026 00:05
Replace fixed h-[400px] with clamp(300px, 50vh, 600px) so the chat
panel scales with viewport—comfortable on laptops and desktops alike.
Delete tab-actions.ts and move Chrome API execution logic directly
into the workspace .withActions() mutation handlers. The file was a
single-consumer indirection layer — every execute* function was only
called from one corresponding handler in workspace.ts.

Adds nativeTabId helper alongside the existing composite ID parsers
and imports generateId for the save handler.
Shallow-clone messages and parts in onMessagesChange to break reference
identity—TanStack AI's StreamProcessor mutates tool-call parts in place
(output, state, approval) but SvelteMap doesn't deep-proxy, so Svelte 5's
fine-grained reactivity missed in-place mutations on tool-call badges,
approval UI, and tool results.

Inject current device ID into the system prompt so the LLM knows which
tabs are local vs read-only from other synced devices. Previously the AI
would attempt to close tabs from other devices, resulting in closedCount: 0.

Update system-prompt wording to clarify that destructive actions have their
own approval UI (don't double-confirm in prose) and replace 'mutation'
terminology with 'action'.

Add visible logging (console.log for status/loading transitions, onError
callback) to diagnose hung continuation requests where Stream 3 never
resolves.
Replace 'mutation' with 'action' in tool-bridge JSDoc to match the
terminology used throughout the codebase. Mark two completed success
criteria in the progressive tool trust spec.
Update all TanStack AI packages from 0.5.x to 0.6.x:
- @tanstack/ai: ^0.5.1 → ^0.6.3
- @tanstack/ai-anthropic: ^0.5.0 → ^0.6.0
- @tanstack/ai-openai: ^0.5.0 → ^0.6.0
- @tanstack/ai-client: ^0.4.5 → ^0.6.0
- @tanstack/ai-gemini: ^0.5.0 → ^0.8.0
- @tanstack/ai-grok: ^0.5.0 → ^0.6.0
- @tanstack/ai-svelte: ^0.5.4 → ^0.6.4

The new version tightens the tools parameter in chat() from
object[] to Tool[]. Destructure tools separately and assert the
proper type in ai-chat.ts to satisfy the stricter signature.
OPENAI_CHAT_MODELS[0] is now the default model for new conversations,
picking up the latest model from the updated provider package.
Three prompt architecture improvements per expert review:
1. Split system prompt into separate messages — device constraints
   in their own system message, base/custom prompt in a second.
2. Rewrite device block as hard constraints using imperative language
   ('Never call a mutating tool for any tab ID that does not start
   with…') instead of softer advisory phrasing.
3. Immutable safety block — device constraints always first, even
   when a conversation overrides the base prompt.

Add a 60-second timeout for the 'submitted' status. After a tool
executes and the ChatClient fires a continuation request, the server
sometimes never starts streaming (LLM API timeout, rate limiting,
network issue). Without a timeout the loading dots persist forever.
Now auto-stops and surfaces an error after 60 seconds.
# Conflicts:
#	apps/tab-manager/src/lib/state/chat-state.svelte.ts
#	apps/tab-manager/src/lib/tab-actions.ts
#	apps/tab-manager/src/lib/workspace.ts
…ient.extensions.sync.reconnect()

All consumers now call through the workspace client directly,
removing a layer of indirection and keeping everything derived
from the single workspaceClient instance.
… toNativeIds

reconnectSync() has no remaining callers after the previous commit.
Move toNativeIds helper to bottom of file near the action handlers
that use it.
@braden-w braden-w merged commit 8b483cc into main Mar 13, 2026
1 of 9 checks passed
@braden-w braden-w deleted the opencode/jolly-mountain branch March 13, 2026 23:04