fix: prevent REPL freeze and silent hangs for 3P providers #456

Open
deadlydiamond wants to merge 1 commit into Gitlawb:main from deadlydiamond:fix/3p-provider-repl-freeze

@deadlydiamond

Summary

Fixes the REPL freeze that occurs when using third-party providers (DeepSeek, OpenAI-compat, Gemini, Ollama, Groq) in interactive mode. The CLI silently hangs with no error message after the user submits their first message.

Root cause: Claude Code's system prompt (~50K+ tokens), 50 tool definitions (~104KB request body), and plugin hook context injection (~14K+ tokens from claude-mem alone) exceed the context limits of most 3P models (e.g. DeepSeek's 64K token limit). The provider silently drops the connection instead of returning an error.

Changes

  • Trimmed system prompt (prompts.ts): Detects CLAUDE_CODE_USE_OPENAI/CLAUDE_CODE_USE_GEMINI and sends a ~500 token prompt with essential coding agent instructions instead of the full ~50K prompt
  • Tool filtering (openaiShim.ts): Limits tools to 15 essential ones (Bash, Read, Write, Edit, Grep, Glob, Agent, Skill, WebSearch, WebFetch, etc.) reducing request body from ~104KB
  • Hook skip (hooks.ts, sessionStart.ts): Skips SessionStart and UserPromptSubmit hooks for 3P providers — these inject thousands of tokens of context that push requests past provider limits
  • Plugin management skip (REPL.tsx): Disables useManagePlugins and performStartupChecks for 3P providers — the plugin loading pipeline triggers heavy I/O and AppState churn that freezes the React REPL
  • Command timeout (main.tsx): Adds 10-second timeout on getCommands() as a safety net
  • Utility (envUtils.ts): Exports is3PProvider() for consistent 3P detection
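
A minimal sketch of the tool-filtering idea, for illustration only. The `ToolDefinition` shape, the `filterToolsForProvider` name, and the `limited` flag are assumptions, not code from this PR's diff. The allowlist contains only the ten tools named above; the rest of the 15-tool list is elided in the description and is not guessed here.

```typescript
// Hypothetical tool shape; the real type used in openaiShim.ts may differ.
interface ToolDefinition {
  name: string;
  description: string;
}

// Only the tools the PR description names explicitly. The remaining
// entries of its 15-tool list are elided ("etc.") and left out on purpose.
const ESSENTIAL_TOOL_NAMES: ReadonlySet<string> = new Set([
  'Bash', 'Read', 'Write', 'Edit', 'Grep', 'Glob',
  'Agent', 'Skill', 'WebSearch', 'WebFetch',
]);

// Returns every tool unless the provider is flagged as limited, so that
// large-context providers on the same code path keep the full tool set.
function filterToolsForProvider(
  tools: readonly ToolDefinition[],
  limited: boolean,
): ToolDefinition[] {
  if (!limited) {
    return [...tools];
  }
  return tools.filter((tool) => ESSENTIAL_TOOL_NAMES.has(tool.name));
}
```

Passing the `limited` decision in as an argument keeps the filter itself independent of how the provider is detected, which makes it easier to scope to the providers that actually need it.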

What works after this fix

All core coding agent tools: Bash, Read, Write, Edit, Grep, Glob, Agent, WebSearch, WebFetch, TaskCreate, TodoWrite, AskUserQuestion

What's disabled for 3P providers

Plugin skills (/brainstorm, /plan, etc.), plugin hooks, plugin MCP servers, claude-mem memory, and the full Claude-style system prompt. These require Claude's 200K context window and are architecturally incompatible with smaller-context models.

Future architectural work needed

The proper long-term fix would require:

  1. Lazy plugin loading (load on /skill invocation, not startup)
  2. Dynamic system prompt based on model's context window
  3. Tool pagination (send relevant tools per-task)
  4. Hook timeouts (3s instead of current 10 minutes)
  5. Context budget (measure total request size before sending)
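
Item 5 could look roughly like the sketch below. Everything in it is hypothetical (`RequestParts`, `estimateTokens`, `checkContextBudget`), and the 4-characters-per-token ratio is a rough heuristic rather than any provider's real tokenizer, so the result is an estimate only.

```typescript
// Hypothetical request breakdown; none of these names come from the PR.
interface RequestParts {
  systemPrompt: string;
  toolSchemasJson: string;
  hookContext: string;
  messages: string[];
}

interface BudgetReport {
  estimatedTokens: number;
  limit: number;
  fits: boolean;
}

// Assumption: ~4 characters per token. Real tokenizers vary by model and
// by language, so this only gives an order-of-magnitude estimate.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sums the estimated size of every request component and compares it with
// the model's context window minus the tokens reserved for the reply.
function checkContextBudget(
  parts: RequestParts,
  contextLimit: number,
  reservedForOutput: number,
): BudgetReport {
  const estimatedTokens =
    estimateTokens(parts.systemPrompt) +
    estimateTokens(parts.toolSchemasJson) +
    estimateTokens(parts.hookContext) +
    parts.messages.reduce((sum, message) => sum + estimateTokens(message), 0);
  const limit = contextLimit - reservedForOutput;
  return { estimatedTokens, limit, fits: estimatedTokens <= limit };
}
```

With the figures quoted in this PR, a ~50K-token system prompt plus ~14K tokens of hook context already reaches the 64K limit cited for DeepSeek before any tool schema or user message is counted. A check like this would let the CLI report that up front instead of waiting on a dropped connection.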

Test plan

  • bun run smoke passes
  • bun run build succeeds
  • Interactive mode with DeepSeek responds without freezing
  • Interactive mode with --bare still works
  • -p mode works with all providers
  • Claude/Anthropic mode is unaffected (no 3P env vars set)

Fixes #435, relates to #337, #433

🤖 Generated with Claude Code

… OpenAI, Gemini)

Third-party providers with smaller context windows (e.g. DeepSeek's 64K)
cannot handle the full Claude Code system prompt (~50K+ tokens), 50 tool
definitions (~104KB), and plugin hook context injection (~14K+ tokens).
This causes the REPL to silently freeze with no error message.

Changes:
- Add trimmed system prompt for 3P providers (~500 tokens vs ~50K)
- Filter tools to 15 essential ones in OpenAI shim (Bash, Read, Write,
  Edit, Grep, Glob, Agent, Skill, WebSearch, WebFetch, etc.)
- Skip SessionStart/UserPromptSubmit hooks for 3P providers
- Skip plugin management (useManagePlugins) for 3P providers
- Skip performStartupChecks for 3P providers
- Add 10s timeout on command loading as safety net
- Export is3PProvider() utility for consistent detection

Root cause: Claude Code was designed for Claude's 200K context window.
The system prompt, tool schemas, and plugin hooks combined exceed the
context limits of most third-party models, causing them to silently
drop the connection with no error.

Fixes Gitlawb#435, relates to Gitlawb#337, Gitlawb#433

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gnanam1990 (Collaborator) left a comment


Thanks for digging into the freeze. The root cause analysis in the PR description is useful, but the current implementation appears to fix it by disabling broader parts of the product than intended. I see a few blocking issues:

  1. src/utils/hooks.ts:1978 and src/utils/hooks.ts:3016
    This disables all hooks for OpenAI/Gemini, not just the startup/prompt hooks described in the PR. executeHooks / executeHooksOutsideREPL are the generic dispatchers, so this also suppresses unrelated hook events and user automation. That is a much larger behavior change than "skip SessionStart and UserPromptSubmit to avoid oversized requests."

  2. src/services/api/openaiShim.ts:963 and src/services/api/openaiShim.ts:972
    The tool filtering runs on the shared OpenAI shim path, so it also affects GitHub Models requests, not just the "3P providers" called out in the PR. The description frames this as a context-window workaround for smaller providers, but the implementation removes tools for other OpenAI-compatible paths too. That needs to be scoped more carefully or justified/tested explicitly.

  3. src/main.tsx:2023
    The new 10s getCommands() timeout is global startup behavior, not limited to the affected REPL/provider path. On slower environments this can now silently continue with an empty command set for unaffected users as well. That feels too broad for a fix targeted at 3P interactive freezes.

  4. src/screens/REPL.tsx:781 and src/screens/REPL.tsx:795
    The new provider checks use exact string equality (=== '1') instead of the repo's normal truthy parsing (isEnvTruthy). If a user has CLAUDE_CODE_USE_OPENAI=true or another accepted truthy value, this path won't activate and the freeze remains.

I think the overall direction is reasonable, but I'd want this narrowed so it:

  • only skips the specific hook/plugin/startup work actually implicated in the freeze,
  • only trims tools for the providers that need it,
  • avoids introducing a global startup timeout regression,
  • uses consistent provider detection across the codebase.
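
To make points 1 and 4 concrete, here is a rough sketch of what consistent detection plus an event-scoped hook skip might look like. It is not the repo's code: `isTruthyEnvValue` is a stand-in for `isEnvTruthy` (whose accepted values are assumed here), `shouldSkipHook` is a hypothetical helper, and `is3PProvider` takes the environment as a parameter purely so the sketch is self-contained.

```typescript
type Env = Readonly<Record<string, string | undefined>>;

// Assumption: the values treated as truthy. The repo's isEnvTruthy helper
// may accept a different set.
const TRUTHY_VALUES: ReadonlySet<string> = new Set(['1', 'true', 'yes', 'on']);

// Stand-in for the repo's isEnvTruthy.
function isTruthyEnvValue(value: string | undefined): boolean {
  if (value === undefined) {
    return false;
  }
  return TRUTHY_VALUES.has(value.trim().toLowerCase());
}

// One shared check, so CLAUDE_CODE_USE_OPENAI=true and =1 behave the same
// everywhere instead of each call site comparing against '1'.
function is3PProvider(env: Env): boolean {
  return (
    isTruthyEnvValue(env['CLAUDE_CODE_USE_OPENAI']) ||
    isTruthyEnvValue(env['CLAUDE_CODE_USE_GEMINI'])
  );
}

// Only the two events named in the PR description are skipped; every other
// hook event still reaches the generic dispatcher.
const HOOK_EVENTS_SKIPPED_FOR_3P: ReadonlySet<string> = new Set([
  'SessionStart',
  'UserPromptSubmit',
]);

// Hypothetical guard to call before dispatching a hook event.
function shouldSkipHook(hookEvent: string, env: Env): boolean {
  return is3PProvider(env) && HOOK_EVENTS_SKIPPED_FOR_3P.has(hookEvent);
}
```

Gating on the event name rather than inside the generic dispatchers would keep unrelated hook events and user automation working for OpenAI/Gemini users.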

Development

Successfully merging this pull request may close these issues.

Defer REPL startup plugin checks until prompt idle to avoid post-launch input freeze