feat: OpenRouter baseURL + hybrid search perf instrumentation #2485

Open
monkeygold wants to merge 5 commits into garrytan:master from monkeygold:master
@monkeygold

Summary

  • gateway.ts: Pass baseURL override from config (enables OpenRouter as embedding provider) + pass dimensions parameter to textEmbeddingModel()
  • hybrid.ts: Add exact-match LRU cache for repeated queries (5min TTL, 128 entries), 5s timeout on embedding API calls, per-stage performance instrumentation
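The exact-match cache described above could be sketched roughly as follows. This is a minimal illustration, not the code in hybrid.ts: the class name `TtlLruCache`, its method names, and the injectable `now` clock are all hypothetical, and only the limits (128 entries, 5 minute TTL) come from the summary.

```typescript
// Hypothetical sketch of an exact-match LRU cache with a TTL.
// Names are illustrative and are NOT taken from hybrid.ts.
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // timestamp (ms) at which the entry becomes stale
}

class TtlLruCache<V> {
  // A Map iterates in insertion order, so the first key is the least recently used.
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(maxEntries = 128, ttlMs = 5 * 60 * 1000, now: () => number = Date.now) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Evict the least recently used key (first in iteration order).
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

"Exact-match" here means the raw query string is the key, so `"foo"` and `"Foo "` would be cached separately unless the caller normalizes queries first.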

Context

These changes let gbrain run with OpenRouter as the embedding API provider (Qwen3-Embedding-8B, 1024 dimensions) instead of calling OpenAI directly. The per-stage performance instrumentation helps diagnose slow queries (~227ms typical breakdown).
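As a rough picture of what per-stage instrumentation can look like, here is a small timer sketch. `StageTimer` and its methods are hypothetical names, not the helpers used in hybrid.ts. The default clock is `Date.now` only to keep the sketch dependency-free. A monotonic clock such as `performance.now()` is usually a better fit for latency measurements.

```typescript
// Hypothetical per-stage timer; not the instrumentation code from hybrid.ts.
class StageTimer {
  private readonly timings: Record<string, number> = {};
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Starts timing a stage and returns a function that stops it.
  start(stage: string): () => void {
    const startedAt = this.now();
    return () => {
      const elapsed = this.now() - startedAt;
      this.timings[stage] = (this.timings[stage] ?? 0) + elapsed;
    };
  }

  // Convenience wrapper for async stages; records time even if fn rejects.
  async measure<T>(stage: string, fn: () => Promise<T>): Promise<T> {
    const stop = this.start(stage);
    try {
      return await fn();
    } finally {
      stop();
    }
  }

  // Returns a copy of the per-stage timings plus their sum.
  report(): { stages: Record<string, number>; totalMs: number } {
    let totalMs = 0;
    for (const stage of Object.keys(this.timings)) {
      totalMs += this.timings[stage] ?? 0;
    }
    return { stages: { ...this.timings }, totalMs };
  }
}
```

A caller might wrap each stage, for example `await timer.measure("embed_query", () => embed(query))` with a hypothetical `embed` function, and log `timer.report()` once per query.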

Test plan

  • gbrain sync completes with OpenRouter embeddings
  • gbrain query returns correct semantic results
  • No secrets in diff

Generated with Devin

monkeygold and others added 5 commits May 16, 2026 15:37
…nk + utility scripts

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>
gateway.ts:
- Pass baseURL override from config (enables OpenRouter as embedding provider)
- Pass dimensions parameter to textEmbeddingModel()
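
To illustrate what a `baseURL` override and a `dimensions` parameter change on the wire, here is a sketch that builds a plain OpenAI-compatible embeddings request. It deliberately does not reproduce gateway.ts or the `textEmbeddingModel()` call. `EmbeddingConfig` and `buildEmbeddingRequest` are hypothetical names, and the OpenRouter model id in the usage note is an assumption.

```typescript
// Hypothetical config shape; the real gateway.ts config may differ.
interface EmbeddingConfig {
  apiKey: string;
  model: string;
  baseURL?: string; // override, e.g. "https://openrouter.ai/api/v1"
  dimensions?: number; // requested embedding size, e.g. 1024
}

interface EmbeddingRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

const DEFAULT_BASE_URL = "https://api.openai.com/v1";

// Builds an OpenAI-compatible POST /embeddings request without sending it.
function buildEmbeddingRequest(config: EmbeddingConfig, input: string): EmbeddingRequest {
  // Fall back to the default provider and strip trailing slashes from the base URL.
  const base = (config.baseURL ?? DEFAULT_BASE_URL).replace(/\/+$/, "");
  const payload: { model: string; input: string; dimensions?: number } = {
    model: config.model,
    input,
  };
  // Only send `dimensions` when it is configured.
  if (config.dimensions !== undefined) payload.dimensions = config.dimensions;
  return {
    url: `${base}/embeddings`,
    headers: {
      Authorization: `Bearer ${config.apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

With `baseURL: "https://openrouter.ai/api/v1"`, `dimensions: 1024`, and a model id such as `"qwen/qwen3-embedding-8b"` (assumed here, check the provider's model list), the request targets OpenRouter instead of OpenAI while the rest of the pipeline stays unchanged.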

hybrid.ts:
- Add exact-match LRU cache for repeated queries (5min TTL, 128 entries)
- Add 5s timeout on embedding API calls (prevents OpenRouter outliers)
- Add per-stage performance instrumentation (mode_resolve, keyword_search,
  embed_query, vector_search, rrf_fusion, cosine_rescore, post_fusion, dedup)
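
A timeout around the embedding call can be sketched with `Promise.race`. This is one common approach under assumed names (`withTimeout`, `TimeoutError`), not necessarily how hybrid.ts implements its 5s limit.

```typescript
// Hypothetical timeout helper; not the implementation from hybrid.ts.
class TimeoutError extends Error {
  constructor(ms: number) {
    super(`operation timed out after ${ms}ms`);
    this.name = "TimeoutError";
  }
}

// Rejects with TimeoutError if `promise` has not settled within `ms`.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(ms)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

Usage would look like `await withTimeout(embedQuery(text), 5000)`, where `embedQuery` is a hypothetical embedding call. This only stops waiting for the result. It does not cancel the underlying HTTP request. If the client accepts an `AbortSignal`, passing one lets the request itself be aborted.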

Generated with [Devin](https://devin.ai)

Co-Authored-By: Devin <158243242+devin-ai-integration[bot]@users.noreply.github.com>