
refactor: three-layer AI proxy architecture (protocols/providers/transport) #13170

Merged
nic-6443 merged 3 commits into apache:master from nic-6443:refactor/ai-proxy-three-layer
Apr 8, 2026

Conversation


@nic-6443 nic-6443 commented Apr 7, 2026

Summary

Refactors the AI proxy system from a two-layer architecture (ai-proxy + ai-drivers) into a three-layer architecture with clear separation of concerns.

Architecture

Before: ai-proxy → ai-drivers/openai-base.lua (monolithic)
After:  ai-proxy → ai-protocols/ + ai-providers/ + ai-transport/ (3-layer)
  • Layer 1: ai-protocols/ — Client-facing protocol adapters (openai-chat, openai-embeddings). Each implements a uniform interface: matches(), is_streaming(), prepare_request(), parse_sse_event(), extract_response_text(), build_deny_response(), etc.

  • Layer 2: ai-providers/ — Thin provider modules (renamed from ai-drivers) declaring a capabilities map (supported protocols → endpoint config). All inherit from base.lua which handles request building, response parsing, and streaming.

  • Layer 3: ai-transport/ — Pure infrastructure (http.lua, sse.lua, auth.lua) extracted from the old monolithic openai-base.lua.
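The uniform adapter interface in Layer 1 can be sketched as a plain Lua module. The function names come from the list above; the bodies are illustrative guesses at the openai-chat shapes, not the actual apisix/plugins/ai-protocols/openai-chat.lua code.

```lua
-- Hypothetical sketch of a Layer 1 protocol adapter (openai-chat).
local _M = { name = "openai-chat" }

-- A request matches this protocol when the body carries a `messages` field.
function _M.matches(body)
    return type(body) == "table" and body.messages ~= nil
end

-- Streaming is requested via the standard OpenAI `stream` flag.
function _M.is_streaming(body)
    return body.stream == true
end

-- Pull the assistant text out of a non-streaming chat response.
function _M.extract_response_text(res_body)
    local choice = res_body.choices and res_body.choices[1]
    return choice and choice.message and choice.message.content
end

-- A real module would end with: return _M
```

Because every adapter exposes the same surface, downstream plugins like ai-prompt-guard can operate on any protocol without knowing its wire format.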

Key Changes

  • Renamed apisix/plugins/ai-drivers/ → apisix/plugins/ai-providers/
  • Extracted protocol logic into apisix/plugins/ai-protocols/
  • Extracted transport logic into apisix/plugins/ai-transport/
  • Added converters/ for cross-protocol bridging (Embeddings→Vertex Predict)
  • Updated downstream plugins (ai-prompt-decorator, ai-prompt-guard, ai-request-rewrite, ai-rag, ai-aliyun-content-moderation) to use protocol-aware APIs
  • Updated Makefile install targets
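The converters/ bridging can be sketched as a pair of pure functions mapping an OpenAI embeddings request to a Vertex Predict payload and back. The field names follow the public OpenAI embeddings and Vertex text-embedding formats; this is not the PR's actual converter code.

```lua
-- Illustrative converter: OpenAI embeddings request -> Vertex Predict body.
local function to_vertex_predict(openai_body)
    local input = openai_body.input
    if type(input) == "string" then
        input = { input }  -- normalize single string to a list
    end
    local instances = {}
    for _, text in ipairs(input) do
        instances[#instances + 1] = { content = text }
    end
    return { instances = instances }
end

-- Illustrative converter: Vertex Predict response -> OpenAI embeddings list.
local function from_vertex_predict(vertex_res)
    local data = {}
    for i, pred in ipairs(vertex_res.predictions) do
        data[i] = {
            object = "embedding",
            index = i - 1,  -- OpenAI indices are zero-based
            embedding = pred.embeddings.values,
        }
    end
    return { object = "list", data = data }
end
```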

Bug Fixes Included

  • ctx.var.llm_raw_usage → ctx.llm_raw_usage (ai-aliyun-content-moderation.lua): ctx.var is nginx variables, not Lua tables — usage was always nil, falling back to zeros
  • Clone auth.query before merge (ai-providers/base.lua): prevents cross-request state pollution when endpoint query params are merged
  • Strip Accept-Encoding header (ai-transport/http.lua): prevents compressed responses from LLM providers that break JSON/SSE parsing
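The auth.query fix amounts to merging into a fresh table instead of mutating the shared plugin config. A minimal sketch, with an illustrative function name rather than the actual ai-providers/base.lua code:

```lua
-- Merge configured auth query params with per-endpoint params into a new
-- table, so the shared config table is never mutated across requests.
local function merge_query(auth_query, endpoint_query)
    local merged = {}
    for k, v in pairs(auth_query or {}) do
        merged[k] = v
    end
    for k, v in pairs(endpoint_query or {}) do
        merged[k] = v
    end
    return merged
end
```

Without the copy, the first request's endpoint params would be written into the long-lived config table and leak into every later request.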

Protocol Detection

Detection uses body signals:

  1. openai-chat → body has messages field
  2. openai-embeddings → body has input field (catch-all)
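The detection order above can be sketched as a small Lua function (the name is illustrative):

```lua
-- Detect the client-facing protocol from body signals: `messages` wins,
-- then `input` acts as the embeddings catch-all.
local function detect_protocol(body)
    if type(body) ~= "table" then
        return nil
    end
    if body.messages ~= nil then
        return "openai-chat"
    end
    if body.input ~= nil then
        return "openai-embeddings"
    end
    return nil
end
```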

Test Changes

  • All existing test cases preserved and passing
  • Updated test configs for the ai-drivers → ai-providers rename
  • Added tests for bug fixes (content moderation usage, Accept-Encoding strip, auth.query mutation)

@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. enhancement New feature or request labels Apr 7, 2026
@nic-6443 nic-6443 force-pushed the refactor/ai-proxy-three-layer branch 4 times, most recently from 1d77969 to 9bd0498 Compare April 7, 2026 11:12
…sport)

Refactor the AI proxy system from a 2-layer architecture (ai-proxy + ai-drivers)
to a 3-layer architecture:

1. ai-protocols/ - Protocol adapters (openai-chat, openai-embeddings) that handle
   request/response parsing for different API formats
2. ai-providers/ - Provider implementations with capability declarations, replacing
   the monolithic ai-drivers/ layer
3. ai-transport/ - HTTP transport, SSE streaming, and auth utilities extracted from
   the old openai-base.lua

Key improvements:
- Protocol detection is separated from provider logic, making it easier to add new
  API formats
- Providers declare their capabilities (which protocols they support), enabling
  automatic protocol-to-provider matching
- Transport layer is reusable across providers
- Converter framework allows bridging between protocols (e.g., openai-embeddings
  to vertex-predict)
- Health check now includes auth headers for providers that require authentication

Bug fixes included:
- Fix incorrect ctx.var.llm_raw_usage reference (should be ctx.llm_raw_usage)
- Clone auth.query before merging to prevent cross-request state pollution
- Strip Accept-Encoding from upstream requests to prevent compressed responses
  breaking SSE parsing
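The Accept-Encoding fix above can be sketched as a header-copying helper; the function name and shape are assumptions, not the actual ai-transport/http.lua code.

```lua
-- Copy client headers for the upstream request but drop Accept-Encoding,
-- so the LLM provider never returns a compressed body that would break
-- JSON decoding or line-based SSE parsing.
local function build_forward_headers(client_headers)
    local out = {}
    for name, value in pairs(client_headers) do
        if name:lower() ~= "accept-encoding" then
            out[name] = value
        end
    end
    return out
end
```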
@nic-6443 nic-6443 force-pushed the refactor/ai-proxy-three-layer branch from 9f3110d to 722808e Compare April 7, 2026 12:01
@nic-6443 nic-6443 requested a review from Copilot April 7, 2026 12:52

Copilot AI left a comment


Pull request overview

Refactors the APISIX AI proxy implementation from a 2-layer ai-proxy + ai-drivers/* design into a 3-layer architecture: protocol adapters (ai-protocols/*), provider definitions (ai-providers/*), and shared transport utilities (ai-transport/*), and updates dependent plugins/tests accordingly.

Changes:

  • Introduces protocol detection + conversion (ai-protocols/*, ai-protocols/converters/*) and migrates providers to capability-based modules (ai-providers/*).
  • Extracts HTTP/SSE/auth infrastructure into ai-transport/* and updates ai-proxy / ai-proxy-multi to use the new pipeline.
  • Updates plugin integrations and Test::Nginx suites, plus Makefile install targets, to reflect the new module layout and bug-fix coverage.
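Capability-based matching means each provider declares which protocols it serves, and the proxy resolves the endpoint for the detected protocol. A minimal sketch; the table shape is an assumption, not the PR's exact capabilities schema.

```lua
-- Hypothetical provider registry: protocol name -> endpoint config.
local providers = {
    openai = {
        capabilities = {
            ["openai-chat"] = { path = "/v1/chat/completions" },
            ["openai-embeddings"] = { path = "/v1/embeddings" },
        },
    },
}

-- Look up the endpoint config a provider declares for a protocol;
-- returns nil when the provider does not support that protocol.
local function resolve_endpoint(provider_name, protocol)
    local provider = providers[provider_name]
    return provider and provider.capabilities[protocol]
end
```

Under this shape, adding a new API format only requires a new protocol adapter plus one capabilities entry per provider that supports it.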

Reviewed changes

Copilot reviewed 50 out of 50 changed files in this pull request and generated 4 comments.

Per-file summary:

File Description
t/plugin/ai-request-rewrite2.t Updates test config to new ai-request-rewrite provider/options schema.
t/plugin/ai-request-rewrite.t Switches upstream references to local httpbin host/port.
t/plugin/ai-rag.t Minor formatting/blank-line adjustment.
t/plugin/ai-proxy3.t Adjusts log regex + adds spacing in embedded Lua for readability.
t/plugin/ai-proxy2.t Removes explicit upstream stanza from test route config.
t/plugin/ai-proxy.t Adds new regression/bugfix tests (endpoint port handling, Accept-Encoding stripping, SSE fragmentation, auth.query mutation).
t/plugin/ai-proxy-vertex-ai.t Loosens response regex + adds a unit test for Vertex path function behavior.
t/plugin/ai-proxy-openrouter.t Removes a stray leading line artifact.
t/plugin/ai-proxy-multi3.t Tweaks log level + adds a probe endpoint for request inspection.
t/plugin/ai-proxy-multi2.t Removes explicit upstream stanza from test route config.
t/plugin/ai-proxy-multi.t Removes schema-logging test and minor formatting changes.
t/plugin/ai-proxy-multi.balancer.t Adds healthcheck test server + new tests for per-instance checks and disabling healthchecks.
t/plugin/ai-proxy-azure-openai.t Test file cleanup; minor indentation change in embedded Lua.
t/plugin/ai-proxy-anthropic.t Updates test server/endpoint/model settings to match new provider behavior.
t/plugin/ai-prompt-guard.t Adds regression coverage ensuring chat-completions path still works with protocol detection changes.
t/plugin/ai-prompt-decorator.t Adds regression coverage for message decoration with updated protocol handling.
t/plugin/ai-aliyun-content-moderation.t Expands tests across providers + adds usage preservation, multimodal, and upstream-error skip coverage.
Makefile Installs new ai-providers, ai-protocols, ai-transport directories instead of ai-drivers.
apisix/plugins/prometheus/exporter.lua Makes LLM active-connections metric increment resilient when metric is unavailable.
apisix/plugins/ai-transport/sse.lua Updates SSE decoding/encoding + introduces buffer boundary splitting for fragmented SSE reads.
apisix/plugins/ai-transport/http.lua Adds shared HTTP request helpers and forwarded-header construction (incl. stripping Accept-Encoding).
apisix/plugins/ai-transport/auth.lua Adds shared GCP OAuth2 token caching helper for providers.
apisix/plugins/ai-request-rewrite.lua Migrates to provider/protocol-based sidecar LLM calls and extracts rewritten text from protocol adapter.
apisix/plugins/ai-rag.lua Uses protocol helpers to append augmentation messages in a protocol-aware way.
apisix/plugins/ai-proxy/schema.lua Switches to provider schema enum; adds timeout max; removes legacy chat_request_schema.
apisix/plugins/ai-proxy/base.lua Implements protocol detection + provider capability routing + transport-driven request lifecycle.
apisix/plugins/ai-proxy.lua Migrates from drivers→providers and defers active-connection decrement to log phase.
apisix/plugins/ai-proxy-multi.lua Migrates from drivers→providers; adds protocol detection; adjusts healthcheck/upstream construction logic.
apisix/plugins/ai-providers/vertex-ai.lua New provider module declaring capabilities for chat + vertex predict endpoints.
apisix/plugins/ai-providers/schema.lua New provider name registry for schemas/enums.
apisix/plugins/ai-providers/openrouter.lua Migrates OpenRouter provider to capability-based provider module.
apisix/plugins/ai-providers/openai.lua Migrates OpenAI provider to capability-based provider module (chat + embeddings).
apisix/plugins/ai-providers/openai-compatible.lua Migrates openai-compatible provider to capability-based module.
apisix/plugins/ai-providers/gemini.lua Migrates Gemini provider to capability-based provider module.
apisix/plugins/ai-providers/deepseek.lua Migrates DeepSeek provider to capability-based provider module.
apisix/plugins/ai-providers/base.lua New provider base implementing build_request + non/stream parsing and sidecar request support.
apisix/plugins/ai-providers/azure-openai.lua Migrates Azure OpenAI provider to capability-based provider module.
apisix/plugins/ai-providers/anthropic.lua Migrates Anthropic provider to capability-based provider module.
apisix/plugins/ai-providers/aimlapi.lua Migrates AIMLAPI provider to capability-based provider module.
apisix/plugins/ai-protocols/openai-embeddings.lua New protocol adapter for embeddings: detection, usage extraction, deny responses, etc.
apisix/plugins/ai-protocols/openai-chat.lua New protocol adapter for chat completions: SSE parsing, usage/text extraction, deny responses.
apisix/plugins/ai-protocols/init.lua Protocol registry + detection order + message prepend/append helpers + converter lookup.
apisix/plugins/ai-protocols/converters/openai-embeddings-to-vertex-predict.lua New converter bridging OpenAI embeddings requests to Vertex Predict and converting responses back.
apisix/plugins/ai-protocols/converters/init.lua Converter registry + registration for embeddings→vertex predict converter.
apisix/plugins/ai-prompt-guard.lua Switches message extraction to protocol-aware helper.
apisix/plugins/ai-prompt-decorator.lua Switches prepend/append logic to protocol-aware helpers.
apisix/plugins/ai-drivers/vertex-ai.lua Deleted legacy driver implementation (migrated to providers/protocols/converters).
apisix/plugins/ai-drivers/schema.lua Deleted legacy driver schema (migrated to providers schema).
apisix/plugins/ai-drivers/openai-base.lua Deleted legacy monolithic driver base (replaced by provider base + transport + protocols).
apisix/plugins/ai-aliyun-content-moderation.lua Migrates to protocol-aware content extraction and deny response building; improves response-check behavior.
Comments suppressed due to low confidence (1)

apisix/plugins/ai-transport/sse.lua:110

  • _M.encode() appends an extra empty data: line for any event.data that doesn't already end with a newline because it iterates over (event.data .. "\n"). This changes the payload (decoder will see a trailing \n) and can break SSE clients/protocol adapters. Iterate over the actual lines in event.data without forcing an additional trailing newline (only emit an empty data: line when the original data contains an explicit trailing newline).
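The fix the reviewer suggests — split event.data on its actual newlines and only emit a trailing empty data: line when the data itself ends with a newline — can be sketched as follows. The function name is illustrative, not the actual sse.lua code.

```lua
-- Encode one SSE data payload as `data:` lines without appending a
-- synthetic trailing newline to the input first.
local function encode_data_lines(data)
    local out = {}
    local start = 1
    while true do
        local nl = data:find("\n", start, true)  -- plain-text search
        if not nl then
            -- last (or only) line, no trailing newline in the data
            out[#out + 1] = "data: " .. data:sub(start)
            break
        end
        out[#out + 1] = "data: " .. data:sub(start, nl - 1)
        start = nl + 1
        if start > #data then
            -- explicit trailing newline in the original data:
            -- emit a final empty data line to preserve it
            out[#out + 1] = "data: "
            break
        end
    end
    return table.concat(out, "\n") .. "\n\n"
end
```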


- Guard auth.query health check path rewrite with checks/active nil check
- Deep copy checks before mutation to prevent accumulation across requests
- Return nil in vertex-predict path when model is missing
- Pass ctx to get_provider_protocol for context-aware path resolution

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
moonming
moonming previously approved these changes Apr 7, 2026

nic-6443 commented Apr 7, 2026

The linux_apisix_current_luarocks CI failure is unrelated to this PR. It failed in t/cli/test_dns.sh with: failed: resolve upstream host in preread phase should works fine - a timing-dependent stream proxy DNS resolution test. Our changes only modify AI proxy plugins and tests.

The vertex-predict capability path function now returns nil when
ctx.var.llm_model is absent, instead of producing a bogus URL with
the literal string 'nil'. Update the test expectation accordingly.
@apache apache deleted a comment from jarvis9443 Apr 8, 2026
@nic-6443 nic-6443 merged commit be45f23 into apache:master Apr 8, 2026
23 of 26 checks passed
@nic-6443 nic-6443 deleted the refactor/ai-proxy-three-layer branch April 8, 2026 03:26



5 participants