fix: prevent apisix_llm_active_connections gauge leak when plugin exits early via ngx.exit()#13139

Open
shreemaan-abhishek wants to merge 3 commits into apache:master from shreemaan-abhishek:fix/llm-active-connections-gauge-leak
Conversation

@shreemaan-abhishek
Contributor

Problem

apisix_llm_active_connections is a Prometheus gauge that tracks in-flight LLM requests. The gauge leaks (never decrements) whenever a plugin calls ngx.exit() during request processing — not only in SSE streaming, but also in non-streaming responses.

Root cause: When ai-aliyun-content-moderation (or any other plugin) calls ngx.exit() inside a phase handler (e.g. body_filter, header_filter), OpenResty terminates the current coroutine immediately. This exit is not caught by the pcall wrapping the upstream request in ai-proxy/base.lua. As a result:

  1. exporter.inc_llm_active_connections(ctx) is called before pcall(do_request)
  2. A plugin calls ngx.exit() — either mid-stream (SSE) or after receiving a complete non-streaming response
  3. exporter.dec_llm_active_connections(ctx) placed after pcall is never reached
  4. Gauge leaks — only goes up, never down

This affects both ai-proxy and ai-proxy-multi in all request types: non-streaming chat, SSE streaming, and any other path where a downstream plugin exits early.
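The leak pattern above can be sketched language-agnostically. This is a hypothetical Python model (not APISIX code; the real implementation is Lua): a gauge is incremented before the upstream call, the decrement sits after it on the normal path, and an early exit that the wrapper does not catch skips the decrement.

```python
class Gauge:
    """Minimal stand-in for a Prometheus gauge (illustration only)."""
    def __init__(self):
        self.value = 0
    def inc(self):
        self.value += 1
    def dec(self):
        self.value -= 1

class EarlyExit(Exception):
    """Models ngx.exit() aborting the request flow past the pcall wrapper."""

active = Gauge()

def handle_request(moderation_denies):
    active.inc()             # exporter.inc_llm_active_connections(ctx)
    if moderation_denies:
        raise EarlyExit      # plugin calls ngx.exit() -- not caught here
    active.dec()             # exporter.dec_llm_active_connections(ctx), skipped

try:
    handle_request(moderation_denies=True)
except EarlyExit:
    pass

print(active.value)  # 1 -- the gauge leaked; it only ever goes up
```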

Fix

Remove the dec call from after pcall in ai-proxy/base.lua and instead rely solely on the log phase, which always runs even after ngx.exit(). Introduce a ctx.llm_active_connections_tracked flag to prevent double-decrement:

ai-proxy/base.lua — increment and set flag, no dec after pcall:

exporter.inc_llm_active_connections(ctx)
ctx.llm_active_connections_tracked = true
local ok, code_or_err, body = pcall(do_request)
-- dec is intentionally NOT here — handled in log phase

ai-proxy.lua and ai-proxy-multi.lua log phase:

function _M.log(conf, ctx)
    if ctx.llm_active_connections_tracked then
        exporter.dec_llm_active_connections(ctx)
        ctx.llm_active_connections_tracked = false
    end
    -- ...
end

The log phase runs unconditionally regardless of how the request ended (normal completion, upstream error, or ngx.exit() from any plugin), so the gauge is always correctly decremented.
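The same model with the fix applied (again a hypothetical Python sketch, not the Lua implementation): the decrement moves into a hook that always runs, and a tracked flag makes it idempotent, so the gauge is decremented exactly once no matter how the request ends.

```python
class Gauge:
    """Minimal stand-in for a Prometheus gauge (illustration only)."""
    def __init__(self):
        self.value = 0
    def inc(self):
        self.value += 1
    def dec(self):
        self.value -= 1

class EarlyExit(Exception):
    """Models ngx.exit() aborting the request flow."""

active = Gauge()

def log_phase(ctx):
    # Runs unconditionally, like the APISIX log phase after ngx.exit().
    if ctx.get("llm_active_connections_tracked"):
        active.dec()
        ctx["llm_active_connections_tracked"] = False

def handle_request(ctx, moderation_denies):
    active.inc()
    ctx["llm_active_connections_tracked"] = True
    try:
        if moderation_denies:
            raise EarlyExit  # early exit no longer leaks the gauge
    finally:
        log_phase(ctx)       # always reached
        log_phase(ctx)       # a repeat call is a no-op thanks to the flag

ctx = {}
try:
    handle_request(ctx, moderation_denies=True)
except EarlyExit:
    pass

print(active.value)  # 0 -- decremented exactly once despite the early exit
```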

Tests

Added a regression test in t/plugin/ai-aliyun-content-moderation.t:

  • Creates a route with prometheus + ai-proxy + ai-aliyun-content-moderation (check_response=true)
  • Sends a non-streaming chat request (LLM mock always returns offensive content)
  • Content moderation denies the response via ngx.exit(400)
  • Asserts apisix_llm_active_connections{...} 0 in Prometheus metrics after the log phase completes

All existing tests in t/plugin/prometheus-ai-proxy.t (40 tests) continue to pass.

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

…ts early via ngx.exit()

Signed-off-by: Abhishek Choudhary <shreemaan.abhishek@gmail.com>
dosubot (bot) added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels on Apr 1, 2026
nic-6443 previously approved these changes Apr 3, 2026
Signed-off-by: Abhishek Choudhary <shreemaan.abhishek@gmail.com>
