feat(responseobs): add threshold-gated large-response counter by adamyeats · Pull Request #243 · grafana/sqlds

adamyeats · 2026-04-22T12:24:24Z

Summary

Adds a threshold-gated counter, plugins_sql_large_responses_total, inside the responseobs subpackage introduced by #242. The counter increments once per Observation that crosses a configured threshold — i.e. at the same decision point that fires the structured warn log in #242.

Stacked on #242 — base branch is feat/responseobs, not main. Review #242 first; the diff here shows only the counter additions.

Shape

plugins_sql_large_responses_total counter{datasource_type, app_url, datasource_uid}

Cardinality note for reviewers

This is the part most likely to draw a reflexive reject, so calling it out:

Increments happen only when a threshold is crossed (default 50 MiB OR 1M rows, per #242). Normal-sized responses produce no new series.
Steady-state series count ≈ (stacks with abusive queries) × (avg large-datasource-instances per stack). Order of magnitude: tens to hundreds, not tens of thousands.
Contrast with the histograms in #241 (feat: emit response-size histograms from DBQuery), which are labeled {datasource_type} only. Putting app_url/uid on the histogram would have blown Mimir limits; putting them on this threshold-gated counter is the intended trade: per-stack identification for alerting, with the gate preventing unbounded growth.

If cardinality does prove higher than estimated in production, a Prometheus relabel drop on app_url is the immediate mitigation — documented here so oncall doesn't have to rediscover it.
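
To make the cardinality argument concrete, here is a minimal sketch of the gate, not the code in this PR. It uses a map-backed stand-in instead of a real Prometheus CounterVec so it stays dependency-free. The `observation` struct, `gatedCounter` type, and the strict `>` comparison are assumptions for illustration; only the default limits (50 MiB, 1M rows) and the label set come from the description above.

```go
package main

// Default limits quoted in the PR description (50 MiB OR 1M rows).
const (
	defaultMaxBytes = 50 << 20 // 50 MiB
	defaultMaxRows  = 1_000_000
)

// observation is a hypothetical, trimmed-down stand-in for the
// responseobs Observation type; the real fields may differ.
type observation struct {
	DatasourceType string
	AppURL         string
	DatasourceUID  string
	Bytes          int64
	Rows           int64
}

// crossesThreshold reports whether either limit is exceeded.
// ASSUMPTION: "crossed" means strictly greater than the limit.
func crossesThreshold(o observation) bool {
	return o.Bytes > defaultMaxBytes || o.Rows > defaultMaxRows
}

// seriesKey mirrors the counter's label set:
// datasource_type, app_url, datasource_uid.
type seriesKey struct{ dsType, appURL, uid string }

// gatedCounter is a map-backed stand-in for a labeled counter. As with a
// real labeled counter, a series only exists after its first increment.
// It is not safe for concurrent use; a real CounterVec is.
type gatedCounter struct{ series map[seriesKey]float64 }

func newGatedCounter() *gatedCounter {
	return &gatedCounter{series: make(map[seriesKey]float64)}
}

// observe increments the counter only when a threshold is crossed, so
// normal-sized responses never create a series.
func (c *gatedCounter) observe(o observation) {
	if !crossesThreshold(o) {
		return
	}
	c.series[seriesKey{o.DatasourceType, o.AppURL, o.DatasourceUID}]++
}

// seriesCount returns the number of distinct label combinations seen.
func (c *gatedCounter) seriesCount() int { return len(c.series) }
```

Under these assumptions, any number of small responses leaves `seriesCount()` at zero, and repeated large responses from one datasource instance add exactly one series.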

Label choices

app_url replaces slug. backend.GrafanaConfig exposes AppURL() but no dedicated slug accessor. The #242 log field uses app_url for the same reason, which keeps labels consistent between log and counter. If anyone knows a reliable slug source I missed, happy to switch.
datasource_uid included (not on histograms) — the counter is where operators drill into a specific abusive datasource instance, so the UID is load-bearing. The threshold gate makes the cardinality cost acceptable.
No datasource_name — would require label sanitization (names can have spaces/special chars). UID is sufficient for identification.

Integration point

One line in Observe:

largeResponsesCounter.WithLabelValues(obs.Datasource.Type, appURL, obs.Datasource.UID).Inc()

Placed right after the backend.Logger.Warn call. No caller-side changes needed — every consumer of responseobs.Observe picks up the counter automatically.

Suggested alert (for downstream consumers)

Not shipping alert rules — the consuming team owns that. Suggested shape:

sum by (datasource_type, app_url, datasource_uid) (
  rate(plugins_sql_large_responses_total[15m])
) > 0

i.e. "any SQL datasource producing large responses in the last 15m". Tune as needed.

Introduces plugins_sql_large_responses_total counter, incremented once per Observation that crosses a configured threshold. Cardinality is self-limiting because increments only happen on crossings. Labels: datasource_type, app_url, datasource_uid. app_url replaces the earlier "slug" label because backend.GrafanaConfig exposes no dedicated slug accessor; operators can derive a slug by parsing the URL.
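
As an illustration of deriving a slug from app_url on the consuming side, here is a hedged sketch using only the Go standard library. `slugFromAppURL` is a hypothetical helper, not part of this PR, and it assumes the slug is the first DNS label of the URL's host (for example `mystack` in `https://mystack.example.com/`). That convention is an assumption and will not hold for every deployment.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// slugFromAppURL returns the first DNS label of the app URL's host.
// ASSUMPTION: the stack slug is encoded as the leftmost host label;
// self-hosted or path-based deployments need a different rule.
func slugFromAppURL(appURL string) (string, error) {
	u, err := url.Parse(appURL)
	if err != nil {
		return "", fmt.Errorf("parse app_url: %w", err)
	}
	host := u.Hostname() // strips any port
	if host == "" {
		return "", fmt.Errorf("app_url %q has no host", appURL)
	}
	// Cut returns the whole host when it contains no dot.
	slug, _, _ := strings.Cut(host, ".")
	return slug, nil
}
```

A URL without a scheme and host, such as a bare path, yields an error rather than a guessed slug, which is the safer behaviour for alert routing.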

adamyeats requested a review from a team as a code owner April 22, 2026 12:24

adamyeats wants to merge 1 commit into feat/responseobs from feat/responseobs-counter