feat: include alert status transitions in poll-alerts payload#6601
feat: include alert status transitions in poll-alerts payload#6601harisrujanc wants to merge 4 commits into
Conversation
Extend the poll-alerts Pusher payload with per-alert status transition
data so downstream consumers can detect alert status changes without
follow-up REST calls.
New payload fields (backward compatible, additive):
- alerts: list of {fingerprint, status, previous_status} dicts
- statuses: fingerprint -> current status map
- resolved_fingerprints: convenience subset of resolved alerts
Implementation:
- Add poll_alerts_payload() helper in keep/api/consts.py
- Add get_last_alert_statuses_by_fingerprints() DB query
- Capture previous status before set_last_alert() in process_event_task
- Update routes/alerts.py trigger sites to use poll_alerts_payload()
- Widen PollAlertsPayload TypeScript type (no behavioral UI change)
- Add comprehensive tests for payload builder and UI parser
Extends keephq#6591 (fingerprints in poll-alerts).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
shahargl
left a comment
There was a problem hiding this comment.
Nice extension of #6591 — clean helper, backward-compatible payload, and the previous-status capture is correctly placed before set_last_alert overwrites it. One real blocker around cross-dialect JSON handling, plus a correctness caveat and a couple of nits. Requesting changes.
Blocking
1. get_last_alert_statuses_by_fingerprints reads JSON in Python — returns None on MySQL/SQLite
The new query selects the whole Alert.event column and reads it in Python:
select(LastAlert.fingerprint, Alert.event)...
...
if isinstance(event, dict):
status_dict[fingerprint] = event.get("status")
else:
status_dict[fingerprint] = NoneAlert.event is Column(JSON). On Postgres it deserializes to a dict (so this works — which is why the Postgres CI job passes), but on MySQL and SQLite a JSON column commonly comes back as a string, so isinstance(event, dict) is False and previous_status becomes None for every alert on those backends. That silently breaks the headline feature (transition detection) for MySQL/SQLite deployments.
Every other status/JSON read in the codebase uses the dialect-aware helper get_json_extract_field (see keep/api/core/db_utils.py and the many get_json_extract_field(session, Alert.event, "status") call sites in db.py). Please follow that pattern — it also avoids hauling the entire event blob over the wire:
from keep.api.core.db_utils import get_json_extract_field
with Session(engine) as session:
status_field = get_json_extract_field(session, Alert.event, "status")
query = (
select(LastAlert.fingerprint, status_field)
.join(Alert, LastAlert.alert_id == Alert.id)
.where(LastAlert.tenant_id == tenant_id)
.where(LastAlert.fingerprint.in_(fingerprints))
)
results = session.execute(query).all()
return {fingerprint: status for fingerprint, status in results}Please also add a small DB-backed test (run on sqlite) asserting get_last_alert_statuses_by_fingerprints returns the real status — the current tests only cover the pure poll_alerts_payload builder (dialect-independent), which is why this slipped through.
Should address (correctness caveat)
2. previous_status ignores enrichment-based status overrides
It reads Alert.event.status, but the effective status can be overridden by AlertEnrichment.enrichments.status (the existing status queries in db.py, e.g. get_alerts_metadata_by_ids, always coalesce both enrichments.status and event.status). So a dismissed/suppressed alert whose status lives in enrichments will report the wrong previous_status. If that's acceptable for the intended consumers, please add a comment documenting that previous_status reflects the raw event status, not the enrichment-effective status.
Nits
resolved_fingerprintsis derivable fromstatuses/alerts, so it's slightly redundant payload — fine to keep as a convenience, just noting it.- Good cleanup removing the unused
fingerprints_for_poll_payloadimports; payload stays backward-compatible and the UI parser correctly ignores the new fields.
CI note (not caused by this PR)
The two failing DB jobs (mysql-with-redis, sqlite-without-redis) fail on flaky Playwright E2E tests unrelated to this change:
test_filter_search_timeframe_combination_with_queryparams— strict mode violation:[data-testid='facet']resolved to 2 elementstest_run_workflow_from_alert_and_incident— "Workflow started successfully" resolved to 2 elements
Postgres + unit + integration + frontend all pass. These just need a re-run; they don't reflect a problem in this PR.
Summary
The only blocker is the cross-dialect JSON read (#1) so the feature works on MySQL/SQLite. Fix that (+ a sqlite-backed test and a note on enrichment status), re-run the flaky E2E jobs, and this is good to merge.
- Replace Python-side event dict parsing with get_json_extract_field() which handles Postgres, MySQL, and SQLite JSON column differences. - Only extracts the 'status' field server-side instead of the full event blob, reducing wire traffic. - Add docstring noting that previous_status reflects raw event status, not enrichment-effective status (acceptable for transition detection). - Add DB-backed sqlite tests for get_last_alert_statuses_by_fingerprints covering: single alert, resolved status, unknown fingerprints, empty input, and multiple fingerprints. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
CI Status NoteThe 3 failing E2E checks (mysql-with-redis, postgresql-without-redis, sqlite-without-redis) are pre-existing flaky tests unrelated to this PR:
Why these are not caused by this PR:
A re-run should clear these. Happy to rebase if needed. |
Closes #6602
Summary
poll_alerts_payload()helper inkeep/api/consts.pyto centralize payload construction.get_last_alert_statuses_by_fingerprints()batch query inkeep/api/core/db.py.set_last_alert()overwrites it inprocess_event_task.py.alerts,statuses, andresolved_fingerprintsfields in thepoll-alertsPusher payload when transition data is available.poll_alerts_payload(fingerprints)(fingerprints-only mode).PollAlertsPayloadTypeScript type with optional transition fields (no behavioral change).fingerprintsfield andFINGERPRINT_PAYLOAD_LIMITfallback behavior.Why
poll-alertscurrently only includes fingerprints. Downstream consumers that need to know whether an alert resolved (or transitioned to any other status) must make a follow-up REST call to/alerts/query. With status metadata in the payload, consumers can act event-driven without extra API calls.Extends #6591.
Test plan
pytest tests/test_poll_alerts_payload.py-- 11 new cases for payload builder (fingerprints-only, transitions, over-limit fallback)jest utils/hooks/__tests__/useAlertPolling.test.ts-- 2 new cases proving extra fields don't break fingerprint parsing