Skip to content

feat: include alert status transitions in poll-alerts payload#6601

Open
harisrujanc wants to merge 4 commits into
keephq:mainfrom
harisrujanc:harisrujanc-poll-alerts-status-transition
Open

feat: include alert status transitions in poll-alerts payload#6601
harisrujanc wants to merge 4 commits into
keephq:mainfrom
harisrujanc:harisrujanc-poll-alerts-status-transition

Conversation

@harisrujanc

@harisrujanc harisrujanc commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Closes #6602

Summary

  • Add poll_alerts_payload() helper in keep/api/consts.py to centralize payload construction.
  • Add get_last_alert_statuses_by_fingerprints() batch query in keep/api/core/db.py.
  • Capture previous alert status before set_last_alert() overwrites it in process_event_task.py.
  • Include alerts, statuses, and resolved_fingerprints fields in the poll-alerts Pusher payload when transition data is available.
  • Update route-based trigger sites to use poll_alerts_payload(fingerprints) (fingerprints-only mode).
  • Widen PollAlertsPayload TypeScript type with optional transition fields (no behavioral change).
  • Preserve existing fingerprints field and FINGERPRINT_PAYLOAD_LIMIT fallback behavior.

Why

poll-alerts currently only includes fingerprints. Downstream consumers that need to know whether an alert resolved (or transitioned to any other status) must make a follow-up REST call to /alerts/query. With status metadata in the payload, consumers can act event-driven without extra API calls.

Extends #6591.

Test plan

  • pytest tests/test_poll_alerts_payload.py -- 11 new cases for payload builder (fingerprints-only, transitions, over-limit fallback)
  • jest utils/hooks/__tests__/useAlertPolling.test.ts -- 2 new cases proving extra fields don't break fingerprint parsing

Extend the poll-alerts Pusher payload with per-alert status transition
data so downstream consumers can detect alert status changes without
follow-up REST calls.

New payload fields (backward compatible, additive):
- alerts: list of {fingerprint, status, previous_status} dicts
- statuses: fingerprint -> current status map
- resolved_fingerprints: convenience subset of resolved alerts

Implementation:
- Add poll_alerts_payload() helper in keep/api/consts.py
- Add get_last_alert_statuses_by_fingerprints() DB query
- Capture previous status before set_last_alert() in process_event_task
- Update routes/alerts.py trigger sites to use poll_alerts_payload()
- Widen PollAlertsPayload TypeScript type (no behavioral UI change)
- Add comprehensive tests for payload builder and UI parser

Extends keephq#6591 (fingerprints in poll-alerts).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. Enhancement New feature or request labels Jun 26, 2026
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@harisrujanc harisrujanc changed the title feat: add status-transition metadata to poll-alerts Pusher payload feat: include alert status transitions in poll-alerts payload Jun 26, 2026

@shahargl shahargl left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice extension of #6591 — clean helper, backward-compatible payload, and the previous-status capture is correctly placed before set_last_alert overwrites it. One real blocker around cross-dialect JSON handling, plus a correctness caveat and a couple of nits. Requesting changes.

Blocking

1. get_last_alert_statuses_by_fingerprints reads JSON in Python — returns None on MySQL/SQLite

The new query selects the whole Alert.event column and reads it in Python:

select(LastAlert.fingerprint, Alert.event)...
...
if isinstance(event, dict):
    status_dict[fingerprint] = event.get("status")
else:
    status_dict[fingerprint] = None

Alert.event is Column(JSON). On Postgres it deserializes to a dict (so this works — which is why the Postgres CI job passes), but on MySQL and SQLite a JSON column commonly comes back as a string, so isinstance(event, dict) is False and previous_status becomes None for every alert on those backends. That silently breaks the headline feature (transition detection) for MySQL/SQLite deployments.

Every other status/JSON read in the codebase uses the dialect-aware helper get_json_extract_field (see keep/api/core/db_utils.py and the many get_json_extract_field(session, Alert.event, "status") call sites in db.py). Please follow that pattern — it also avoids hauling the entire event blob over the wire:

from keep.api.core.db_utils import get_json_extract_field

with Session(engine) as session:
    status_field = get_json_extract_field(session, Alert.event, "status")
    query = (
        select(LastAlert.fingerprint, status_field)
        .join(Alert, LastAlert.alert_id == Alert.id)
        .where(LastAlert.tenant_id == tenant_id)
        .where(LastAlert.fingerprint.in_(fingerprints))
    )
    results = session.execute(query).all()
return {fingerprint: status for fingerprint, status in results}

Please also add a small DB-backed test (run on sqlite) asserting get_last_alert_statuses_by_fingerprints returns the real status — the current tests only cover the pure poll_alerts_payload builder (dialect-independent), which is why this slipped through.

Should address (correctness caveat)

2. previous_status ignores enrichment-based status overrides

It reads Alert.event.status, but the effective status can be overridden by AlertEnrichment.enrichments.status (the existing status queries in db.py, e.g. get_alerts_metadata_by_ids, always coalesce both enrichments.status and event.status). So a dismissed/suppressed alert whose status lives in enrichments will report the wrong previous_status. If that's acceptable for the intended consumers, please add a comment documenting that previous_status reflects the raw event status, not the enrichment-effective status.

Nits

  • resolved_fingerprints is derivable from statuses/alerts, so it's slightly redundant payload — fine to keep as a convenience, just noting it.
  • Good cleanup removing the unused fingerprints_for_poll_payload imports; payload stays backward-compatible and the UI parser correctly ignores the new fields.

CI note (not caused by this PR)

The two failing DB jobs (mysql-with-redis, sqlite-without-redis) fail on flaky Playwright E2E tests unrelated to this change:

  • test_filter_search_timeframe_combination_with_queryparams — strict mode violation: [data-testid='facet'] resolved to 2 elements
  • test_run_workflow_from_alert_and_incident — "Workflow started successfully" resolved to 2 elements

Postgres + unit + integration + frontend all pass. These just need a re-run; they don't reflect a problem in this PR.

Summary

The only blocker is the cross-dialect JSON read (#1) so the feature works on MySQL/SQLite. Fix that (+ a sqlite-backed test and a note on enrichment status), re-run the flaky E2E jobs, and this is good to merge.

harisrujanc and others added 2 commits June 26, 2026 22:44
- Replace Python-side event dict parsing with get_json_extract_field()
  which handles Postgres, MySQL, and SQLite JSON column differences.
- Only extracts the 'status' field server-side instead of the full
  event blob, reducing wire traffic.
- Add docstring noting that previous_status reflects raw event status,
  not enrichment-effective status (acceptable for transition detection).
- Add DB-backed sqlite tests for get_last_alert_statuses_by_fingerprints
  covering: single alert, resolved status, unknown fingerprints, empty
  input, and multiple fingerprints.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@harisrujanc

Copy link
Copy Markdown
Contributor Author

CI Status Note

The 3 failing E2E checks (mysql-with-redis, postgresql-without-redis, sqlite-without-redis) are pre-existing flaky tests unrelated to this PR:

  • test_filter_search_timeframe_combination_with_queryparams — strict mode violation: [data-testid='facet'] resolved to 2 elements (duplicate severity/status facet)
  • test_filter_timeframe_combination_with_queryparams — same duplicate facet issue

Why these are not caused by this PR:

  • This PR only modifies the poll-alerts Pusher payload structure and a backend DB query — no facet/filter/sort UI logic was changed.
  • The same failures appear on unrelated branches (e.g. feat/disable-ai-temperature in run 28284870982).
  • All tests that exercise this PR's code pass: unit tests, frontend tests, integration tests, build.

A re-run should clear these. Happy to rebase if needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Enhancement New feature or request size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Include alert status transitions in poll-alerts Pusher payload

2 participants