signalsLatest/lastSeen via signal_latest projection #284

Open
zer0stars wants to merge 1 commit into main from feat/signals-latest-projection
@zer0stars
Member

Summary

  • Rewrites getLatestQuery so the aggregate shapes match the signal_latest_by_subject_source_name projection added in DIMO-Network/model-garage#257 — non-location signals use argMax + max(timestamp), location signals use argMaxIf + maxIf with the same non-(0, 0) filter the projection carries, mixed queries UNION ALL the two branches.
  • Drops DEFAULT_LOOKBACK / Service.lookbackFrom and the timestamp WHERE clause on both getLatestQuery and getLastSeenQuery. The timestamp filter prevents projection matching, and the projection makes the lookback unnecessary — queries now read a handful of pre-aggregated rows per (subject, source, name) regardless of history depth.

Why

signalsLatest / lastSeen previously needed a 60-day lookback to keep scans bounded, so vehicles that had not reported in the last 60 days got empty results from signalsLatest. The projection stores one pre-aggregated row per (subject, source, name), so the lookback is no longer load-bearing and the full history is cheap to query.

Verification

ClickHouse 25.3 (podman), 20M-row ReplacingMergeTree signal table, 200 subjects, 8 partitions, with the new projection materialized:

| Query | system.query_log.projections match | Rows read |
| --- | --- | --- |
| non-location latest (speed, isIgnitionOn) | yes | 99.5k |
| location latest (argMaxIf + maxIf non-(0,0)) | yes | 99.5k |
| UNION ALL mixed + source filter | yes | 199k (both branches) |
| lastSeen (max(timestamp)) | yes | 99.5k |

Results byte-identical between projection and force-base runs (SETTINGS optimize_use_projections = 0).
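
For anyone repeating the projection-match check, a small Go sketch of the comparison follows. It assumes rows have already been read from `system.query_log` (its `projections` and `read_rows` columns) into a hypothetical `queryLogRow` struct, and that a recorded projection name may be qualified with database and table. The sample row in `main` is illustrative, not measured data.

```go
package main

import (
	"fmt"
	"strings"
)

// queryLogRow holds two columns of system.query_log for one finished query.
type queryLogRow struct {
	Projections []string // projections used while executing the query
	ReadRows    uint64   // rows read by the query
}

// usedProjection reports whether any recorded name equals want or ends with
// ".want" (covering names qualified with database and table).
func usedProjection(row queryLogRow, want string) bool {
	for _, p := range row.Projections {
		if p == want || strings.HasSuffix(p, "."+want) {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative row; the database name "default" is an assumption.
	row := queryLogRow{
		Projections: []string{"default.signal.signal_latest_by_subject_source_name"},
		ReadRows:    1000,
	}
	fmt.Println(usedProjection(row, "signal_latest_by_subject_source_name"))
}
```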

Deploy order

Must deploy only after DIMO-Network/model-garage#257 is materialized in the target environment. Without the projection, the now-unbounded queries will do full-history scans — regression vs today's 60-day lookback. Hold the deploy until system.mutations shows the MATERIALIZE PROJECTION mutation is done on signal and event.
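
The deploy gate could be scripted along these lines. This is a sketch under assumptions: the `mutation` struct and `pendingMaterializations` helper are hypothetical, and the rows are assumed to come from the `table`, `command`, and `is_done` columns of `system.mutations`.

```go
package main

import (
	"fmt"
	"strings"
)

// mutation holds three columns of one system.mutations row.
type mutation struct {
	Table   string
	Command string
	IsDone  bool
}

// pendingMaterializations returns the tables that still have an unfinished
// MATERIALIZE PROJECTION mutation; other mutation kinds are ignored.
func pendingMaterializations(muts []mutation) []string {
	var pending []string
	for _, m := range muts {
		if strings.Contains(m.Command, "MATERIALIZE PROJECTION") && !m.IsDone {
			pending = append(pending, m.Table)
		}
	}
	return pending
}

func main() {
	// Illustrative rows only.
	muts := []mutation{
		{Table: "signal", Command: "MATERIALIZE PROJECTION signal_latest_by_subject_source_name", IsDone: true},
		{Table: "event", Command: "MATERIALIZE PROJECTION example_projection", IsDone: false},
	}
	fmt.Println(pendingMaterializations(muts)) // tables to wait on
}
```

An empty result only means no materialization is in flight. It does not by itself prove the projection was ever created, so the query_log check on live traffic is still worth doing.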

Removed config

  • DEFAULT_LOOKBACK env var (was 60 in both values.yaml and values-prod.yaml)
  • DefaultLookbackDays field in config.Settings
  • Service.defaultLookback field, Service.lookbackFrom(), defaultLookbackDays const
  • Comment block in settings.sample.yaml

Test plan

  • go test ./... passes (unit + e2e testcontainers)
  • Live projection-match verification against ClickHouse 25.3 for every query shape the new code emits
  • Dev deploy after model-garage#257 is materialized in dev; confirm system.query_log.projections on live traffic shows the projection being hit
  • Prod deploy after materialization in prod; monitor signalsLatest p99
  • Spot-check a quiet vehicle (no reports in 60+ days) now returns data via signalsLatest

Rewrites getLatestQuery to emit aggregate shapes that ClickHouse's
projection matcher picks up against the signal_latest_by_subject_source_name
projection added in model-garage. Non-location signals use
argMax + max(timestamp); location signals use argMaxIf + maxIf with the
same non-(0, 0) filter the projection carries. Mixed queries UNION ALL
the two shapes so each branch matches cleanly.

Drops DEFAULT_LOOKBACK / Service.lookbackFrom and the timestamp WHERE
clause on both getLatestQuery and getLastSeenQuery. The timestamp filter
prevents projection matching, and the projection itself makes the
lookback unnecessary -- queries now read a handful of pre-aggregated
rows per (subject, source, name) regardless of history depth.

Verified on ClickHouse 25.3 with a 20M-row ReplacingMergeTree signal
table: all four query shapes (non-location latest, location latest,
mixed UNION ALL with source filter, lastSeen) match the projection via
system.query_log.projections, and results are byte-identical to
force-base (optimize_use_projections=0) runs.

Must deploy only after DIMO-Network/model-garage#257 is materialized in
the target environment; otherwise these queries fall back to
full-history scans.