Bug/double heartbeat #157

Merged
freol35241 merged 4 commits into main from bug/double-heartbeat on Jun 11, 2026

Conversation

@luisheres

Contributor

Fix double-heartbeat flip-flop; move health to raw subjects

The mavlink connector published entity_health from two mappers (HEARTBEAT and SYS_STATUS) onto the same key with
disagreeing payloads, so the subject flip-flopped at ~1 Hz. vehicle_mode/vehicle_armed flip-flopped the same way
whenever more than one component shared a system id.

  • Guard vehicle_mode/vehicle_armed to the autopilot — drop HEARTBEATs with autopilot == MAV_AUTOPILOT_INVALID
    (GCS/companion/gimbal).
  • Stop publishing entity_health — computing EntityHealth is the entity_health connector's job; the mavlink connector
    now publishes raw health data instead.
  • sensor_status: per-subsystem health from SYS_STATUS, fanned out by source_id suffix (gps, compass, gyroscope, …);
    the present/enabled/health bits map to RUNNING (enabled and healthy), ERROR (enabled and unhealthy), or STANDBY
    (present but disabled).
  • vehicle_state — new typed keelson.VehicleState enum carrying MAV_STATE (the failsafe/severity axis sensors can't
    express).

Tests + docs updated; covered end-to-end by tlog-replay and SITL e2e.

⚠️ Adds a proto (VehicleState) — run generate_python.sh/generate_javascript.sh after pulling.

luisheres and others added 4 commits June 10, 2026 11:22
…AT only

With the default --target-component 0 ("any component"), every component on
the target system reaches dispatch. A GCS, companion computer, gimbal, or
MAVProxy forward also emits HEARTBEAT at ~1 Hz, and map_heartbeat mapped all
of them onto the same source_id — so vehicle_mode/vehicle_armed flip-flopped
every second between the real autopilot and the impostor.

Drop non-autopilot HEARTBEATs in map_heartbeat by keeping only messages with
autopilot != MAV_AUTOPILOT_INVALID, the same predicate _wait_boot_heartbeat
already uses to pick the autopilot out of a shared system id. No signature
change; with an explicit --target-component the guard is a harmless no-op.
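
The guard described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the connector's actual code: the `Heartbeat` dataclass, the function name `map_heartbeat_sketch`, and the returned dict are hypothetical stand-ins for the decoded MAVLink message and the published payloads. Only the predicate itself comes from the commit message; the numeric constants are the values from the MAVLink common message set.

```python
from dataclasses import dataclass
from typing import Optional

# Values from the MAVLink common message set.
MAV_AUTOPILOT_INVALID = 8         # component is not an autopilot (GCS, gimbal, ...)
MAV_MODE_FLAG_SAFETY_ARMED = 128  # base_mode bit that is set while armed


@dataclass
class Heartbeat:
    """Hypothetical stand-in for a decoded HEARTBEAT message."""
    autopilot: int
    base_mode: int


def map_heartbeat_sketch(msg: Heartbeat) -> Optional[dict]:
    """Return vehicle_armed for autopilot HEARTBEATs, None for all others."""
    if msg.autopilot == MAV_AUTOPILOT_INVALID:
        return None  # drop: this HEARTBEAT does not come from an autopilot
    armed = bool(msg.base_mode & MAV_MODE_FLAG_SAFETY_ARMED)
    return {"vehicle_armed": armed}


# 3 is MAV_AUTOPILOT_ARDUPILOTMEGA; a GCS typically reports MAV_AUTOPILOT_INVALID.
autopilot_hb = Heartbeat(autopilot=3, base_mode=0x80 | 0x01)
gcs_hb = Heartbeat(autopilot=MAV_AUTOPILOT_INVALID, base_mode=0)
```

With this filter in place, `gcs_hb` produces no publish at all, so only the autopilot's HEARTBEAT can write to the shared key.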

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

The connector published entity_health from two mappers — map_heartbeat
(from MAV_STATE) and map_sys_status (from the SYS_STATUS sensor bitmask) —
onto the same key at ~1 Hz each, with structurally different payloads. Last-
writer-wins made the subject oscillate every second (level computed two ways,
per-sensor detail and rate_hz blinking in/out). It also baked health *policy*
into the connector.
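
The oscillation is easy to reproduce in isolation. The following toy simulation is only a sketch: the in-memory `store` dict and the payload strings are hypothetical stand-ins for the key-value subject and the two structurally different payloads.

```python
def simulate(ticks: int) -> list:
    """Interleave two publishers on one key and record what a reader sees."""
    store = {}
    observed = []
    for _ in range(ticks):
        # Both mappers fire once per tick (~1 Hz each) on the SAME key.
        for payload in ("from_heartbeat", "from_sys_status"):
            store["entity_health"] = payload  # last writer wins
            observed.append(store["entity_health"])
    return observed
```

A subscriber to that single key sees the value alternate on every write, which is the flip-flop this commit removes by no longer publishing entity_health from either mapper.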

Remove all entity_health publishing: the helpers, the EntityHealth imports,
and the yields from both mappers (keeping fence_enabled). Producing
EntityHealth is the entity_health connector's job; the mavlink connector will
instead publish raw health data (sensor_status, vehicle_state — follow-up
commits). Update tests and docs accordingly, and the connectors/CLAUDE.md
liveliness guidance that previously told connectors to republish entity_health.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Translate the SYS_STATUS subsystem bitmask into the existing keelson
sensor_status subject, fanned out by source_id suffix — one publish per
*present* subsystem (e.g. sensor_status/<source-id>/gps). Curated to the
nine subsystems relevant to a surface/autonomous vehicle: attitude_estimator,
gps, arming_checks, remote_control, compass, gyroscope, accelerometer,
geofence, data_logging.

present/enabled/health bits map to OperatingMode: enabled+healthy -> RUNNING,
enabled+unhealthy -> ERROR, present+disabled -> STANDBY, not-present -> no
publish (absence = "not equipped", same convention as fence_enabled). The
connector only translates the bits; the NOMINAL/CRITICAL health verdict is the
entity_health connector's job. The geofence sensor_status (subsystem health)
and fence_enabled (enforcement state) intentionally coexist.
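
The bit-to-mode translation and suffix fan-out above could look roughly like this. It is a sketch under assumptions, not the connector's code: `OperatingMode` here is a local stand-in for the keelson enum, `map_sensor_status_sketch` is a hypothetical name, and only four of the curated subsystems are shown. The bit values are the `MAV_SYS_STATUS_SENSOR` flags from the MAVLink common message set; pairing them with these suffix names is an assumption.

```python
from enum import Enum
from typing import Dict


class OperatingMode(Enum):
    """Local stand-in for the keelson enum; member names are from the commit."""
    RUNNING = "RUNNING"
    ERROR = "ERROR"
    STANDBY = "STANDBY"


# Subset of MAV_SYS_STATUS_SENSOR bits; the suffix pairing is an assumption.
SUBSYSTEM_BITS = {
    "gyroscope": 0x01,
    "accelerometer": 0x02,
    "compass": 0x04,
    "gps": 0x20,
}


def map_sensor_status_sketch(
    present: int, enabled: int, health: int, source_id: str
) -> Dict[str, OperatingMode]:
    """Translate the three SYS_STATUS bitmasks into one entry per present subsystem."""
    out = {}
    for suffix, bit in SUBSYSTEM_BITS.items():
        if not present & bit:
            continue  # not equipped: publish nothing
        if not enabled & bit:
            mode = OperatingMode.STANDBY  # present but disabled
        elif health & bit:
            mode = OperatingMode.RUNNING  # enabled and healthy
        else:
            mode = OperatingMode.ERROR    # enabled but unhealthy
        out[f"sensor_status/{source_id}/{suffix}"] = mode
    return out


# All four present; compass disabled; gps enabled but unhealthy.
example = map_sensor_status_sketch(present=0x27, enabled=0x23, health=0x03, source_id="vessel")
```

The three arguments correspond to the SYS_STATUS fields `onboard_control_sensors_present`, `onboard_control_sensors_enabled`, and `onboard_control_sensors_health`. No NOMINAL/CRITICAL verdict is computed here; the sketch only translates bits.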

Adds a SYS_STATUS frame to the tlog-replay e2e fixture so sensor_status is
covered end-to-end; new unit tests for the mode mapping and suffix fan-out.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Removing entity_health dropped the vehicle-level severity axis (failsafe /
emergency) that the per-sensor sensor_status fan-out cannot express. Restore
it as a new typed, vehicle-agnostic payload keelson.VehicleState + vehicle_state
subject, published from map_heartbeat (under the existing autopilot-only guard).

MAV_STATE maps 1:1 onto VehicleState.State (UNINIT->UNKNOWN, BOOT->BOOTING,
... CRITICAL->CRITICAL, EMERGENCY->EMERGENCY, FLIGHT_TERMINATION->TERMINATED).
A typed enum (not a TimestampedString like vehicle_mode) is the right call here:
MAV_STATE is a standardized closed enum and operational lifecycle/severity is a
universal concept that abstracts across vehicles — unlike autopilot-specific
flight modes. entity_health config can band vehicle_state.state by enum name.
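
A lookup along these lines would implement the mapping. This is a sketch, not the actual mapper: `State` is a local stand-in for VehicleState.State, `lookup_state_sketch` is a hypothetical name, and only the five pairs spelled out above are included. The entries elided by "..." are deliberately left out rather than guessed, so the sketch returns `None` for them. The integer keys are the MAV_STATE values from the MAVLink common message set.

```python
from enum import Enum
from typing import Optional


class State(Enum):
    """Local stand-in for VehicleState.State (only members named in the commit)."""
    UNKNOWN = "UNKNOWN"
    BOOTING = "BOOTING"
    CRITICAL = "CRITICAL"
    EMERGENCY = "EMERGENCY"
    TERMINATED = "TERMINATED"


# MAV_STATE value -> State, limited to the pairs listed in the commit message.
MAV_STATE_TO_STATE = {
    0: State.UNKNOWN,     # MAV_STATE_UNINIT
    1: State.BOOTING,     # MAV_STATE_BOOT
    5: State.CRITICAL,    # MAV_STATE_CRITICAL
    6: State.EMERGENCY,   # MAV_STATE_EMERGENCY
    8: State.TERMINATED,  # MAV_STATE_FLIGHT_TERMINATION
}


def lookup_state_sketch(system_status: int) -> Optional[State]:
    """Map HEARTBEAT.system_status to a State, or None if not covered here."""
    return MAV_STATE_TO_STATE.get(system_status)
```

Because the result is an enum member, a consumer can compare on `lookup_state_sketch(6).name` (here `"EMERGENCY"`) instead of parsing a free-form string.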

Adds the proto, the subjects.yaml entry, the mapper + MAV_STATE lookup, unit
tests for the mapping, and vehicle_state to the e2e expected-channel sets
(covered end-to-end by the tlog-replay test). Generated SDK/docs are
regenerated out-of-band (gitignored).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

luisheres requested a review from freol35241 on June 10, 2026 20:00

freol35241 left a comment

Contributor

👍

freol35241 merged commit fe0dd95 into main on Jun 11, 2026
8 checks passed
freol35241 deleted the bug/double-heartbeat branch on June 11, 2026 05:01

2 participants