Can Espectre be extended to provide bounding boxes or skeleton tracking for detected motion? #73

sanjulaonline · 2026-01-19T12:04:02Z

Espectre provides strong motion and anomaly detection using spatio-temporal statistics, but it does not output spatial coordinates (bounding boxes or keypoints).

I am exploring whether:
• Espectre can be extended to approximate person localization (e.g., a bounding box via spatial variance maps), or
• Espectre is better used as a motion trigger combined with a downstream object detection or pose estimation model.

My target use case is mobile camera input (non-VR), where a moving person is detected and visually tracked with a bounding box or skeleton overlay.

Has anyone experimented with:
• Using Espectre’s spatial features as an ROI selector?
• Combining Espectre with MediaPipe / YOLO / OpenPose?
• Mathematical approaches to infer approximate position from spatio-temporal variance?

Any insights or prior work would be appreciated.

francescopace (Maintainer) · 2026-01-20T20:58:00Z

Hi @sanjulaonline, this is a very interesting topic!
Espectre does not output explicit spatial coordinates (bounding boxes or keypoints), and it is not designed as a geometric localizer.

That said, its spatio-temporal statistics can be interpreted as motion / saliency signals:
• Spatial variance or energy maps may provide a very coarse ROI, e.g. via thresholding and connected-component clustering
• This is inherently noisy and unstable (background motion, lighting changes, overlapping motion), so it should be seen as a weak spatial prior, not a reliable localization method
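
To make the thresholding and connected-component idea concrete, here is a minimal sketch in plain Python. It assumes you already have some 2D grid of per-cell variance or energy values; `variance_map`, `threshold`, and `coarse_roi` are hypothetical names for illustration, and nothing here is an Espectre API. It returns the bounding box of the largest above-threshold blob, which is only as stable as the map it is fed.

```python
from collections import deque


def coarse_roi(variance_map, threshold):
    """Return (row_min, col_min, row_max, col_max) of the largest
    4-connected region whose values are >= threshold, or None.

    variance_map is a hypothetical 2D list of floats (rows of cells).
    """
    rows = len(variance_map)
    cols = len(variance_map[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    best = None  # (region size, bounding box)

    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or variance_map[r][c] < threshold:
                continue
            # Flood-fill one connected component with a BFS.
            queue = deque([(r, c)])
            seen[r][c] = True
            size = 0
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                size += 1
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and variance_map[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if best is None or size > best[0]:
                best = (size, (r0, c0, r1, c1))

    return None if best is None else best[1]
```

Keeping only the largest component is one simple way to ignore small isolated cells, but it will still jump around when two moving regions are of similar size.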

For your use case, the most practical approach is:

Espectre as motion / attention trigger → downstream CV model
• Use Espectre to gate frames temporally or propose a rough ROI
• Use YOLO / MediaPipe / OpenPose for bounding boxes and pose/keypoints
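
A rough sketch of the temporal gating part, in plain Python. It assumes you have a per-frame motion score from some source and a detector you can call per frame; `motion_scores`, `detect`, `threshold`, and `hold` are all hypothetical names, not part of Espectre or of any of the vision libraries mentioned. The `hold` parameter keeps the detector running for a few frames after the last trigger so a briefly still person is not dropped immediately.

```python
def gated_detections(frames, motion_scores, detect, threshold, hold=0):
    """Yield (frame_index, detections) only for frames that pass the gate.

    frames        -- iterable of frames (any objects the detector accepts)
    motion_scores -- iterable of floats, one per frame (hypothetical signal)
    detect        -- callable(frame) -> detections (e.g. a YOLO wrapper)
    threshold     -- score at or above which the gate opens
    hold          -- extra frames to keep the gate open after a trigger
    """
    remaining = 0  # frames left for which the gate stays open
    for index, (frame, score) in enumerate(zip(frames, motion_scores)):
        if score >= threshold:
            remaining = hold + 1
        if remaining > 0:
            remaining -= 1
            yield index, detect(frame)


# Usage with a stub detector standing in for a real model:
if __name__ == "__main__":
    frames = ["f0", "f1", "f2", "f3", "f4"]
    scores = [0.1, 0.9, 0.2, 0.1, 0.1]
    for index, result in gated_detections(
        frames, scores, lambda f: [("person", f)], threshold=0.5, hold=1
    ):
        print(index, result)
```

The expensive model only runs on frames the gate lets through, which is the main benefit of using a motion signal as a trigger rather than as a localizer.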

Without a learned spatial model, variance-based methods cannot reliably infer:
• object boundaries,
• articulated body structure,
• stable 2D/3D coordinates.

So Espectre works well as an anomaly / motion detector or attention mechanism, but localization and pose estimation are better handled by standard vision models downstream.

Note: if by “localization” you mean physical position estimation from RF signals, that is a different problem space. WiFi CSI–based localization typically requires ≥3 phase-synchronized devices to estimate Angle of Arrival (AoA) and triangulate 2D/3D coordinates: YouTube Video
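
For the triangulation step only, a minimal 2D sketch could look like the following. It assumes the AoA estimates already exist as bearing angles in a shared global frame and that the device positions are known; `anchors`, `bearings`, and `triangulate` are hypothetical names, and the hard part (estimating AoA from phase-synchronized CSI) is not shown. Each bearing defines a line through its anchor, and the function returns the least-squares intersection of those lines.

```python
import math


def triangulate(anchors, bearings):
    """Least-squares intersection of 2D bearing lines.

    anchors  -- list of (x, y) device positions (assumed known)
    bearings -- list of angles in radians, measured from the +x axis,
                pointing from each anchor toward the target (assumed given)
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), theta in zip(anchors, bearings):
        # Unit normal to the bearing line; points p on the line satisfy n . p = n . anchor.
        nx, ny = -math.sin(theta), math.cos(theta)
        rhs = nx * px + ny * py
        # Accumulate the 2x2 normal equations (A^T A) p = A^T b.
        a11 += nx * nx
        a12 += nx * ny
        a22 += ny * ny
        b1 += nx * rhs
        b2 += ny * rhs

    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearing lines are parallel or too few anchors")
    x = (a22 * b1 - a12 * b2) / det
    y = (a11 * b2 - a12 * b1) / det
    return x, y
```

This treats each bearing as an infinite line rather than a ray, so it does not reject solutions that lie behind an anchor, and with noisy angles the estimate degrades quickly when the anchors are nearly collinear with the target.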
