
[TASK] Sonar segmentation/detection support #33

@jorgenfj

Description of task

With the addition of the new sonar_info message, we can now extract 2D point coordinates directly in the sonar sensor plane from the sonar image. This makes it possible to spatially align detections across multiple sensor modalities.
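
As a minimal sketch of what "2D point coordinates in the sonar sensor plane" could look like, the following maps a sonar image pixel to a point in the sensor plane. The row-encodes-range / column-encodes-bearing layout and the field names (`range_min`, `range_max`, `hfov_rad`) are assumptions here; the actual `sonar_info` message defines the real convention.

```python
import math

def sonar_pixel_to_plane(u, v, width, height, range_min, range_max, hfov_rad):
    """Map a sonar image pixel (u = column, v = row) to a 2D point in the
    sonar sensor plane. Assumes rows encode range (top row = range_min)
    and columns encode bearing, symmetric about the forward axis; these
    conventions are assumptions, not taken from sonar_info itself."""
    r = range_min + (v / (height - 1)) * (range_max - range_min)
    bearing = (u / (width - 1) - 0.5) * hfov_rad
    # Assumed sonar-plane axes: +x forward, +y starboard.
    return r * math.cos(bearing), r * math.sin(bearing)
```

A center-column pixel on the top row then lands straight ahead at the minimum range, which gives a quick sanity check of the mapping.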

If the sonar, color camera, and stereo camera all share the same coordinate origin (or have known extrinsic transforms), we should be able to project segmentation masks from the camera domains into the sonar image frame. From the color and stereo cameras, we can obtain the 3D position of every mask; it is then just a matter of correctly projecting this information onto the sonar image. This enables the creation of aligned sonar–camera training pairs.

With this projection capability, the sonar images can be incorporated into the existing dataset annotation pipeline with minimal changes, essentially “dropping in” as an additional modality.
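
The forward direction of this projection can be sketched as follows: transform a 3D point into the sonar frame via the extrinsic, then invert the range/bearing image mapping. The 4x4 transform name `T_sonar_cam` and the image layout are assumptions for illustration; an identity transform stands in for the shared-origin case.

```python
import numpy as np

def project_point_to_sonar_pixel(p_cam, T_sonar_cam, width, height,
                                 range_min, range_max, hfov_rad):
    """Project a 3D point expressed in the camera frame into sonar image
    pixel coordinates. T_sonar_cam is the 4x4 camera-to-sonar extrinsic;
    the row = range / column = bearing layout is an assumption."""
    p = T_sonar_cam @ np.array([*p_cam, 1.0])
    x, y = p[0], p[1]                    # assumed sonar plane: +x forward, +y starboard
    r = np.hypot(x, y)
    bearing = np.arctan2(y, x)
    if not (range_min <= r <= range_max) or abs(bearing) > hfov_rad / 2:
        return None                      # point falls outside the sonar image
    v = (r - range_min) / (range_max - range_min) * (height - 1)
    u = (bearing / hfov_rad + 0.5) * (width - 1)
    return u, v
```

Returning `None` for out-of-image points keeps the brute-force verification step (below) honest: with the FOV check removed, the same function can project everything for visual overlay.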

Suggested Workflow

  1. Configure Sensor Suite in Stonefish

    • Set up the sonar, RGB camera, and stereo camera within Stonefish so that their extrinsics are aligned (shared origin or known transforms).
    • Ensure the cameras have a sufficiently wide horizontal FOV to fully encompass the sonar’s imaging region.
    • Verify that all sensors produce synchronized data with consistent timestamps for later annotation alignment.
  2. Initial Verification via Brute-Force Projection

    • For each 3D annotation (object pose / bounding geometry), project it into the sonar image plane without filtering based on sensor FOV.
    • Overlay these projections on the sonar image to visually confirm that spatial alignment is correct and that the sonar_info–based projection behaves as expected.
    • Use this step to catch extrinsic transform errors, scaling inconsistencies, or sonar coordinate interpretation issues early.
  3. Refine Projection to Respect Sensor FOVs

    • Once positional correctness is validated, implement FOV-aware filtering:

      • Only project annotations whose 3D position lies within the sonar detection volume.
      • If the camera FOV and sonar FOV differ, apply appropriate visibility checks to avoid projecting objects the sonar cannot see.
    • This prevents spurious or misleading annotations during dataset generation.

  4. Integrate With Annotation Pipeline

    • Feed the validated and FOV-filtered sonar projections into the existing annotation scripts.
    • Confirm that the sonar modality behaves consistently with camera-based mask generation.
    • Perform end-to-end testing with multi-modal (sonar + camera) annotation output.
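
The FOV-aware filtering in step 3 could look like the check below: keep an annotation only if its 3D position (already in the sonar frame) lies inside the sonar detection volume. Modelling that volume as a range interval plus horizontal and vertical FOV cones is an assumption about the sonar.

```python
import math

def visible_to_sonar(p_sonar, range_min, range_max, hfov_rad, vfov_rad):
    """FOV check for step 3: True if a 3D point in the sonar frame
    (+x forward assumed) lies within the sonar detection volume,
    modelled here as range bounds plus horizontal/vertical FOV limits."""
    x, y, z = p_sonar
    r = math.sqrt(x * x + y * y + z * z)
    if not (range_min <= r <= range_max):
        return False
    bearing = math.atan2(y, x)
    elevation = math.atan2(z, math.hypot(x, y))
    return abs(bearing) <= hfov_rad / 2 and abs(elevation) <= vfov_rad / 2
```

Running annotations through this predicate before projection is what prevents the spurious labels mentioned above when the camera sees more than the sonar does.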

Specifications

  • Ability to map camera-based segmentation masks into the sonar frame.
  • Sonar images compatible with existing annotation scripts.
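
For the first specification, lifting a camera mask into the sonar frame might start like this: back-project every mask pixel through the pinhole model using the stereo depth map, then apply the extrinsic. The names `mask`, `depth`, `K`, and `T_sonar_cam` are illustrative; the real pipeline supplies them from the stereo camera.

```python
import numpy as np

def mask_to_sonar_points(mask, depth, K, T_sonar_cam):
    """Lift a binary segmentation mask to 3D using a stereo depth map and
    camera intrinsics K, then express the points in the sonar frame.
    Assumes depth is metric and zero where invalid."""
    vs, us = np.nonzero(mask)
    z = depth[vs, us]
    valid = z > 0
    us, vs, z = us[valid], vs[valid], z[valid]
    # Pinhole back-projection: pixel + depth -> 3D point in the camera frame.
    x = (us - K[0, 2]) * z / K[0, 0]
    y = (vs - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=0)   # 4xN homogeneous
    pts_sonar = (T_sonar_cam @ pts_cam)[:3].T                # Nx3 in sonar frame
    return pts_sonar
```

Each returned point can then be projected into the sonar image individually, giving per-pixel mask transfer rather than a single per-object position.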

Future Goals

  • Support for generating synchronized multi-modal labeled datasets (sonar + RGB (+ stereo)).

Contacts

@jorgenfj
@kluge7

Code Quality

  • Every function in header files is documented (inputs/returns/exceptions)
  • The project has automated tests that cover MOST of the functions and branches within them (pytest/gtest)
  • The code is documented on the wiki (provide link)
