
LOSC: LiDAR Open-voc Segmentation Consolidator

Nermin Samet · Gilles Puy · Renaud Marlet

Valeo.ai, Paris, France  

📃 Paper

3DV 2026, Oral

✨ What's LOSC?

We push the boundaries of open-vocabulary 3D LiDAR segmentation without any manual annotations. Using only off-the-shelf Vision Foundation Models (VFMs) and unlabeled image+LiDAR data, we achieve state-of-the-art results on both semantic and panoptic segmentation across two challenging benchmarks: nuScenes and SemanticKITTI.

✅ No annotation required

✅ Pure LiDAR inference - no images needed at test time

✅ Simple pipeline, no prompt engineering or VLM tweaking

✅ Leverages image-based VLMs + smart label consolidation with spatio-temporal and augmentation consistency

✅ Outperforms all existing methods, even those using images at inference.

📊 +3.2 mIoU on nuScenes / +6.5 mIoU on SemanticKITTI compared to the prior LiDAR-only state of the art

Citation

If you find LOSC useful for your research, please cite our paper as follows.

N. Samet, G. Puy, R. Marlet, "LOSC: LiDAR Open-voc Segmentation Consolidator", In International Conference on 3D Vision (3DV), 2026.

BibTeX entry:

@inproceedings{losc,
  author = {Nermin Samet and Gilles Puy and Renaud Marlet},
  title = {LOSC: LiDAR Open-voc Segmentation Consolidator},
  booktitle = {International Conference on 3D Vision (3DV)},
  year = {2026},
}

Installation

Please follow the ScaLR installation instructions to set up the required environment, and install any additional packages via pip as needed during setup.

Run

1. Initial Zero-shot Semantic Segmentation of LiDAR Scans

The first stage of the pipeline generates raw, zero-shot annotations by leveraging OpenSeeD. Begin by cloning the OpenSeeD repository, then copy all Python files located in OpenSeed_files into the OpenSeeD demo directory. Once the files are in place, use the following commands to annotate each image.

# 1. Install local dependencies
cd <ROOT_DIR>/detectron2
python -m pip install -e .

# 2. Build OpenSeeD
cd <ROOT_DIR>/OpenSeeD/openseed/body/encoder/ops
python -m pip install -e .

# 3. Install required packages
pip install albumentations nuscenes-devkit pillow timm transformers supervision addict pycocotools yapf

# 4. Run nuScenes zero-shot annotation for images
cd <ROOT_DIR>/OpenSeeD/
aug=$1 # Options: None, horizontal-flip, hue-saturation, blur, color-jitter, auto-contrast, sharpen, chromatic-aberration, defocus, emboss, fancy-pca, iso-noise, clahe, gauss-noise
split=train

# For nuScenes
python demo/nuscenes_semseg.py evaluate \
 --conf_files ./configs/openseed/openseed_swint_lang.yaml \
 --output_dir <OUTPUT_DIR>/f_nuscenes_openseed_${aug}_${split} \
 --aug_type $aug \
 --split $split \
 --overrides WEIGHT ./model_state_dict_swint_51.2ap.pt

# For SemanticKITTI
python demo/semantic_kitti_semseg.py evaluate \
 --conf_files ./configs/openseed/openseed_swint_lang.yaml \
 --output_dir <OUTPUT_DIR>/f_semantic_kitti_openseed_${aug} \
 --input_dir <DATASET_DIR>/semantic_kitti/dataset/sequences/ \
 --aug_type $aug \
 --overrides WEIGHT ./model_state_dict_swint_51.2ap.pt
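To cover every augmentation listed above, the annotation command can be driven from a small launcher. This is only a sketch: the flags mirror the nuScenes example above, `nuscenes_cmd` is a hypothetical helper, and the actual job launch is commented out so the command construction stays visible.

```python
import subprocess  # used by the commented-out launch below

# Augmentation options, copied from the list above.
AUGS = ["None", "horizontal-flip", "hue-saturation", "blur", "color-jitter",
        "auto-contrast", "sharpen", "chromatic-aberration", "defocus",
        "emboss", "fancy-pca", "iso-noise", "clahe", "gauss-noise"]

def nuscenes_cmd(aug, split="train", out_dir="<OUTPUT_DIR>"):
    """Build the zero-shot annotation command for one augmentation."""
    return ["python", "demo/nuscenes_semseg.py", "evaluate",
            "--conf_files", "./configs/openseed/openseed_swint_lang.yaml",
            "--output_dir", f"{out_dir}/f_nuscenes_openseed_{aug}_{split}",
            "--aug_type", aug,
            "--split", split,
            "--overrides", "WEIGHT", "./model_state_dict_swint_51.2ap.pt"]

for aug in AUGS:
    cmd = nuscenes_cmd(aug)
    # subprocess.run(cmd, check=True)  # launch one annotation job per augmentation
```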

Following the initial 2D annotation, the next step of the pipeline lifts these results into 3D to generate point cloud annotations. This process must be performed for both augmented and non-augmented data. Below is an example of how to generate 3D labels for the non-augmented dataset:

# For nuScenes
python auto_annotation.py --vfm_path <OUTPUT_DIR>/f_nuscenes_openseed/ --split train

# For SemanticKITTI 
# Repeat for all sequences: 00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 10
python auto_annotation.py --vfm_path <OUTPUT_DIR>/f_semantic_kitti_openseed/ --split 00

By default, the resulting 3D labels are saved to <OUTPUT_DIR>/3D_labels/<dataset>/ where <dataset> is either nuscenes or semantic_kitti.
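Conceptually, the lifting step projects each LiDAR point into the camera image and reads the 2D label at the corresponding pixel. The sketch below illustrates this idea with hypothetical names (`lift_labels` and its arguments are not part of the repository); auto_annotation.py handles the real calibration and multi-camera logic for each dataset.

```python
import numpy as np

def lift_labels(points, label_map, K, T_cam_from_lidar, ignore_index=255):
    """Assign each 3D point the 2D label at its image projection.

    points: (N, 3) LiDAR points; label_map: (H, W) per-pixel class ids;
    K: (3, 3) camera intrinsics; T_cam_from_lidar: (4, 4) extrinsics.
    Points outside the image or behind the camera get ignore_index.
    """
    H, W = label_map.shape
    pts_h = np.c_[points, np.ones(len(points))]          # homogeneous (N, 4)
    cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]          # camera frame
    in_front = cam[:, 2] > 0
    uv = (K @ cam.T).T
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)     # perspective divide
    u, v = uv[:, 0].astype(int), uv[:, 1].astype(int)
    valid = in_front & (u >= 0) & (u < W) & (v >= 0) & (v < H)
    labels = np.full(len(points), ignore_index, dtype=np.int64)
    labels[valid] = label_map[v[valid], u[valid]]
    return labels
```

Points visible in several cameras would additionally need a rule for resolving conflicting labels, which this sketch omits.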

2. Augmentation- and Time-Based Consolidation

Augmentation-based consolidation: This step merges the per-augmentation 3D labels into a single consolidated set. Before running it, ensure you have already generated 3D annotations for each data augmentation using auto_annotation.py as described in the previous steps.

For nuScenes:

python augmentation_based_consolidation.py \
  --label_path <OUTPUT_DIR>/3D_labels/nuscenes/f_nuscenes_openseed \
  --save_path <OUTPUT_DIR>/3D_labels/nuscenes/f_nuscenes_openseed_w_augmentation_consolidation

For SemanticKITTI:

python augmentation_based_consolidation.py \
  --label_path <OUTPUT_DIR>/3D_labels/semantic_kitti/f_semantic_kitti_openseed \
  --save_path <OUTPUT_DIR>/3D_labels/semantic_kitti/f_semantic_kitti_openseed_w_augmentation_consolidation
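The idea behind this step can be sketched as a per-point vote across the label sets obtained under different augmentations. This is a hypothetical sketch: augmentation_based_consolidation.py defines the actual rule, e.g. how ties and low-agreement points map to the ignore class.

```python
import numpy as np

def consolidate(label_sets, ignore_index=255, min_agreement=0.5):
    """Majority-vote per-point labels across augmentation runs.

    label_sets: list of (N,) int arrays, one per augmentation.
    A point keeps the winning class only if it wins at least
    min_agreement of the valid (non-ignore) votes; otherwise ignore.
    """
    votes = np.stack(label_sets)                      # (A, N)
    _, N = votes.shape
    out = np.full(N, ignore_index, dtype=votes.dtype)
    for i in range(N):
        v = votes[:, i]
        v = v[v != ignore_index]                      # drop ignored votes
        if v.size == 0:
            continue
        classes, counts = np.unique(v, return_counts=True)
        best = counts.argmax()
        if counts[best] / v.size >= min_agreement:
            out[i] = classes[best]
    return out
```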

Pre-processing: To ensure compatibility and prevent data type errors, run the following script before applying time-based consolidation to both the standard (f_nuscenes_openseed) and augmentation-consolidated (f_nuscenes_openseed_w_augmentation_consolidation) labels. This step applies to both the nuScenes and SemanticKITTI datasets.

For nuScenes:

python convert.py \
    --old_path <OUTPUT_DIR>/3D_labels/nuscenes/f_nuscenes_openseed \
    --new_path <OUTPUT_DIR>/3D_labels/nuscenes/predictions/f_nuscenes_openseed

python convert.py \
    --old_path <OUTPUT_DIR>/3D_labels/nuscenes/f_nuscenes_openseed_w_augmentation_consolidation \
    --new_path <OUTPUT_DIR>/3D_labels/nuscenes/predictions/f_nuscenes_openseed_w_augmentation_consolidation

For SemanticKITTI:

python convert.py \
    --old_path <OUTPUT_DIR>/3D_labels/semantic_kitti/f_semantic_kitti_openseed \
    --new_path <OUTPUT_DIR>/3D_labels/semantic_kitti/predictions/f_semantic_kitti_openseed

python convert.py \
    --old_path <OUTPUT_DIR>/3D_labels/semantic_kitti/f_semantic_kitti_openseed_w_augmentation_consolidation \
    --new_path <OUTPUT_DIR>/3D_labels/semantic_kitti/predictions/f_semantic_kitti_openseed_w_augmentation_consolidation

Time-based consolidation: The next step aggregates the existing 3D labels over time to create a new label set. Apply time-based consolidation to the f_<dataset>_openseed and f_<dataset>_openseed_w_augmentation_consolidation labels. This process generates the f_<dataset>_openseed_from_accumulation and f_<dataset>_openseed_w_augmentation_consolidation_from_accumulation labels.
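The intuition can be sketched as follows: registered scans are aggregated in a common world frame, and each point is relabeled by the dominant label in its local neighborhood (here, a voxel). This is a simplified illustration with hypothetical names; the accumulation scripts below implement the actual tiling and voting logic.

```python
import numpy as np

def accumulate_and_relabel(scans, voxel=0.2, ignore_index=255):
    """Vote labels inside voxels of an aggregated point cloud.

    scans: list of (points_world (N, 3), labels (N,)) tuples, already
    registered in a common world frame. Returns one relabeled array per
    scan, where each point takes the majority label of its voxel.
    """
    pts = np.concatenate([p for p, _ in scans])
    lbl = np.concatenate([l for _, l in scans])
    keys = np.floor(pts / voxel).astype(np.int64)     # voxel index per point
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = np.asarray(inv).reshape(-1)
    vox_label = {}                                    # voxel id -> majority label
    for vid in np.unique(inv):
        v = lbl[inv == vid]
        v = v[v != ignore_index]
        if v.size:
            classes, counts = np.unique(v, return_counts=True)
            vox_label[vid] = classes[counts.argmax()]
    out, start = [], 0
    for p, _ in scans:
        ids = inv[start:start + len(p)]
        out.append(np.array([vox_label.get(i, ignore_index) for i in ids]))
        start += len(p)
    return out
```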

For nuScenes, to speed up processing, the nuScenes dataset is divided into 10 subsets (split_id from 0 to 9). Use the following commands to accumulate predictions and re-annotate the training data:

cd <ROOT_DIR>/LOSC/annotation_tools

# Arguments:
# $1: split_id [0-9]
# $2: title (input directory, e.g., f_nuscenes_openseed)
# $3: new_title (output directory, e.g., f_nuscenes_openseed_from_accumulation)

split_id=$1
title=$2
new_title=$3

main_dev_path=<OUTPUT_DIR>/3D_labels/nuscenes
annotation_path=${main_dev_path}/annotations/
tiles_save_path_for_accumulations=${main_dev_path}/accumulations/${title}
curr_labels=${main_dev_path}/predictions/${title}/

# 1. Accumulate from predictions
python accumulate_nuscenes.py \
  --split_id $split_id \
  --save_path $tiles_save_path_for_accumulations \
  --curr_labels $curr_labels 

# 2. Re-annotate training data
python annotate_from_accumulation_nuscenes.py \
  --split_id $split_id \
  --annotation_path $annotation_path \
  --title ${new_title} \
  --tile_path $tiles_save_path_for_accumulations

For SemanticKITTI, the process is executed once per training sequence (seq in 0–7, 9, 10; sequence 08 is the validation split).

cd <ROOT_DIR>/annotation_tools

# Arguments:
# $1: seq [0,1,2,3,4,5,6,7,9,10]
# $2: title (e.g., f_semantic_kitti_openseed)
# $3: new_title (e.g., f_semantic_kitti_openseed_from_accumulation)

seq=$1
title=$2
new_title=$3

main_dev_path=<OUTPUT_DIR>/3D_labels/semantic_kitti
annotation_path=${main_dev_path}/annotations/
tiles_save_path_for_accumulations=${main_dev_path}/accumulations/${title}
curr_labels=${main_dev_path}/predictions/${title}/

# 1. Accumulate using predictions
python accumulate_kitti.py \
  --seq $seq \
  --save_path $tiles_save_path_for_accumulations \
  --saved_predictions $curr_labels 

# 2. Re-annotate training data
python annotate_from_accumulation.py \
  --seq $seq \
  --title ${new_title} \
  --annotation_path $annotation_path \
  --tile_path $tiles_save_path_for_accumulations

Merging labels from time and augmentation-based consolidations: Finally, we merge the previously generated labels into a single consolidated set using the command below. For SemanticKITTI, simply update the path and title arguments accordingly.

python annotate_from_mix.py \
    -t f_nuscenes_openseed_from_accumulation \
    -a f_nuscenes_openseed_w_augmentation_consolidation_from_accumulation \
    --root_path <OUTPUT_DIR>/3D_labels/nuscenes/annotations/ \
    --title f_nuscenes_openseed_merged
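The merge can be pictured as a per-point reconciliation of the two consolidated label sets. The rule below is a hypothetical sketch (keep agreements, fall back to the single labeled set, ignore conflicts); annotate_from_mix.py defines the actual behavior.

```python
import numpy as np

def merge_label_sets(time_labels, aug_labels, ignore_index=255):
    """Merge two consolidated label sets for the same points.

    Keeps a label when the two sets agree, falls back to whichever set
    is labeled when the other is ignore, and marks conflicts as ignore.
    """
    out = np.full_like(time_labels, ignore_index)
    agree = time_labels == aug_labels
    out[agree] = time_labels[agree]
    only_t = (aug_labels == ignore_index) & (time_labels != ignore_index)
    only_a = (time_labels == ignore_index) & (aug_labels != ignore_index)
    out[only_t] = time_labels[only_t]
    out[only_a] = aug_labels[only_a]
    return out
```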

3. Iterative Fine-tuning with ScaLR

Using the consolidated labels, we fine-tune WaffleIron, initialized with ScaLR weights. Download the pretrained ScaLR weights here and place the checkpoint at logs/pretraining/WI_768-DINOv2_ViT_L_14-NS_KI_PD/model.pth. Then execute the following commands; once training is complete, the second command saves the generated annotations for the training set to disk. If you wish to proceed with iterative fine-tuning, apply time-based consolidation to these new predictions and repeat the process.

cd <ROOT_DIR>/LOSC/ScaLR

EXP=f_nuscenes_openseed_merged
PATH_DATASET=/datasets/nuscenes/
PATH_LABELS=<OUTPUT_DIR>/3D_labels/nuscenes/annotations/$EXP/
LOG_PATH=./logs/exp/${EXP}
WHERE_TO_SAVE_RESULT=<OUTPUT_DIR>/3D_labels/nuscenes/predictions/$EXP

python finetune.py \
--dataset nuscenes \
--config_downstream configs/downstream/nuscenes/WI_768_finetune_full.yaml \
--log_path $LOG_PATH \
--path_to_labels $PATH_LABELS \
--path_dataset $PATH_DATASET \
--pretrained_ckpt logs/pretraining/WI_768-DINOv2_ViT_L_14-NS_KI_PD/model.pth \
--config_pretrain configs/pretrain/WI_768_pretrain.yaml \
--fp16

python write_prediction_on_disk_ns.py \
--dataset nuscenes \
--config_downstream configs/downstream/nuscenes/WI_768_finetune_full.yaml \
--log_path $LOG_PATH \
--path_to_labels $PATH_LABELS \
--path_dataset $PATH_DATASET \
--pretrained_ckpt logs/pretraining/WI_768-DINOv2_ViT_L_14-NS_KI_PD/model.pth \
--config_pretrain configs/pretrain/WI_768_pretrain.yaml \
--fp16 \
--restart \
--path_results $WHERE_TO_SAVE_RESULT

Notes

  • Quality Evaluation: To evaluate the quality of the generated 3D labels at any stage of the pipeline, you can use quality_annotation.py.
  • Dataset Paths: Our code assumes datasets are located at /datasets/semantic_kitti and /datasets/nuscenes. Please verify these paths in your environment and update them if your data is stored elsewhere.
  • Path Consistency: While we have cleaned the code, some hardcoded paths may remain. If you encounter any path-related errors, please let us know.

About

[3DV 2026, Oral] Official PyTorch implementation of LOSC: LiDAR Open-voc Segmentation Consolidator
