
DeBias-CLIP

This is the official implementation of DeBias-CLIP.

CLIP Is Shortsighted: Paying Attention Beyond the First Sentence
Marc-Antoine Lavoie, Anas Mahmoud, Aldo Zaimi, Arsène Fansi Tchango, Steven L. Waslander

Overview

Both pretrained CLIP and models finetuned on long captions (e.g. Long-CLIP) show significant biases towards early, information-dense summary sentences in captions. This reduces long-text retrieval performance and makes these models very sensitive to sentence ordering in captions. We resolve this with three simple, caption-level augmentations: (i) we remove the summary sentence, (ii) we randomly sample the remaining sentences, and (iii) we add padding to the tokenized captions. DeBias-CLIP is a drop-in replacement for Long-CLIP.
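The three augmentations can be sketched roughly as follows. This is an illustrative sketch only, not the repository's implementation; the constants (`CONTEXT_LENGTH`, `PAD_TOKEN_ID`), the naive sentence splitting, and the exact sampling rule are assumptions.

```python
import random

# Illustrative constants (assumptions, not the repo's values).
CONTEXT_LENGTH = 16   # max token count fed to the text encoder
PAD_TOKEN_ID = 0      # id of the padding token

def augment_caption(caption: str, rng: random.Random) -> str:
    """(i) drop the leading summary sentence, (ii) randomly sample the rest."""
    sentences = [s.strip() for s in caption.split(".") if s.strip()]
    rest = sentences[1:] if len(sentences) > 1 else sentences
    k = rng.randint(1, len(rest))      # sample a random-sized subset
    sampled = rng.sample(rest, k)
    return ". ".join(sampled) + "."

def pad_tokens(token_ids: list[int]) -> list[int]:
    """(iii) pad the tokenized caption to a fixed context length."""
    padded = token_ids[:CONTEXT_LENGTH]
    return padded + [PAD_TOKEN_ID] * (CONTEXT_LENGTH - len(padded))
```

Together these prevent the text encoder from relying on a summary sentence always appearing first, or on caption length correlating with content.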

Results and Pretrained Weights

| Checkpoint | Size | Urban1k T2I | DCI T2I | DOCCI T2I | COCO T2I | Flickr T2I |
|---|---|---|---|---|---|---|
| debias_vitb_3e.pt | ViT-B-16 | 93.0 | 67.6 | 80.0 | 43.0 | 36.6 |
| debias_vitl_3e.pt | ViT-L-14 | 95.2 | 73.5 | 85.6 | 48.1 | 43.9 |
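The T2I columns report text-to-image retrieval recall@1. As a rough sketch (not the repository's evaluation code, which derives from COSMOS), recall@1 over a text-to-image similarity matrix can be computed as:

```python
def t2i_recall_at_1(sim: list[list[float]]) -> float:
    """Text-to-image recall@1 (in %) from a similarity matrix sim[i][j]
    between text i and image j, assuming text i matches image i."""
    hits = sum(
        1 for i, row in enumerate(sim)
        if max(range(len(row)), key=row.__getitem__) == i
    )
    return 100.0 * hits / len(sim)
```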

Install

Install the environment packages with `pip install -r requirements.txt`.

Usage

Data organization

Please refer to `data_parsers/data_folder.md` for dataset download instructions and the required directory organization for training and evaluation.

Training

We provide a training script, `train_script.sh`, with default parameters. You will need to adjust some arguments for your local environment.

Evaluation

We provide an example testing script in `test_script.sh`.

Acknowledgements

Our code is based on the OpenCLIP codebase and builds upon Long-CLIP. Evaluation implementation is derived from COSMOS.

Citation

If you find DeBias-CLIP useful for your work, please cite:

@article{lavoie2026clip,
  title={CLIP Is Shortsighted: Paying Attention Beyond the First Sentence},
  author={Lavoie, Marc-Antoine and Mahmoud, Anas and Zaimi, Aldo and Tchango, Arsene Fansi and Waslander, Steven L},
  journal={arXiv preprint arXiv:2602.22419},
  year={2026}
}
