**CLIP Is Shortsighted: Paying Attention Beyond the First Sentence**

Marc-Antoine Lavoie, Anas Mahmoud, Aldo Zaimi, Arsène Fansi Tchango, Steven L. Waslander

This is the official implementation of DeBias-CLIP.
| Checkpoint | Size | Urban1k T2I | DCI T2I | DOCCI T2I | COCO T2I | Flickr T2I |
|---|---|---|---|---|---|---|
| debias_vitb_3e.pt | ViT-B-16 | 93.0 | 67.6 | 80.0 | 43.0 | 36.6 |
| debias_vitl_3e.pt | ViT-L-14 | 95.2 | 73.5 | 85.6 | 48.1 | 43.9 |
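The T2I columns report text-to-image retrieval accuracy, i.e. the percentage of captions whose ground-truth image is ranked first by similarity. As an illustrative sketch only (not the repo's evaluation code, which is derived from COSMOS), recall@1 over a text-image similarity matrix can be computed like this:

```python
def t2i_recall_at_1(sim):
    """Text-to-image recall@1 (as a percentage).

    sim[i][j] is the similarity between text i and image j;
    the ground-truth pair for text i is assumed to be image i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Index of the highest-scoring image for this text.
        best = max(range(len(row)), key=lambda j: row[j])
        if best == i:
            hits += 1
    return 100.0 * hits / len(sim)

# Toy 3x3 similarity matrix: texts 0 and 1 retrieve their images,
# text 2's best match is image 1, so recall@1 is 2/3 ≈ 66.7.
sim = [
    [0.9, 0.2, 0.1],
    [0.3, 0.8, 0.4],
    [0.2, 0.6, 0.5],
]
print(round(t2i_recall_at_1(sim), 1))  # → 66.7
```

In practice the similarities would come from the cosine similarity of CLIP text and image embeddings over the whole test split.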
Install the environment packages with `pip install -r requirements.txt`.
Please refer to `data_parsers/data_folder.md` for dataset download instructions and the directory organization required for training and evaluation.
We provide a training script, `train_script.sh`, with default parameters; you will need to set some arguments for your local environment. An example evaluation script is provided in `test_script.sh`.
Our code is based on the OpenCLIP codebase and builds upon Long-CLIP. Evaluation implementation is derived from COSMOS.
If you find DeBias-CLIP useful for your work, please cite:
```bibtex
@article{lavoie2026clip,
  title={CLIP Is Shortsighted: Paying Attention Beyond the First Sentence},
  author={Lavoie, Marc-Antoine and Mahmoud, Anas and Zaimi, Aldo and Tchango, Arsene Fansi and Waslander, Steven L},
  journal={arXiv preprint arXiv:2602.22419},
  year={2026}
}
```