REPA-G - Official implementation of "Test-Time Conditioning with Representation-Aligned Visual Features"
Nicolas Sereyjol-Garros1 · Ellington Kirby1 · Victor Letzelter1,2 · Victor Besnier1 · Nermin Samet1
1 Valeo.ai, Paris, France 2 LTCI, Télécom Paris, Institut Polytechnique de Paris, France
While representation alignment with self-supervised models has been shown to improve diffusion model training, its potential for enhancing inference-time conditioning remains largely unexplored. We introduce Representation-Aligned Guidance (REPA-G), a framework that leverages these aligned representations, with their rich semantic properties, to enable test-time conditioning on features during generation. By optimizing a similarity objective (the potential) at inference, we steer the denoising process toward a conditioning representation extracted from a pre-trained feature extractor. Our method provides versatile control at multiple scales, ranging from fine-grained texture matching via single patches to broad semantic guidance using global image feature tokens. We further extend this to multi-concept composition, allowing distinct concepts to be combined faithfully. REPA-G operates entirely at inference time, offering a flexible and precise alternative to often-ambiguous text prompts or coarse class labels. We theoretically justify how this guidance enables sampling from the potential-induced tilted distribution. Quantitative results on ImageNet and COCO demonstrate that our approach achieves high-quality, diverse generations.
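The potential-guided sampling described above can be illustrated with a minimal, self-contained 1-D toy sketch (this is an illustration of the general idea, not the repository's implementation): a Langevin-style sampler whose update adds the gradient of a log-potential, here the negative squared distance between a stand-in feature extractor `phi` and a target feature. All names here (`phi`, `log_potential`, `guided_step`) and the scalar setup are assumptions made for illustration.

```python
import math
import random

def phi(x):
    # Stand-in "feature extractor": any smooth map works for this toy sketch.
    return math.tanh(x)

def log_potential(x, target, tau=0.1):
    # Similarity potential: negative squared feature distance over a temperature.
    return -(phi(x) - target) ** 2 / tau

def grad(f, x, eps=1e-5):
    # Central finite-difference gradient; sufficient for a 1-D toy.
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def guided_step(x, score, target, step=0.05, guidance=1.0):
    # Langevin-style update on the tilted distribution: base score plus
    # the gradient of the log-potential, plus Gaussian noise.
    g = grad(lambda v: log_potential(v, target), x)
    return x + step * (score(x) + guidance * g) + random.gauss(0.0, math.sqrt(2 * step))

score = lambda x: -x            # score of a standard normal "base model"
target = phi(0.5)               # condition on the feature of the point x = 0.5

random.seed(0)
x, samples = 3.0, []
for i in range(1000):
    x = guided_step(x, score, target)
    if i >= 500:                # keep samples after burn-in
        samples.append(x)

mean = sum(samples) / len(samples)
print(f"{mean:.2f}")            # samples concentrate near the conditioned region
```

Even starting far away (x = 3.0), the chain drifts toward points whose "feature" matches the target, which is the tilted distribution the guidance induces.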
If you find our work useful, please consider citing:

```bibtex
@misc{sereyjol2026repag,
  title={Test-Time Conditioning with Representation-Aligned Visual Features},
  author={Nicolas Sereyjol-Garros and Ellington Kirby and Victor Letzelter and Victor Besnier and Nermin Samet},
  year={2026},
  eprint={2602.03753},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2602.03753},
}
```

To install dependencies, please run:
```bash
pip install -r requirements.txt
```

To download the REPA-E, REPA, and non-aligned SiT checkpoints together, run the script:

```bash
bash scripts/download/download_sit.sh
```

Run the demo with:

```bash
streamlit run app/home.py
```

A toy example is also provided in toy_example/toy_example.ipynb.
To evaluate alignment with anchors using an additional image backbone, download the required image backbones and put them in ckpts:

- MoCo v3: this link, placed as ./ckpts/mocov3_vitb.pth
- I-JEPA: this link, placed as ./ckpts/ijepa_vith.pth
- MAE: this link, placed as ./ckpts/mae_vitl.pth
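After fetching the checkpoints (manually or via the script below), a quick sanity check that the files sit at the paths listed above can be sketched as follows; the path list comes from this README, so adjust it if you store checkpoints elsewhere:

```python
from pathlib import Path

# Expected backbone checkpoint locations, as listed in this README.
EXPECTED = [
    Path("ckpts/mocov3_vitb.pth"),
    Path("ckpts/ijepa_vith.pth"),
    Path("ckpts/mae_vitl.pth"),
]

# Report any checkpoint that is not present on disk.
missing = [p for p in EXPECTED if not p.is_file()]
if missing:
    print("missing checkpoints:", ", ".join(str(p) for p in missing))
else:
    print("all image backbones found")
```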
or run the script:

```bash
bash scripts/download/download_image_backbone.sh
```

Download and extract the training split of the ImageNet-1K dataset. Once it is ready, run the following command to preprocess the dataset:
```bash
python preprocessing.py --imagenet-path /PATH/TO/IMAGENET_TRAIN
```

Replace /PATH/TO/IMAGENET_TRAIN with the actual path to the extracted training images.
Download the reference file for ImageNet with:

```bash
bash scripts/download/download_ref_in.sh
```

Example scripts for generation and evaluation (average feature conditioning) are provided in scripts/eval. Change the --data-dir argument to the correct path to ImageNet and run, for example:

```bash
bash scripts/eval/eval_imagenet_repae.sh
```

This codebase is largely built upon:
We sincerely thank the authors for making their work publicly available.

