xmed-lab/AttTok

📚 AttTok: Marrying Attribute Tokens with Generative Pre-trained Vision-Language Models towards Medical Image Understanding (ICLR 2026)

Pipeline Diagram

🔨 Code Overview

📖 Datasets

  • Base Dataset

    In src/datasets/base_dataset.py, we implement a base dataset on top of torch.utils.data.Dataset for training Qwen2.5-VL.
    You can plug your own data augmentation pipeline into it to apply online image augmentations during training.

    Debug command

    python -m datasets.base_dataset
  • Attribute Dataset

    In src/datasets/attribute_dataset.py, VQA samples with predefined attributes are parsed to produce the "class_label". Demo JSON files and attribute lists are provided in src/datasets/demo/.

    Debug command

    python -m datasets.attribute_dataset
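As a rough illustration of how a VQA sample with predefined attributes might be turned into a "class_label", here is a minimal sketch. The sample schema (the `image`, `question`, and `answer` keys), the attribute names, and the helper `build_class_label` are all hypothetical; the real format is defined by the demo files in src/datasets/demo/ and by src/datasets/attribute_dataset.py, and may differ. The sketch assumes a multi-hot label built by case-insensitive substring matching.

```python
import json

# Hypothetical attribute list; the real lists are provided in src/datasets/demo/.
ATTRIBUTES = ["round shape", "irregular margin", "calcification"]


def build_class_label(sample, attributes):
    """Return a multi-hot list marking which attributes occur in the answer.

    Assumes (hypothetically) that each sample is a dict with an "answer" string.
    """
    answer = sample["answer"].lower()
    return [1 if attr.lower() in answer else 0 for attr in attributes]


# Hypothetical record, shaped like one entry of a VQA JSON file.
raw = (
    '{"image": "demo_0.png", '
    '"question": "Describe the lesion.", '
    '"answer": "A lesion with round shape and calcification."}'
)
sample = json.loads(raw)
sample["class_label"] = build_class_label(sample, ATTRIBUTES)
print(sample["class_label"])
```

Substring matching is only one possible rule; a dataset with structured attribute fields would read the labels directly instead of searching the answer text.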

Note: The five images in the demo folder are solely for fast debugging purposes, and their labels are randomly assigned. Please replace them with your actual training data for real experiments.
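The augmentation hook mentioned for the base dataset could look roughly like the sketch below. It is not the code in src/datasets/base_dataset.py: the class name `AugmentedDataset`, the `transform` argument, and the `random_hflip` helper are hypothetical, and images are represented as nested lists so the example runs without any image library. If PyTorch is not installed, the sketch falls back to a plain Python class.

```python
import random

try:
    from torch.utils.data import Dataset  # used when PyTorch is installed
except ImportError:  # fallback so the sketch still runs without PyTorch
    Dataset = object


class AugmentedDataset(Dataset):
    """Hypothetical map-style dataset with an optional per-sample transform."""

    def __init__(self, samples, transform=None):
        self.samples = samples
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # Shallow-copy so the stored sample is not modified in place.
        sample = dict(self.samples[idx])
        if self.transform is not None:
            # Applied on every access, so each epoch can see a different view.
            sample["image"] = self.transform(sample["image"])
        return sample


def random_hflip(image, p=0.5):
    """Flip a nested-list image left to right with probability p."""
    if random.random() < p:
        return [row[::-1] for row in image]
    return image


# Hypothetical usage: p=1.0 forces the flip so the effect is visible.
records = [{"image": [[1, 2], [3, 4]], "question": "Describe the lesion."}]
demo = AugmentedDataset(records, transform=lambda im: random_hflip(im, p=1.0))
print(demo[0]["image"])
```

Because the transform runs inside `__getitem__`, augmentation happens online at load time rather than being baked into the files on disk.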

🔧 Training

📝 Citation

@inproceedings{
wang2026atttok,
title={AttTok: Marrying Attribute Tokens with Generative Pre-trained Vision-Language Models towards Medical Image Understanding},
author={Hualiang Wang and Xinyue Xu and Lehan Wang and Bin Pu and Xiaomeng Li},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=UjSoF5CM09}
}

🙏 Code Acknowledgments

This project was inspired and supported by the following outstanding open-source projects, to which we express our sincere gratitude: Qwen-VL-Series-Finetune, transformers, and LlamaFactory.
