[Benchmark] Add support for MedQ-DEG-Bench #1482

Open
liujiyaoFDU wants to merge 3 commits into open-compass:main from liujiyaoFDU:medq-deg

@liujiyaoFDU
Contributor

Add MedQDEGBenchDataset for medical image degradation robustness evaluation. Data is loaded from HuggingFace (jiyaoliufd/MedQ-DEG-Bench).

Add MedQDEGBenchDataset for medical image degradation robustness evaluation.
Supports 4 splits: simulate_dev, simulate_test, good_dev, good_test.
Data loaded from HuggingFace (jiyaoliufd/MedQ-DEG-Bench).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
sub = data[data['source'] == src]
result[f'Source_{src}'] = sub['hit'].mean()

score_file = eval_file.replace('.xlsx', '_score.json')
Collaborator

The eval_file does not always have the .xlsx suffix.

Contributor Author

Thanks for pointing this out. I have updated it to use:
score_file = get_intermediate_file_path(eval_file, '_score', 'json')
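For anyone reading along, here is a minimal sketch of why the original `str.replace` call is fragile. The `score_path` function below is a hypothetical stand-in written only for illustration; it is not the `get_intermediate_file_path` helper and may not match its behavior. The file name is also made up.

```python
import os


def score_path(eval_file: str, suffix: str = "_score", ext: str = "json") -> str:
    """Hypothetical stand-in (not the repo helper): build a sibling path
    regardless of which extension the input file has."""
    root, _old_ext = os.path.splitext(eval_file)
    return f"{root}{suffix}.{ext}"


# Assumed example path with a non-.xlsx suffix.
tsv_file = "outputs/model_MedQ-DEG.tsv"

# Original approach: nothing is replaced when '.xlsx' is absent,
# so the "score file" path is identical to the eval file path.
naive = tsv_file.replace(".xlsx", "_score.json")

# Suffix-agnostic approach: strip whatever extension is there first.
robust = score_path(tsv_file)
```

With a `.tsv` input, `naive` is unchanged from `tsv_file`, so writing scores to it would target the eval file itself, while `robust` points at a separate `_score.json` file.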
