[Benchmark] Add support for MedQ-DEG-Bench by liujiyaoFDU · Pull Request #1482 · open-compass/VLMEvalKit

liujiyaoFDU · 2026-03-16T05:29:09Z

Add MedQDEGBenchDataset for medical image degradation robustness evaluation. Data loaded from HuggingFace (jiyaoliufd/MedQ-DEG-Bench).

Add MedQDEGBenchDataset for medical image degradation robustness evaluation. Supports 4 splits: simulate_dev, simulate_test, good_dev, good_test. Data loaded from HuggingFace (jiyaoliufd/MedQ-DEG-Bench).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

mzr1996 · 2026-03-17T09:57:24Z

vlmeval/dataset/medq_deg_bench.py

+                sub = data[data['source'] == src]
+                result[f'Source_{src}'] = sub['hit'].mean()
+
+        score_file = eval_file.replace('.xlsx', '_score.json')


The eval_file does not always have the .xlsx suffix.

Thanks for pointing this out. I have updated it to use:
score_file = get_intermediate_file_path(eval_file, '_score', 'json')
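For context, a minimal sketch of why the original `str.replace` call is fragile and what a suffix-agnostic alternative could look like. The `score_path` helper below is hypothetical and written only for illustration; it is not the implementation of `get_intermediate_file_path` in VLMEvalKit, whose behavior is assumed here only from the call shown above. The file names are also made up.

```python
import os.path as osp


def score_path(eval_file: str, suffix: str = '_score', ext: str = 'json') -> str:
    """Hypothetical helper: replace whatever extension eval_file has with `<suffix>.<ext>`."""
    stem, _ = osp.splitext(eval_file)  # drop the existing extension, whatever it is
    return f'{stem}{suffix}.{ext}'


# With a non-.xlsx input, the original pattern returns the path unchanged,
# so the score file would be written on top of the prediction file.
eval_file = 'outputs/model_MedQ-DEG-Bench.tsv'  # made-up example path
assert eval_file.replace('.xlsx', '_score.json') == eval_file

# Deriving the name from the stem works for any extension.
print(score_path(eval_file))
print(score_path('outputs/model_MedQ-DEG-Bench.xlsx'))
```

Building the name from the stem rather than matching a specific extension means the score file never collides with `eval_file`, regardless of the format the predictions were saved in.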


mzr1996 approved these changes Mar 17, 2026


mzr1996 requested changes Mar 17, 2026


liujiyaoFDU and others added 2 commits March 18, 2026 13:49

update score_file = get_intermediate_file_path(eval_file, '_score', '…

44411de

…json')

Merge branch 'open-compass:main' into medq-deg

b1aec6e

liujiyaoFDU wants to merge 3 commits into open-compass:main from liujiyaoFDU:medq-deg
