[Benchmark] Add support for MMOral-OPG-Open benchmark by isjinghao · Pull Request #1484 · open-compass/VLMEvalKit

isjinghao · 2026-03-16T09:14:04Z

Summary

This PR adds MMOral_OPG_OPEN, an open-ended VQA benchmark for panoramic radiograph analysis, to VLMEvalKit.
The benchmark is from the NeurIPS 2025 paper “Towards Better Dental AI: A Multimodal Benchmark and Instruction Dataset for Panoramic X-ray Analysis” (arXiv:2509.09254).

Dataset

Task: open-ended question answering on OPG images, requiring detailed clinical reasoning.
URL: https://huggingface.co/datasets/OralGPT/MMOral-OPG-Bench/resolve/main/MMOral-OPG-Bench-Open-Ended.tsv
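As a rough illustration of inspecting a benchmark file like this one, the sketch below parses tab-separated rows with Python's standard `csv` module. The column names (`index`, `question`, `answer`, `category`) and the helper `read_benchmark_tsv` are assumptions made for the example. The actual schema of the TSV is not described in this PR, and the sample row is a placeholder, not real benchmark data.

```python
import csv
import io

# Benchmark TSVs may embed large fields (for example encoded images),
# which can exceed the csv module's default field size limit.
csv.field_size_limit(2**31 - 1)


def read_benchmark_tsv(handle):
    """Hypothetical helper: return TSV rows as dicts keyed by column name."""
    return list(csv.DictReader(handle, delimiter="\t"))


# In-memory stand-in for the downloaded file; the columns are assumed.
sample = io.StringIO(
    "index\tquestion\tanswer\tcategory\n"
    "0\texample question\texample answer\texample category\n"
)
rows = read_benchmark_tsv(sample)
print(rows[0]["question"])
```

To read a downloaded copy instead, pass a file opened with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object.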

Evaluation

Models generate free-form textual answers for each question.
A separate LLM judge scores the predictions using MMOral_opg_auxeval and MMOral_opg_acc from vlmeval/dataset/utils/mmoral_opg.py.
We report per-sample results and aggregated scores (overall and per category) in:
- <model>_MMOral_OPG_OPEN_<judge>.xlsx (per-sample scores and logs)
- <model>_MMOral_OPG_OPEN_<judge>_score.csv
- <model>_MMOral_OPG_OPEN_<judge>_score_fine.csv
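The aggregation step can be sketched roughly as follows, assuming each judged sample carries a `category` label and a numeric `score`. Both field names, and the helpers `aggregate_scores` and `result_files`, are hypothetical. The real logic lives in `MMOral_opg_acc` and may weight or scale scores differently. Only the file name pattern is taken from the list above.

```python
from collections import defaultdict
from statistics import mean


def aggregate_scores(records):
    """Hypothetical helper: mean judge score per category plus an overall mean."""
    by_category = defaultdict(list)
    for rec in records:
        by_category[rec["category"]].append(float(rec["score"]))
    summary = {cat: mean(vals) for cat, vals in sorted(by_category.items())}
    # The overall score here is the plain mean over all samples (an assumption).
    summary["Overall"] = mean([float(rec["score"]) for rec in records])
    return summary


def result_files(model, judge, dataset="MMOral_OPG_OPEN"):
    """Build the three output file names following the pattern listed above."""
    stem = f"{model}_{dataset}_{judge}"
    return [f"{stem}.xlsx", f"{stem}_score.csv", f"{stem}_score_fine.csv"]


# Placeholder records, not real benchmark results.
records = [
    {"category": "A", "score": 1.0},
    {"category": "A", "score": 0.5},
    {"category": "B", "score": 0.0},
]
summary = aggregate_scores(records)
files = result_files("my_model", "my_judge")
```

Note that `statistics.mean` raises `StatisticsError` on an empty list, so a caller would need to handle the case where no samples were judged.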

Copilot

Pull request overview

Adds the MMOral_OPG_OPEN open-ended VQA benchmark (panoramic dental radiographs) to VLMEvalKit, including LLM-judge scoring utilities and dataset registration so it can be built/evaluated like existing benchmarks.

Changes:

Introduces MMOral_OPG_OPEN dataset class with image dumping, prompting, and judge-based evaluation.
Adds MMOral-OPG judge prompt + aux evaluation (MMOral_opg_auxeval) and score aggregation (MMOral_opg_acc).
Registers the dataset in vlmeval.dataset so it’s discoverable via build_dataset / supported dataset lists.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `vlmeval/dataset/utils/mmoral_opg.py` | Adds LLM-judge prompt construction + aux-eval and score aggregation for MMOral-OPG. |
| `vlmeval/dataset/mmoral_opg_open.py` | Implements the `MMOral_OPG_OPEN` dataset class, prompt building, and judge-based evaluation pipeline. |
| `vlmeval/dataset/__init__.py` | Exposes and registers `MMOral_OPG_OPEN` in the dataset registry/lists. |

mzr1996

Please fix lint.

isjinghao added 4 commits March 16, 2026 16:42

[Benchmark] Add support for MMOral-OPG-Closed

0c336b3

Prepare MMOral OPG utils and dataset wiring

0976081

[Benchmark] Add support for MMOral-OPG-OPEN benchmark

f8a2fcd

fix

4236605

Copilot AI review requested due to automatic review settings March 16, 2026 09:14

Copilot started reviewing on behalf of isjinghao March 16, 2026 09:14

Copilot AI reviewed Mar 16, 2026

vlmeval/dataset/mmoral_opg_open.py (resolved)

vlmeval/dataset/utils/mmoral_opg.py (outdated, resolved)

isjinghao and others added 3 commits March 16, 2026 17:18

Potential fix for pull request finding

08002b6

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Fix pre-commit/flake8 formatting

b961914

Merge branch 'main' into feature/mmoral-opg-open

9827b2a

mzr1996 requested changes Mar 25, 2026

isjinghao added 2 commits March 25, 2026 17:24

Fix flake8 issues in mmoral_opg dataset utils

ade68b6

Fix flake8/isort for MMOral-OPG open dataset

7ca330f

isjinghao requested a review from mzr1996 March 26, 2026 01:58

mzr1996 approved these changes Mar 27, 2026

mzr1996 merged commit 589fe36 into open-compass:main Mar 27, 2026
7 checks passed

mzr1996 merged 9 commits into open-compass:main from isjinghao:feature/mmoral-opg-open
