Incorrect ground-truth answers in BLINK Relative_Reflectance TSV #1486
Description
Hi, thanks for maintaining this benchmark.
I found that the ground-truth answers for the BLINK Relative_Reflectance task in the VLM Eval Kit appear to be incorrect.
Problem
The task is distributed in TSV format in the eval kit, but the answer field for Relative_Reflectance seems to be wrong.
On the Open VLM Leaderboard, model performance on BLINK Relative_Reflectance consistently sits very close to random chance (~0.33 for a three-option task), which is what you would expect if the labels were corrupted.
Open VLM Leaderboard: BLINK Relative_Reflectance
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard
In addition, I manually inspected multiple samples and found clear mismatches between the images and the provided ground-truth answers.
Verification
I checked the original dataset viewer here:
Original dataset:
https://huggingface.co/datasets/BLINK-Benchmark/BLINK/viewer/Relative_Reflectance
Using the correct data from there, I created a fixed TSV version:
Fixed TSV:
https://huggingface.co/buckets/Ryoo72/BLINK/resolve/BLINK.fixed.tsv?download=true
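For reference, here is a minimal sketch of how the answer columns of the two TSV files could be diffed. The column names `index` and `answer` are assumptions about the TSV schema, and the inline data is synthetic, not real BLINK rows:

```python
import csv
import io

def diff_answers(current_tsv: str, fixed_tsv: str, key: str = "index", col: str = "answer"):
    """Return {row_id: (current_answer, fixed_answer)} for rows whose
    ground-truth answer differs between two TSV dumps."""
    def load(text):
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        return {row[key]: row[col] for row in reader}
    cur, fix = load(current_tsv), load(fixed_tsv)
    # Only compare rows present in both files.
    return {k: (cur[k], fix[k]) for k in cur if k in fix and cur[k] != fix[k]}

# Tiny synthetic example (hypothetical rows, not actual BLINK data):
current = "index\tanswer\n1\tA\n2\tB\n3\tC\n"
fixed = "index\tanswer\n1\tA\n2\tC\n3\tC\n"
print(diff_answers(current, fixed))  # → {'2': ('B', 'C')}
```

In practice the two inputs would be the distributed TSV and the fixed one linked above, read from disk.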
Request
Could you please verify the Relative_Reflectance annotations / answer column in the current TSV distributed with the VLM Eval Kit?
If helpful, I’d be happy to provide more details or help compare the current file against the corrected version.
Thanks!