
Incorrect ground-truth answers in BLINK Relative_Reflectance TSV #1486

@Ryoo72

Description


Hi, thanks for maintaining this benchmark.

I found that the ground-truth answers for the BLINK Relative_Reflectance task distributed with VLMEvalKit appear to be incorrect.

Problem

The task is distributed as a TSV file in the eval kit, but the answer column for Relative_Reflectance appears to contain incorrect values.

On the Open VLM Leaderboard, model performance on BLINK Relative_Reflectance is consistently very close to random guessing (~0.33, i.e., chance level for a three-option task).

Open VLM Leaderboard: BLINK Relative_Reflectance
https://huggingface.co/spaces/opencompass/open_vlm_leaderboard

In addition, I manually inspected multiple samples and found clear mismatches between the images and the provided ground-truth answers.

Verification

I checked the original dataset viewer here:

Original dataset:
https://huggingface.co/datasets/BLINK-Benchmark/BLINK/viewer/Relative_Reflectance

Using the correct answers from there, I created a fixed TSV version:

Fixed TSV:
https://huggingface.co/buckets/Ryoo72/BLINK/resolve/BLINK.fixed.tsv?download=true

Request

Could you please verify the Relative_Reflectance annotations / answer column in the current TSV distributed with the VLM Eval Kit?

If helpful, I’d be happy to provide more details or help compare the current file against the corrected version.
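A comparison along these lines could be sketched as below. This is a minimal sketch, not part of VLMEvalKit: the column names `index` and `answer`, the `normalize_answer` helper, and the toy data are assumptions for illustration; real files would be loaded with `pd.read_csv(path, sep="\t")`.

```python
import pandas as pd


def normalize_answer(ans: str) -> str:
    # Strip parentheses and whitespace so "(A)" and "A" compare equal.
    return ans.strip().strip("()").upper()


def mismatch_report(current: pd.DataFrame, fixed: pd.DataFrame,
                    key: str = "index", col: str = "answer") -> pd.DataFrame:
    # Align the two TSVs on the shared index column and keep only the
    # rows whose (normalized) answers disagree.
    merged = current.merge(fixed, on=key, suffixes=("_current", "_fixed"))
    cur = merged[f"{col}_current"].map(normalize_answer)
    fix = merged[f"{col}_fixed"].map(normalize_answer)
    return merged[cur != fix]


# Toy data standing in for the current and fixed TSVs.
current = pd.DataFrame({"index": [0, 1, 2], "answer": ["A", "B", "C"]})
fixed = pd.DataFrame({"index": [0, 1, 2], "answer": ["(A)", "C", "B"]})

diff = mismatch_report(current, fixed)
print(len(diff))  # number of rows where the answers disagree → 2
```

Running this on the real current and fixed TSVs would give both a mismatch count and the exact rows to inspect.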

Thanks!
