Releases: embeddings-benchmark/mteb

2.10.12 (2026-03-14)

Documentation

  • docs: remove mieb and mmteb contribution docs (#4227)

We no longer maintain these, so they are fine to remove (823236c)

  • docs: fix docs paths (#4224) (0d07d33)

  • docs: fix naming on contributing docs (6787a17)

Fix

  • fix: metadata getting computed for existing MTEB model (#4231)

  • Fix behaviour while getting metadata of existing MTEB model

  • Added basic metadata in overwrite

  • Updated CrossEncoderWrapper with same changes (973a5a1)

Unknown

  • Model: Add new model revision of Querit/Querit (#4215) (f913ed8)

  • Fix zeroentropy/zembed-1 metadata (#4233)

Fix zeroentropy/zembed-1 metadata (revision, release_date, max_tokens)

The metadata added in #4202 had incorrect values for three fields:

  • revision: pointed to the wrong Hugging Face commit
  • release_date: was "2025-09-16", should be "2026-03-02"
  • max_tokens: was 40960, should be 32768 (3cd67fd)
  • Add Zeroentropy models (#4228)

  • Add Zeroentropy models

  • correct metadata

  • Correct loader_kwargs for rerankers (791a185)

  • model: nvidia/llama-nemotron-embed-vl-1b-v2 for ViDoRe (#4192)

  • Adds nvidia/llama-nemotron-embed-vl-1b-v2 model

  • Update mteb/models/model_implementations/nvidia_nemotron_vl_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Fixing tests and linting issues

  • Nemotron Embed VL 1B: Setting the number of tiles an image can be split

  • Fixing lint issue

  • Disabling image modality by default

  • Setting default modality of Nemotron Embed VL 1B as image + text (when available)

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (2a8c2d3)

  • dataset: Add IRPAPERS (#4225)

  • dataset: Add IRPAPERS

  • change dataset path

  • dataset transform

  • remove samples without text

  • add t2i category

  • delete former stats and remove column (abd5048)

  • Model: Add F2LLM-v2 (#4222)

  • Add f2llm-v2

  • lint codefuse models

  • Fix error in prompt (42c0d51)

2.10.11 (2026-03-11)

Fix

  • fix: Error in siglip output conversion (#4205)

  • fix: Error in siglip output conversion

  • add mean pool to siglip

  • format

  • Apply suggestions from code review

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • add missing dependencies

  • added fix for siglip dependencies

  • format

  • fix dependencies

  • added image normalization

This should happen here:
https://github.com/embeddings-benchmark/mteb/blob/ce7590dcc9c620450ca192a3ec101a62631e6b55/mteb/_create_dataloaders.py#L291-L292

Not sure why it is needed

  • relax protobuf dependency

  • lint

  • update pyproject.toml dependencies


Co-authored-by: Your Name <you@example.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (ec20d1e)
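For context on the image-normalization commit above: normalization for SigLIP-style models typically rescales pixel values to [0, 1] and then standardizes per channel. A generic sketch of that step (not the actual mteb dataloader code; the mean/std values are illustrative):

```python
def normalize_channels(pixels, mean, std):
    """Rescale 0-255 pixel values to [0, 1], then standardize per channel.

    pixels: list of per-channel lists of pixel values.
    mean/std: per-channel normalization constants (illustrative values here;
    the actual SigLIP preprocessor constants may differ).
    """
    return [
        [(p / 255.0 - m) / s for p in channel]
        for channel, m, s in zip(pixels, mean, std)
    ]


# Example with a single channel and mean=std=0.5, as used by some
# ViT-style preprocessors:
normalized = normalize_channels([[255, 0]], mean=[0.5], std=[0.5])
```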

2.10.10 (2026-03-11)

Fix

  • fix: Add ViDoRe(v3.1) (#4220)

  • fix: Add ViDoRe(v3.1)

  • Apply suggestion from @Samoed

  • add to init


Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (5530493)

2.10.9 (2026-03-10)

Documentation

  • docs: migrate to zensical (#4203)

  • migrate to zensical

  • added breadcrumbs

  • added navigation icons

  • minor docs fix

  • fix annotations

  • change to links

  • fixed overview for models and benchmarks

  • try to use zensical

Conflicts:

docs/overview/create_available_benchmarks.py

docs/overview/create_available_models.py

  • add copy paste button for models

  • add copy-paste button to tasks and benchmarks as well

  • remove plugins

  • get back mieb and mmteb

  • rename api back

  • add tasks to overview

  • reorder overview page

  • update lock file


Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> (a484cfd)

Fix

  • fix: Display main score in task results (#4214)

Now displays the main score in task results, as well as via the task_res.main_score property.

Also added "..." to indicate that there are more attributes than what is being shown.

res = mteb.evaluate(model, task)
res
res[0]
# currently displays:
# ModelResult(model_name=mteb/baseline-random-encoder, model_revision=1, task_results=[...](#1))
# TaskResult(task_name=LccSentimentClassification, scores=...)

# with PR:
# ModelResult(model_name=mteb/baseline-random-encoder, model_revision=1, task_results=[...](#1), ...)
# TaskResult(task_name=LccSentimentClassification, main_score=0.32, scores=...)
(7c831b0)
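The repr behaviour described in this entry can be sketched with a stand-in class (this is not the actual mteb TaskResult implementation):

```python
class TaskResult:
    """Stand-in illustrating the repr behaviour described above."""

    def __init__(self, task_name, main_score, scores):
        self.task_name = task_name
        self.main_score = main_score
        self.scores = scores

    def __repr__(self):
        # Surface the main score, and end with "..." to signal that more
        # attributes exist than are shown.
        return (
            f"TaskResult(task_name={self.task_name}, "
            f"main_score={self.main_score}, scores=..., ...)"
        )


r = TaskResult("LccSentimentClassification", 0.32, scores={})
```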

Unknown

  • model: Qwen3-VL-Embedding (#4198)

  • add qwen3-vl-embedding implementation

  • lint and test

  • lint

  • handle image+text mode

  • address review comments

  • address comments

  • fix resolve dependency

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> (0bb0917)

  • Added model zeroentropy/zembed-1 (#4202)

  • Added model zeroentropy/zembed-1

  - [x] I have filled out the ModelMeta object to the extent possible
  - [x] I have ensured that my model can be loaded using
    - [x] mteb.get_model(model_name, revision) and
    - [x] mteb.get_model_meta(model_name, revision)
  - [x] I have tested that the implementation works on a representative set of tasks.
  - [x] The model is public, i.e., it is available either as an API or the weights are publicly available to download

  • Apply suggestion from @Samoed

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • lint

Co-authored-by: Ryan Wang <ryanwang@DN0a249162.SUNet>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (69421f9)

  • Add Reason-ModernColBERT (#4218)

  • Add Reason-ModernColBERT

  • Fix variable name + add size (b877424)

2.10.8 (2026-03-07)

Fix

  • fix: remove n_jobs=-1 from logistic regression (#4211)

It is currently ignored and gives a future warning.

closes #4210 (52ec861)

Unknown

  • model: add the ColBERT-Zero series (#4206)

  • ColBERT-Zero series

  • Add CodeSearchNet to training data

  • Exact n_parameters + memory usage + embed_dim

  • Factorize nomic embed training datasets definition

  • Factorize citations (5b8131f)

2.10.7 (2026-03-06)

Fix

  • fix: Code leaderboard is failing (#4207) (dec66d6)

Unknown

  • model: Add nomic-ai/nomic-embed-multimodal-7b dense embedding model (#4186)

  • feat: Add nomic-ai/nomic-embed-multimodal-7b dense embedding model

Add BiQwen2_5Wrapper and ModelMeta for nomic-embed-multimodal-7b,
a dense (single-vector) multimodal embedding model for visual
document retrieval using cosine similarity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  • fix: Use correct set type for TRAINING_DATA

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  • fix: Add nomic-embed-multimodal-7b to _MISSING_N_EMBEDDING_MODELS

PEFT adapter repo has no config.json or model.safetensors,
so _from_hub cannot extract n_embedding_parameters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

  • fix: Use processor.score instead of custom similarity in BiQwen2_5Wrapper

Remove custom similarity method from BiQwen2_5Wrapper to use the
processor's built-in scoring functionality, following the established
pattern used by other ColPali models.

Addresses review feedback in PR #4186.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

  • fix: Use parent class methods instead of custom overrides in BiQwen2_5Wrapper

Remove custom get_image_embeddings and get_text_embeddings methods
to use the inherited ColPaliEngineWrapper implementations. For dense
embedding models with fixed-size vectors, the parent class methods
(extend + pad_sequence) produce equivalent results to the custom
implementation (append + torch.cat).

Also remove unused tqdm import.

Addresses second review feedback in PR #4186.

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

  • feat: Add ModelMeta for nomic-ai/nomic-embed-multimodal-3b
  • Add nomic_embed_multimodal_3b ModelMeta with 3B parameters
  • Update BiQwen2_5Wrapper default to use 3B model for better accessibility
  • Scale memory usage proportionally (6200 MB vs 14400 MB for 7B)
  • Maintain same architectural specs (embed_dim=128, max_tokens=128000)

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

  • feat: Add nomic-ai/nomic-embed-multimodal-3b to test exceptions
  • Add nomic-embed-multimodal-3b to _MISSING_N_EMBEDDING_MODELS list
  • Update test configuration to handle PEFT adapter repo structure
  • Matches pattern of existing 7B model exception

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>

  • fix nomic

  • add 3b revision and lint

  • add fixes

  • rem

  • cleanup

  • Update mteb/models/model_implementations/nomic_multimodal.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • add embedding parameters

  • add base revision

  • remove exceptions from test

  • added training data

  • lint

  • Update mteb/models/model_implementations/nomic_multimodal.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>


Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Your Name <you@example.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (b398ea7)
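One claim in the commit messages above is that, for dense models with fixed-size vectors, extend + pad_sequence produces the same result as append + torch.cat. The intuition can be sketched without torch: with equal-length vectors, padding is a no-op, so both strategies yield the same flat stack (plain-Python analogue, not the actual wrapper code):

```python
def extend_then_pad(batches):
    # Parent-class style: flatten batches with extend; with fixed-size
    # vectors, the subsequent pad step changes nothing, so it is omitted.
    out = []
    for batch in batches:
        out.extend(batch)
    return out


def append_then_cat(batches):
    # Custom-implementation style: collect whole batches, then concatenate
    # (the plain-list analogue of torch.cat along dim 0).
    collected = []
    for batch in batches:
        collected.append(batch)
    return [vec for batch in collected for vec in batch]


batches = [[[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]]
```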

  • dataset: Update vidore tasks to include the OCR'd text (#4191)

  • dataset: Add OCR adaption of vidore tasks

  • Add beta tag to the new tasks

  • add nuclear and telecom

  • Apply suggestions from code review

  • Update mteb/tasks/retrieval/multilingual/vidore3_bench_retrieval.py

  • updated description and added version

  • add superseded from

  • fix imports

  • fix init

  • fix private test

  • add some task statistics

  • add nuclear


Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (9a6b98f)

  • final reupload tasks from mmteb (#4200) (7037bfe)

2.10.6 (2026-03-05)

Fix

  • fix: fleurs loading (#4197)

  • fix fleurs

  • fix dataset transform signature (009209a)

Unknown

  • Start video (#4148)

  • start video integration

  • start video integration

  • upd task structure

  • upd batched input

  • upd video input type

  • combine video and audio to dict

  • use only one video at a time

  • remove __main__

  • remove PostProcessingCollator (34d060c)
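Based on the commit messages above (video and audio combined into a dict, one video per sample), the batched input shape might look roughly like this (hypothetical sketch; not the actual mteb interface):

```python
def make_video_batches(videos, audios):
    # One video per sample; the matching audio track rides along in the
    # same dict, mirroring the "combine video and audio to dict" commit.
    return [{"video": v, "audio": a} for v, a in zip(videos, audios)]


batch = make_video_batches(
    ["clip1.mp4", "clip2.mp4"],
    ["clip1.wav", "clip2.wav"],
)
```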

  • model: Add Perplexity pplx-embed-v1 models (0.6B and 4B) (#4189)

  • model: Add Perplexity pplx-embed-v1 models (0.6B and 4B)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

  • Update mteb/models/model_implementations/perplexity_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Update mteb/models/model_implementations/perplexity_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

  • Add training datasets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com> (1c90bfd)

2.10.5 (2026-03-03)

Fix

  • fix: vn models (a80fae9)

Unknown

  • model: LateOn-Code models definition (#4175)

  • First draft of LateOn code models definition

  • Fix reference for LateOn-Code

  • Fix reference LateOn code edge pretrain

  • Add memory_usage_mb (and embed_dim)

  • fix lint

  • Add training datasets


Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com> (4345e63)

  • model: Vietnamese model for VN-MTEB (#4187)

  • [ADD] Vietnamese model for VN-MTEB

  • [ADD] Vietnamese model for VN-MTEB (rename variable) (3ab29f4)

2.10.4 (2026-03-02)

Fix

  • fix: remove select column from dataloader (#4185)

  • remove select column

  • fix sts (7477902)

2.10.3 (2026-02-28)

Fix

  • fix: Add repr to benchmarks to avoid excessive prints (#4180)

  • fix: don't specify keyerror when it is a keyerror

  • fix: Add repr for benchmarks

It now looks like:

Benchmark(name='BEIR', desciption='BEIR is a heterogeneous benchmark containing diver..., tasks=[...] (#15), ...)
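A stand-in sketch of such a truncating repr (not the actual mteb Benchmark class; the 50-character cutoff is an assumption made for illustration):

```python
class Benchmark:
    """Stand-in showing a compact repr with a truncated description."""

    def __init__(self, name, description, tasks):
        self.name = name
        self.description = description
        self.tasks = tasks

    def __repr__(self):
        # Truncate long descriptions and report the task count instead of
        # printing every task, ending with "..." for hidden attributes.
        desc = self.description
        if len(desc) > 50:
            desc = desc[:50] + "..."
        return (
            f"Benchmark(name={self.name!r}, description={desc!r}, "
            f"tasks=[...] (#{len(self.tasks)}), ...)"
        )


b = Benchmark(
    "BEIR",
    "BEIR is a heterogeneous benchmark containing diverse IR tasks",
    ["task"] * 15,
)
```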