DailyGlimpse

MTEB: A Comprehensive Benchmark for Text Embedding Models

AI
April 26, 2026 · 5:17 PM
The Massive Text Embedding Benchmark (MTEB) is a new standard for evaluating text embedding models across diverse tasks. With 56 datasets covering 8 tasks and results from over 2000 model evaluations, MTEB provides a holistic view of model performance.

Text embeddings convert text into numerical vectors that capture semantic meaning, enabling applications like search, clustering, and classification. The quality of these embeddings depends heavily on the model used. MTEB aims to simplify the selection process by offering a leaderboard that ranks models by performance on tasks such as classification, clustering, and semantic similarity.
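To make the idea concrete, here is a minimal pure-Python sketch of how embedding vectors support semantic comparison via cosine similarity. The vectors below are made-up toy values, not outputs of any real model:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (real models output hundreds of dimensions).
query = [0.1, 0.8, 0.3, 0.0]
doc_relevant = [0.2, 0.7, 0.4, 0.1]
doc_unrelated = [0.9, 0.0, 0.1, 0.7]

# The semantically closer document scores higher.
print(cosine_similarity(query, doc_relevant) > cosine_similarity(query, doc_unrelated))  # True
```

Search, clustering, and classification all reduce to comparisons like this one, performed over the model's embedding space.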

Why Text Embeddings Matter

Text embeddings are crucial for many NLP applications. For instance, Google uses embeddings to power its search engine, and they are also used for clustering large text corpora or as input for classification models.

MTEB Features

  • Massive: 56 datasets across 8 tasks with over 2000 results.
  • Multilingual: Supports up to 112 languages for tasks like bitext mining and classification.
  • Extensible: Open to new tasks, datasets, and contributions via GitHub.

Model Selection Guide

Models are grouped into three categories based on speed and performance:

  • Maximum Speed: Models like GloVe offer fast inference but lower accuracy.
  • Balanced Speed and Performance: all-mpnet-base-v2 and all-MiniLM-L6-v2 provide a good trade-off.
  • Maximum Performance: Large models like ST5-XXL and GTR-XXL achieve top scores but require more resources.
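The tiers above can be encoded as a simple lookup table for choosing a sensible default. The model names come from this article; the tier structure and the helper function are just an illustrative convenience, not part of the MTEB library:

```python
# Illustrative grouping of the tiers described above.
MODEL_TIERS = {
    "speed": ["average_word_embeddings_glove.6B.300d"],
    "balanced": ["all-mpnet-base-v2", "all-MiniLM-L6-v2"],
    "performance": ["ST5-XXL", "GTR-XXL"],
}

def pick_model(priority: str) -> str:
    """Return the first listed model for the requested priority tier."""
    if priority not in MODEL_TIERS:
        raise ValueError(f"unknown priority: {priority!r}")
    return MODEL_TIERS[priority][0]

print(pick_model("balanced"))  # all-mpnet-base-v2
```

In practice you would also check the leaderboard scores for your specific task type before committing to a tier.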

How to Benchmark Your Model

Using the MTEB library, you can evaluate any embedding model and submit results to the leaderboard. Here's a quick example:

  1. Install MTEB: pip install mteb
  2. Run an evaluation on a dataset:
from mteb import MTEB
from sentence_transformers import SentenceTransformer

# Load the embedding model you want to evaluate.
model_name = "average_word_embeddings_komninos"
model = SentenceTransformer(model_name)

# Select a benchmark task and run the evaluation; results are
# written as JSON files under the output folder.
evaluation = MTEB(tasks=["Banking77Classification"])
results = evaluation.run(model, output_folder=f"results/{model_name}")
  3. Generate metadata for submission using the script mteb_meta.py.
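Step 2 leaves one JSON result file per task in the output folder. The exact schema varies across MTEB versions, so the snippet below parses a simplified payload of the general shape (task name plus a main score under the test split) purely as an illustration; the field names are assumptions, not a guaranteed format:

```python
import json

# A simplified example of what a per-task result file can look like;
# the real schema depends on the MTEB version and task type.
sample = json.loads("""
{
  "mteb_dataset_name": "Banking77Classification",
  "test": {"accuracy": 0.7123, "main_score": 0.7123}
}
""")

task = sample["mteb_dataset_name"]
score = sample["test"]["main_score"]
print(f"{task}: main_score = {score:.4f}")
```

The main score per task is what the leaderboard aggregates when ranking models.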

Visit the leaderboard to see top models and contribute your own.