AI & ML interests

Massive Text Embeddings Benchmark

Organization Card

MTEB is a Python framework for evaluating embeddings and retrieval systems for both text and images. MTEB covers more than 1000 languages and diverse tasks, from classics like classification and clustering to use-case specialized tasks such as legal, code, or healthcare retrieval.

You can get started using mteb.
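
As a rough illustration of what getting started can look like, here is a minimal sketch, assuming a recent mteb 2.x release installed with `pip install mteb`. The model name, the task name, and the helper `run_quickstart` are illustrative choices, not something this card prescribes, and the exact API may differ between versions.

```python
"""Quick-start sketch for mteb (assumes `pip install mteb`)."""

# Illustrative defaults chosen for this sketch, not prescribed by the card.
DEFAULT_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
DEFAULT_TASKS = ["Banking77Classification"]


def run_quickstart(model_name=DEFAULT_MODEL, task_names=DEFAULT_TASKS):
    """Load a model, select tasks, and evaluate; returns mteb's results object."""
    # Imported lazily so this file can be imported without mteb installed.
    import mteb

    # Load the embedding model by its Hugging Face name.
    model = mteb.get_model(model_name)

    # Select the tasks to evaluate on.
    tasks = mteb.get_tasks(tasks=list(task_names))

    # Assumption: an mteb 2.x release. Older 1.x releases used
    # mteb.MTEB(tasks=tasks).run(model) instead.
    return mteb.evaluate(model, tasks=tasks)
```

Calling `run_quickstart()` downloads the model and the task data on first use, so it needs network access and some disk space.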

Overview
📈 Leaderboard: The interactive leaderboard of the benchmark
Get Started
🏃 Get Started: Overview of how to use mteb
🤖 Defining Models: How to use existing models and define custom ones
📋 Selecting tasks: How to select tasks, benchmarks, splits, etc.
🏭 Running Evaluation: How to run evaluations, including cache management and speeding up evaluations
📊 Loading Results: How to load and work with existing model results
Overview
📋 Tasks: Overview of available tasks
📐 Benchmarks: Overview of available benchmarks
🤖 Models: Overview of available models
Contributing
🤖 Adding a model: How to submit a model to MTEB and to the leaderboard
👩‍💻 Adding a dataset: How to add a new task/dataset to MTEB
👩‍💻 Adding a benchmark: How to add a new benchmark to MTEB and to the leaderboard
🤝 Contributing: How to contribute to MTEB and set it up for development