add opensearch sparse encoding model results #239

Samoed merged 3 commits into embeddings-benchmark:main from
Conversation
Model Results Comparison

Reference models: google/gemini-embedding-001, intfloat/multilingual-e5-large

Results for opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill | Max result |
|---|---|---|---|---|
| ArguAna | 0.86 | 0.54 | 0.50 | 0.90 |
| CQADupstackAndroidRetrieval | nan | 0.49 | 0.45 | 0.74 |
| CQADupstackEnglishRetrieval | nan | 0.46 | 0.41 | 0.70 |
| CQADupstackGamingRetrieval | 0.71 | 0.59 | 0.53 | 0.79 |
| CQADupstackGisRetrieval | nan | 0.37 | 0.35 | 0.63 |
| CQADupstackMathematicaRetrieval | nan | 0.28 | 0.25 | 0.69 |
| CQADupstackPhysicsRetrieval | nan | 0.44 | 0.40 | 0.74 |
| CQADupstackProgrammersRetrieval | nan | 0.42 | 0.36 | 0.66 |
| CQADupstackRetrieval | nan | 0.40 | 0.36 | 0.68 |
| CQADupstackStatsRetrieval | nan | 0.32 | 0.33 | 0.62 |
| CQADupstackTexRetrieval | nan | 0.28 | 0.28 | 0.63 |
| CQADupstackUnixRetrieval | 0.54 | 0.40 | 0.36 | 0.72 |
| CQADupstackWebmastersRetrieval | nan | 0.40 | 0.34 | 0.68 |
| CQADupstackWordpressRetrieval | nan | 0.32 | 0.31 | 0.59 |
| ClimateFEVER | nan | 0.26 | 0.22 | 0.57 |
| DBPedia | nan | 0.41 | 0.42 | 0.53 |
| FEVER | nan | 0.83 | 0.82 | 0.96 |
| FiQA2018 | 0.62 | 0.44 | 0.36 | 0.80 |
| HotpotQA | nan | 0.71 | 0.67 | 0.88 |
| MSMARCO | nan | 0.44 | 0.41 | 0.48 |
| NFCorpus | nan | 0.34 | 0.34 | 0.56 |
| NQ | nan | 0.64 | 0.53 | 0.82 |
| QuoraRetrieval | nan | 0.89 | 0.84 | 0.92 |
| SCIDOCS | 0.25 | 0.17 | 0.17 | 0.35 |
| SciFact | nan | 0.70 | 0.71 | 0.87 |
| TRECCOVID | 0.86 | 0.71 | 0.69 | 0.95 |
| Touche2020 | nan | 0.23 | 0.29 | 0.39 |
| Average | 0.64 | 0.46 | 0.43 | 0.70 |
Results for opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | opensearch-project/opensearch-neural-sparse-encoding-doc-v2-mini | Max result |
|---|---|---|---|---|
| ArguAna | 0.86 | 0.54 | 0.48 | 0.90 |
| CQADupstackAndroidRetrieval | nan | 0.49 | 0.43 | 0.74 |
| CQADupstackEnglishRetrieval | nan | 0.46 | 0.39 | 0.70 |
| CQADupstackGamingRetrieval | 0.71 | 0.59 | 0.52 | 0.79 |
| CQADupstackGisRetrieval | nan | 0.37 | 0.35 | 0.63 |
| CQADupstackMathematicaRetrieval | nan | 0.28 | 0.25 | 0.69 |
| CQADupstackPhysicsRetrieval | nan | 0.44 | 0.39 | 0.74 |
| CQADupstackProgrammersRetrieval | nan | 0.42 | 0.35 | 0.66 |
| CQADupstackRetrieval | nan | 0.40 | 0.35 | 0.68 |
| CQADupstackStatsRetrieval | nan | 0.32 | 0.32 | 0.62 |
| CQADupstackTexRetrieval | nan | 0.28 | 0.27 | 0.63 |
| CQADupstackUnixRetrieval | 0.54 | 0.40 | 0.34 | 0.72 |
| CQADupstackWebmastersRetrieval | nan | 0.40 | 0.34 | 0.68 |
| CQADupstackWordpressRetrieval | nan | 0.32 | 0.30 | 0.59 |
| ClimateFEVER | nan | 0.26 | 0.22 | 0.57 |
| DBPedia | nan | 0.41 | 0.41 | 0.53 |
| FEVER | nan | 0.83 | 0.81 | 0.96 |
| FiQA2018 | 0.62 | 0.44 | 0.34 | 0.80 |
| HotpotQA | nan | 0.71 | 0.67 | 0.88 |
| MSMARCO | nan | 0.44 | 0.40 | 0.48 |
| NFCorpus | nan | 0.34 | 0.34 | 0.56 |
| NQ | nan | 0.64 | 0.51 | 0.82 |
| QuoraRetrieval | nan | 0.89 | 0.83 | 0.92 |
| SCIDOCS | 0.25 | 0.17 | 0.16 | 0.35 |
| SciFact | nan | 0.70 | 0.70 | 0.87 |
| TRECCOVID | 0.86 | 0.71 | 0.71 | 0.95 |
| Touche2020 | nan | 0.23 | 0.29 | 0.39 |
| Average | 0.64 | 0.46 | 0.42 | 0.70 |
Results for opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | opensearch-project/opensearch-neural-sparse-encoding-doc-v3-distill | Max result |
|---|---|---|---|---|
| ArguAna | 0.86 | 0.54 | 0.52 | 0.90 |
| CQADupstackAndroidRetrieval | nan | 0.49 | 0.45 | 0.74 |
| CQADupstackEnglishRetrieval | nan | 0.46 | 0.41 | 0.70 |
| CQADupstackGamingRetrieval | 0.71 | 0.59 | 0.54 | 0.79 |
| CQADupstackGisRetrieval | nan | 0.37 | 0.36 | 0.63 |
| CQADupstackMathematicaRetrieval | nan | 0.28 | 0.27 | 0.69 |
| CQADupstackPhysicsRetrieval | nan | 0.44 | 0.40 | 0.74 |
| CQADupstackProgrammersRetrieval | nan | 0.42 | 0.37 | 0.66 |
| CQADupstackRetrieval | nan | 0.40 | 0.37 | 0.68 |
| CQADupstackStatsRetrieval | nan | 0.32 | 0.33 | 0.62 |
| CQADupstackTexRetrieval | nan | 0.28 | 0.28 | 0.63 |
| CQADupstackUnixRetrieval | 0.54 | 0.40 | 0.36 | 0.72 |
| CQADupstackWebmastersRetrieval | nan | 0.40 | 0.36 | 0.68 |
| CQADupstackWordpressRetrieval | nan | 0.32 | 0.30 | 0.59 |
| ClimateFEVER | nan | 0.26 | 0.24 | 0.57 |
| DBPedia | nan | 0.41 | 0.42 | 0.53 |
| FEVER | nan | 0.83 | 0.84 | 0.96 |
| FiQA2018 | 0.62 | 0.44 | 0.36 | 0.80 |
| HotpotQA | nan | 0.71 | 0.69 | 0.88 |
| MSMARCO | nan | 0.44 | 0.42 | 0.48 |
| NFCorpus | nan | 0.34 | 0.34 | 0.56 |
| NQ | nan | 0.64 | 0.54 | 0.82 |
| QuoraRetrieval | nan | 0.89 | 0.86 | 0.92 |
| SCIDOCS | 0.25 | 0.17 | 0.16 | 0.35 |
| SciFact | nan | 0.70 | 0.71 | 0.87 |
| TRECCOVID | 0.86 | 0.71 | 0.72 | 0.95 |
| Touche2020 | nan | 0.23 | 0.29 | 0.39 |
| Average | 0.64 | 0.46 | 0.44 | 0.70 |
Results for opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte
| task_name | google/gemini-embedding-001 | intfloat/multilingual-e5-large | opensearch-project/opensearch-neural-sparse-encoding-doc-v3-gte | Max result |
|---|---|---|---|---|
| ArguAna | 0.86 | 0.54 | 0.52 | 0.90 |
| CQADupstackAndroidRetrieval | nan | 0.49 | 0.46 | 0.74 |
| CQADupstackEnglishRetrieval | nan | 0.46 | 0.44 | 0.70 |
| CQADupstackGamingRetrieval | 0.71 | 0.59 | 0.55 | 0.79 |
| CQADupstackGisRetrieval | nan | 0.37 | 0.36 | 0.63 |
| CQADupstackMathematicaRetrieval | nan | 0.28 | 0.26 | 0.69 |
| CQADupstackPhysicsRetrieval | nan | 0.44 | 0.40 | 0.74 |
| CQADupstackProgrammersRetrieval | nan | 0.42 | 0.38 | 0.66 |
| CQADupstackRetrieval | nan | 0.40 | 0.38 | 0.68 |
| CQADupstackStatsRetrieval | nan | 0.32 | 0.33 | 0.62 |
| CQADupstackTexRetrieval | nan | 0.28 | 0.28 | 0.63 |
| CQADupstackUnixRetrieval | 0.54 | 0.40 | 0.36 | 0.72 |
| CQADupstackWebmastersRetrieval | nan | 0.40 | 0.39 | 0.68 |
| CQADupstackWordpressRetrieval | nan | 0.32 | 0.32 | 0.59 |
| ClimateFEVER | nan | 0.26 | 0.31 | 0.57 |
| DBPedia | nan | 0.41 | 0.45 | 0.53 |
| FEVER | nan | 0.83 | 0.86 | 0.96 |
| FiQA2018 | 0.62 | 0.44 | 0.41 | 0.80 |
| HotpotQA | nan | 0.71 | 0.72 | 0.88 |
| MSMARCO | nan | 0.44 | 0.43 | 0.48 |
| NFCorpus | nan | 0.34 | 0.36 | 0.56 |
| NQ | nan | 0.64 | 0.58 | 0.82 |
| QuoraRetrieval | nan | 0.89 | 0.87 | 0.92 |
| SCIDOCS | 0.25 | 0.17 | 0.17 | 0.35 |
| SciFact | nan | 0.70 | 0.73 | 0.87 |
| TRECCOVID | 0.86 | 0.71 | 0.73 | 0.95 |
| Touche2020 | nan | 0.23 | 0.39 | 0.39 |
| Average | 0.64 | 0.46 | 0.46 | 0.70 |
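For reference, a comparison table like the ones above can be assembled from the per-task JSON files in this repository. A minimal sketch, assuming the per-task result layout written by mteb (`scores -> test -> [0] -> main_score`, which is nDCG@10 for these BEIR retrieval tasks), pandas with tabulate installed, and `<revision>` as a placeholder for the pinned commit SHA:

```python
import json
from pathlib import Path

import pandas as pd

def main_score(path: Path) -> float:
    data = json.loads(path.read_text())
    # First test-split entry's main_score (nDCG@10 for BEIR retrieval tasks).
    return data["scores"]["test"][0]["main_score"]

def load_model_scores(results_dir: Path) -> pd.Series:
    # One JSON file per task; skip the model metadata file.
    scores = {
        p.stem: main_score(p)
        for p in results_dir.glob("*.json")
        if p.name != "model_meta.json"
    }
    return pd.Series(scores)

# <revision> is a placeholder, not a real commit SHA.
models = {
    "intfloat/multilingual-e5-large":
        Path("results/intfloat__multilingual-e5-large/<revision>"),
    "opensearch-neural-sparse-encoding-doc-v2-distill":
        Path("results/opensearch-project__opensearch-neural-sparse-encoding-doc-v2-distill/<revision>"),
}
table = pd.DataFrame({name: load_model_scores(d) for name, d in models.items()})
table.loc["Average"] = table.mean()
print(table.round(2).to_markdown())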
You need to specify the revision of the model, as in embeddings-benchmark/mteb#2919, instead of main.
Got it. Using the latest commit ID should be fine. I see you have created a PR in mteb to fix it.
So should I rename the dir in this PR to make the revision consistent with the one in mteb? (A sketch of the rename follows.)
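A minimal sketch of that rename, assuming hypothetical paths: the results in this PR sit under a `main` revision directory, and `<commit-sha-from-the-hub>` is a placeholder for the pinned commit SHA declared for the model in the mteb PR:

```python
from pathlib import Path

# Hypothetical: the results were originally written under revision "main";
# rename that directory to the pinned commit SHA so it matches the revision
# declared in mteb.
model_dir = Path(
    "results/opensearch-project__opensearch-neural-sparse-encoding-doc-v2-distill"
)
pinned_sha = "<commit-sha-from-the-hub>"  # placeholder, not a real revision

(model_dir / "main").rename(model_dir / pinned_sha)
```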
Add OpenSearch sparse encoding model results, evaluated on the BEIR datasets.
mteb PR: embeddings-benchmark/mteb#2903
Checklist
- Model is added to mteb/models/ or can be used as an API. Instructions on how to add a model can be found here.
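For completeness, a minimal sketch of how results like these are typically produced with mteb, assuming the model is registered there (see the mteb PR above) so it can be resolved by name; the task list is abbreviated to three of the BEIR tasks from the tables:

```python
import mteb

# Resolve the registered model by name.
model = mteb.get_model(
    "opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill"
)

# A subset of the BEIR retrieval tasks evaluated above.
tasks = mteb.get_tasks(tasks=["ArguAna", "SciFact", "TRECCOVID"])

# Run the evaluation and write per-task JSON files under results/.
evaluation = mteb.MTEB(tasks=tasks)
evaluation.run(model, output_folder="results")
```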