Update README.md
mam10eks authored Jul 18, 2023
1 parent 344b8ae commit 99d0426
Showing 1 changed file with 37 additions and 37 deletions.
@@ -5,51 +5,51 @@ software has to process them in a sandbox that prevents data leaks.

The platform currently includes 15 corpora (1.9 billion documents) on which 32 well-known shared tasks are based, as well as Docker images of 50 standard retrieval approaches. Within this setup, we were able to automatically run and evaluate the 50 approaches on the 32 tasks (1600 runs) in less than a week.

The hosted version of the IR Experiment Platform is open for submissions at [https://www.tira.io/task-overview/ir-benchmarks](https://www.tira.io/task-overview/ir-benchmarks).

## Experiments

All evaluations and analyses (including those reported in the paper) are located in [analysis-of-submissions](analysis-of-submissions).

## Up-To-Date Leaderboards

Comparing the leaderboards across different tasks is quite interesting (the paper reports a large-scale evaluation of this), e.g., compare [MS MARCO DL 2019](https://www.tira.io/task-overview/ir-benchmarks/msmarco-passage-trec-dl-2019-judged-20230107-training) with [Antique](https://www.tira.io/task-overview/ir-benchmarks/antique-test-20230107-training) or [Args.me](https://www.tira.io/task-overview/ir-benchmarks/argsme-touche-2020-task-1-20230209-training): on MS MARCO, deep learning models of all kinds occupy the top ranks, while the order is largely reversed on other corpora such as Args.me or Antique.


The current leaderboards can be viewed on [tira.io](https://www.tira.io/task-overview/ir-benchmarks):

- [Antique](https://www.tira.io/task-overview/ir-benchmarks/antique-test-20230107-training)
- [Args.me 2020 Task 1](https://www.tira.io/task-overview/ir-benchmarks/argsme-touche-2020-task-1-20230209-training)
- [Args.me 2021 Task 1](https://www.tira.io/task-overview/ir-benchmarks/argsme-touche-2021-task-1-20230209-training)
- [Cranfield](https://www.tira.io/task-overview/ir-benchmarks/cranfield-20230107-training)
- [TREC COVID](https://www.tira.io/task-overview/ir-benchmarks/cord19-fulltext-trec-covid-20230107-training)
- [TREC Deep Learning 2019 (passage)](https://www.tira.io/task-overview/ir-benchmarks/msmarco-passage-trec-dl-2019-judged-20230107-training)
- [TREC Deep Learning 2020 (passage)](https://www.tira.io/task-overview/ir-benchmarks/msmarco-passage-trec-dl-2020-judged-20230107-training)
- [TREC Genomics 2004](https://www.tira.io/task-overview/ir-benchmarks/medline-2004-trec-genomics-2004-20230107-training)
- [TREC Genomics 2005](https://www.tira.io/task-overview/ir-benchmarks/medline-2004-trec-genomics-2005-20230107-training)
- [TREC 7](https://www.tira.io/task-overview/ir-benchmarks/disks45-nocr-trec7-20230209-training)
- [TREC 8](https://www.tira.io/task-overview/ir-benchmarks/disks45-nocr-trec8-20230209-training)
- [Robust04](https://www.tira.io/task-overview/ir-benchmarks/disks45-nocr-trec-robust-2004-20230209-training)
- [TREC Web Track 2002 (gov)](https://www.tira.io/task-overview/ir-benchmarks/gov-trec-web-2002-20230209-training)
- [TREC Web Track 2003 (gov)](https://www.tira.io/task-overview/ir-benchmarks/gov-trec-web-2003-20230209-training)
- [TREC Web Track 2004 (gov)](https://www.tira.io/task-overview/ir-benchmarks/gov-trec-web-2004-20230209-training)
- [TREC Web Track 2009 (ClueWeb09)](https://www.tira.io/task-overview/ir-benchmarks/clueweb09-en-trec-web-2009-20230107-training)
- [TREC Web Track 2010 (ClueWeb09)](https://www.tira.io/task-overview/ir-benchmarks/clueweb09-en-trec-web-2010-20230107-training)
- [TREC Web Track 2011 (ClueWeb09)](https://www.tira.io/task-overview/ir-benchmarks/clueweb09-en-trec-web-2011-20230107-training)
- [TREC Web Track 2012 (ClueWeb09)](https://www.tira.io/task-overview/ir-benchmarks/clueweb09-en-trec-web-2012-20230107-training)
- [TREC Web Track 2013 (ClueWeb12)](https://www.tira.io/task-overview/ir-benchmarks/clueweb12-trec-web-2013-20230107-training)
- [TREC Web Track 2014 (ClueWeb12)](https://www.tira.io/task-overview/ir-benchmarks/clueweb12-trec-web-2014-20230107-training)
- [Touché 2020 Task 2 (ClueWeb12)](https://www.tira.io/task-overview/ir-benchmarks/clueweb12-touche-2020-task-2-20230209-training)
- [Touché 2021 Task 2 (ClueWeb12)](https://www.tira.io/task-overview/ir-benchmarks/clueweb12-touche-2021-task-2-20230209-training)
- Touché 2023 Task 2 (ClueWeb22) (the [task is still ongoing](https://www.tira.io/task/touche-2023-task-2), so the leaderboard is not yet public)
- [TREC Terabyte 2004 (gov2)](https://www.tira.io/task-overview/ir-benchmarks/gov2-trec-tb-2004-20230209-training)
- [TREC Terabyte 2005 (gov2)](https://www.tira.io/task-overview/ir-benchmarks/gov2-trec-tb-2005-20230209-training)
- [TREC Terabyte 2006 (gov2)](https://www.tira.io/task-overview/ir-benchmarks/gov2-trec-tb-2006-20230209-training)
- [NFCorpus](https://www.tira.io/task-overview/ir-benchmarks/nfcorpus-test-20230107-training)
- [Vaswani](https://www.tira.io/task-overview/ir-benchmarks/vaswani-20230107-training)
- [TREC Core 2018 (wapo)](https://www.tira.io/task-overview/ir-benchmarks/wapo-v2-trec-core-2018-20230107-training)
- [TREC Precision Medicine 2017](https://www.tira.io/task-overview/ir-benchmarks/medline-2017-trec-pm-2017-20230211-training)
- [TREC Precision Medicine 2018](https://www.tira.io/task-overview/ir-benchmarks/medline-2017-trec-pm-2018-20230211-training)
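All of the leaderboard links above follow one URL scheme, `https://www.tira.io/task-overview/<task-id>/<dataset-id>`. A minimal sketch of building such links programmatically — the helper name is hypothetical and not part of any TIRA client; only the URL pattern itself comes from the list above:

```python
# Hypothetical helper: rebuild tira.io leaderboard links from dataset IDs.
# Only the URL scheme (task-overview/<task>/<dataset>) is taken from the
# leaderboard list above; the function itself is illustrative.

TIRA_BASE = "https://www.tira.io/task-overview"

def leaderboard_url(dataset_id: str, task_id: str = "ir-benchmarks") -> str:
    """Return the tira.io leaderboard URL for a dataset within a task."""
    return f"{TIRA_BASE}/{task_id}/{dataset_id}"

print(leaderboard_url("antique-test-20230107-training"))
# → https://www.tira.io/task-overview/ir-benchmarks/antique-test-20230107-training
```

The dataset IDs are the final path component of each link in the list, e.g. `cranfield-20230107-training` or `gov2-trec-tb-2006-20230209-training`.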

## Import new Datasets

