diff --git a/reproducibility-experiments/full-rank-retriever-reproducibility.ipynb b/reproducibility-experiments/full-rank-retriever-reproducibility.ipynb new file mode 100644 index 0000000..b998c13 --- /dev/null +++ b/reproducibility-experiments/full-rank-retriever-reproducibility.ipynb @@ -0,0 +1,344 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "74212152-7d6a-42af-a431-2b972f30ed54", + "metadata": { + "tags": [] + }, + "source": [ + "# Tutorial with Full-Rank Retrievers\n", + "\n", + "This notebook shows how to conduct post-hoc experiments with the IR Experiment Platform.\n", + "\n", + "To start the notebook, please clone the archived shared task repository:\n", + "\n", + "```\n", + "git@github.com:tira-io/ir-experiment-platform-benchmarks.git\n", + "```\n", + "\n", + "Inside the cloned repository, start JupyterLab with the following command, which automatically installs a minimal virtual environment:\n", + "```\n", + "make jupyterlab\n", + "```\n", + "\n", + "The notebook covers how to run full-rank approaches submitted to TIRA in reproducibility/replicability experiments on the same or new data.\n", + "\n", + "For each software submitted to TIRA, the `tira` integration for PyTerrier loads the submitted Docker image and executes it in PyTerrier pipelines (i.e., the first execution may take slightly longer).\n" + ] + }, + { + "cell_type": "markdown", + "id": "6c4c7d74-ae9f-44e6-9d76-970425673879", + "metadata": {}, + "source": [ + "## Import Dependencies" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "6fe2ebee-9626-4858-bd0b-246a66b286e6", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "pd.set_option('display.max_colwidth', 0)\n", + "\n", + "from tira.local_client import Client\n", + "tira = Client()\n", + "\n", + "import pyterrier as pt\n", + "if not pt.started():\n", + " pt.init()\n" + ] + }, + { + "cell_type": "markdown", + "id": 
"60d5e570-a7a6-4461-b796-b0f2505dada0", + "metadata": {}, + "source": [ + "### Initialize A Full-Rank Retriever\n", + "\n", + "We create a PyTerrier retriever called `submitted_baseline` that wraps an approach submitted to a shared task in TIRA.\n", + "The approach is identified by the name `ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)`, i.e., a software `BM25 (tira-ir-starter-pyterrier)` submitted to `ir-benchmarks` by the team `tira-ir-starter` (which hosts baselines).\n", + "This software consists of two stages: the first component builds a PyTerrier index, and the second performs the actual retrieval with BM25.\n", + "\n", + "With this API, any full-rank approach submitted to TIRA can be executed and re-executed, e.g., on new data.\n", + "\n", + "We can run the retriever on any dataset integrated in `ir_datasets`.\n", + "Here, we use `vaswani` to show the overall functionality with a fast example." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "c6784eea-1c8a-4b53-89f1-4ccfdb2e91a4", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "submitted_baseline = tira.pt.retriever(\n", + " 'ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)',\n", + " dataset='vaswani',\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "id": "fa1e37f3-6341-4ecf-9e96-c5f92ea53d9b", + "metadata": {}, + "source": [ + "Next, we run the actual retrieval, here on two topics to keep the result set small." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "b107a2de-f0c0-4807-a172-5f91487dcf35", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\"><th></th><th>qid</th><th>query</th><th>q0</th><th>docno</th><th>rank</th><th>score</th><th>system</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>8172</td><td>1</td><td>24.566031</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9881</td><td>2</td><td>22.110514</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>2</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>5502</td><td>3</td><td>21.717148</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>3</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>1502</td><td>4</td><td>19.478355</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>4</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9859</td><td>5</td><td>18.626342</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
+ "    <tr><th>1995</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>4833</td><td>996</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1996</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>3529</td><td>997</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1997</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>271</td><td>998</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1998</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>2429</td><td>999</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1999</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>17</td><td>1000</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "<p>2000 rows × 7 columns</p>\n",
+ "</div>
" + ], + "text/plain": [ + " qid \\\n", + "0 1 \n", + "1 1 \n", + "2 1 \n", + "3 1 \n", + "4 1 \n", + "... .. \n", + "1995 2 \n", + "1996 2 \n", + "1997 2 \n", + "1998 2 \n", + "1999 2 \n", + "\n", + " query \\\n", + "0 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "1 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "2 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "3 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "4 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "... ... \n", + "1995 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1996 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1997 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1998 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1999 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "\n", + " q0 docno rank score \\\n", + "0 Q0 8172 1 24.566031 \n", + "1 Q0 9881 2 22.110514 \n", + "2 Q0 5502 3 21.717148 \n", + "3 Q0 1502 4 19.478355 \n", + "4 Q0 9859 5 18.626342 \n", + "... .. ... .. ... \n", + "1995 Q0 4833 996 5.161525 \n", + "1996 Q0 3529 997 5.161525 \n", + "1997 Q0 271 998 5.161525 \n", + "1998 Q0 2429 999 5.161525 \n", + "1999 Q0 17 1000 5.161525 \n", + "\n", + " system \n", + "0 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "2 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "3 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "4 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "... ... 
\n", + "1995 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1996 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1997 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1998 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1999 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "\n", + "[2000 rows x 7 columns]" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "topics = pd.DataFrame([\n", + " {'qid': 1, 'query': 'MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES'},\n", + " {'qid': 2, 'query': 'MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS'},\n", + "])\n", + "\n", + "submitted_baseline(topics)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/reproducibility-experiments/interoparability-tutorial.ipynb b/reproducibility-experiments/interoparability-tutorial.ipynb new file mode 100644 index 0000000..43834a2 --- /dev/null +++ b/reproducibility-experiments/interoparability-tutorial.ipynb @@ -0,0 +1,670 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "cae8986f-c23c-4d45-8a77-232488d76244", + "metadata": { + "tags": [] + }, + "source": [ + "# Tutorial Showing the Interoperability of Full-Rank-Retrievers and Re-Rankers Submitted to TIRA\n", + "\n", + "This notebook shows how post-hoc experiments of the IR Experiment Platform can be conducted.\n", + "\n", + "To start the notebook, please clone the archived shared task repository:\n", + "\n", + "```\n", + 
"git@github.com:tira-io/ir-experiment-platform-benchmarks.git\n", + "```\n", + "\n", + "Inside the cloned repository, start JupyterLab with the following command, which automatically installs a minimal virtual environment:\n", + "```\n", + "make jupyterlab\n", + "```\n", + "\n", + "The notebook covers how to combine full-rank approaches submitted to TIRA with re-rank approaches submitted to TIRA in reproducibility/replicability experiments on the same or new data.\n", + "\n", + "For each software submitted to TIRA, the `tira` integration for PyTerrier loads the submitted Docker image and executes it in PyTerrier pipelines (i.e., the first execution may take slightly longer).\n" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "3d3ddb8a-3dc6-4655-8cc2-a4835a49f13b", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "pd.set_option('display.max_colwidth', 0)\n", + "\n", + "from tira.local_client import Client\n", + "tira = Client()\n", + "\n", + "import pyterrier as pt\n", + "if not pt.started():\n", + " pt.init()\n" + ] + }, + { + "cell_type": "markdown", + "id": "5189ec00-b516-4af8-b1c8-45d510ca6c7d", + "metadata": {}, + "source": [ + "### Initialize A Full-Rank Retriever\n", + "\n", + "We create a PyTerrier retriever called `submitted_baseline` that wraps an approach submitted to a shared task in TIRA.\n", + "The approach is identified by the name `ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)`, i.e., a software `BM25 (tira-ir-starter-pyterrier)` submitted to `ir-benchmarks` by the team `tira-ir-starter` (which hosts baselines).\n", + "This software consists of two stages: the first component builds a PyTerrier index, and the second performs the actual retrieval with BM25.\n", + "\n", + "With this API, any full-rank approach submitted to TIRA can be executed and re-executed, e.g., on new data.\n", + "\n", + "We can run the retriever on any dataset integrated in 
`ir_datasets`.\n", + "Here, we use `vaswani` to show the overall functionality with a fast example." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "8fc74a1d-1a6a-404f-80c3-d5023fd5058a", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "submitted_baseline = tira.pt.retriever(\n", + " 'ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)',\n", + " dataset='vaswani',\n", + ")\n" + ] + }, + { + "cell_type": "markdown", + "id": "74b53692-6cda-4e8f-a79b-e5a6fc87000d", + "metadata": {}, + "source": [ + "Next, we run the actual retrieval, here on two topics to keep the result set small." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "43b45070-825f-42d4-aad9-a995f71ac3e2", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\"><th></th><th>qid</th><th>query</th><th>q0</th><th>docno</th><th>rank</th><th>score</th><th>system</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>8172</td><td>1</td><td>24.566031</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9881</td><td>2</td><td>22.110514</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>2</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>5502</td><td>3</td><td>21.717148</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>3</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>1502</td><td>4</td><td>19.478355</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>4</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9859</td><td>5</td><td>18.626342</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
+ "    <tr><th>1995</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>4833</td><td>996</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1996</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>3529</td><td>997</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1997</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>271</td><td>998</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1998</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>2429</td><td>999</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "    <tr><th>1999</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>17</td><td>1000</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "<p>2000 rows × 7 columns</p>\n",
+ "</div>
" + ], + "text/plain": [ + " qid \\\n", + "0 1 \n", + "1 1 \n", + "2 1 \n", + "3 1 \n", + "4 1 \n", + "... .. \n", + "1995 2 \n", + "1996 2 \n", + "1997 2 \n", + "1998 2 \n", + "1999 2 \n", + "\n", + " query \\\n", + "0 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "1 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "2 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "3 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "4 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "... ... \n", + "1995 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1996 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1997 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1998 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "1999 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "\n", + " q0 docno rank score \\\n", + "0 Q0 8172 1 24.566031 \n", + "1 Q0 9881 2 22.110514 \n", + "2 Q0 5502 3 21.717148 \n", + "3 Q0 1502 4 19.478355 \n", + "4 Q0 9859 5 18.626342 \n", + "... .. ... .. ... \n", + "1995 Q0 4833 996 5.161525 \n", + "1996 Q0 3529 997 5.161525 \n", + "1997 Q0 271 998 5.161525 \n", + "1998 Q0 2429 999 5.161525 \n", + "1999 Q0 17 1000 5.161525 \n", + "\n", + " system \n", + "0 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "2 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "3 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "4 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "... ... 
\n", + "1995 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1996 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1997 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1998 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "1999 pyterrier.default_pipelines.wmodel_batch_retrieve \n", + "\n", + "[2000 rows x 7 columns]" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "topics = pd.DataFrame([\n", + " {'qid': 1, 'query': 'MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES'},\n", + " {'qid': 2, 'query': 'MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS'},\n", + "])\n", + "\n", + "submitted_baseline(topics)" + ] + }, + { + "cell_type": "markdown", + "id": "a518e5ad-cec7-4ac3-a327-9927ce974c79", + "metadata": {}, + "source": [ + "Next, we create an `advanced_baseline` that re-ranks the top 10 results of the `submitted_baseline` that was submitted to TIRA with another re-ranker that was also submitted to TIRA, i.e., with `ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)`.\n", + "\n", + "All full-rank approaches submitted in TIRA can be the first stage for any second-stage re-ranker (or longer chains).\n", + "This is ensured by the ir_datasets integration that ensures that all softwares are interoperable.\n", + "In this case, the ir_datasets integration automatically runs in between.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "5c1ff476-3b48-455b-880e-11e5eaa037c9", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Task: Re-Rank -> create files: \n", + " rerank.jsonl \n", + " qrels.txt \n", + " at /output/\n", + "Get Documents\n", + "Produce rerank data.\n", + "Write rerank data.\n", + "Done rerank data was written.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + 
"text": [ + "Get Docs: 100%|██████████| 18/18 [00:00<00:00, 3821.69it/s]\n", + "Produce Rerank File.: 18it [00:00, 12110.60it/s]\n" + ] + }, + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\"><th></th><th>qid</th><th>query</th><th>docno</th><th>q0</th><th>rank</th><th>score</th><th>system</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>8172</td><td>0</td><td>7</td><td>37.618500</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>1</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>9881</td><td>0</td><td>3</td><td>39.492393</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>2</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>5502</td><td>0</td><td>1</td><td>45.553276</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>3</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>1502</td><td>0</td><td>2</td><td>45.179565</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>4</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>9859</td><td>0</td><td>9</td><td>36.344490</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>5</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>4871</td><td>0</td><td>5</td><td>38.651424</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>6</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>4817</td><td>0</td><td>4</td><td>38.860371</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>7</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>8276</td><td>0</td><td>6</td><td>37.756004</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>8</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>7234</td><td>0</td><td>8</td><td>37.546501</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>9</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>2850</td><td>0</td><td>4</td><td>38.820084</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>10</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>3781</td><td>0</td><td>5</td><td>36.782967</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>11</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>7113</td><td>0</td><td>3</td><td>39.168648</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>12</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>5124</td><td>0</td><td>9</td><td>30.557392</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>13</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>5012</td><td>0</td><td>6</td><td>36.533386</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>14</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>2284</td><td>0</td><td>7</td><td>34.100830</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>15</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>8253</td><td>0</td><td>2</td><td>39.920330</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>16</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>2218</td><td>0</td><td>1</td><td>41.553677</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "    <tr><th>17</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>8803</td><td>0</td><td>8</td><td>33.829384</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " qid \\\n", + "0 1 \n", + "1 1 \n", + "2 1 \n", + "3 1 \n", + "4 1 \n", + "5 1 \n", + "6 1 \n", + "7 1 \n", + "8 1 \n", + "9 2 \n", + "10 2 \n", + "11 2 \n", + "12 2 \n", + "13 2 \n", + "14 2 \n", + "15 2 \n", + "16 2 \n", + "17 2 \n", + "\n", + " query \\\n", + "0 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "1 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "2 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "3 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "4 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "5 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "6 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "7 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "8 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n", + "9 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "10 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "11 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "12 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "13 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "14 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "15 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "16 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "17 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n", + "\n", + " docno q0 rank score system \n", + "0 8172 0 7 37.618500 
multi-qa-MiniLM-L6-dot-v1-dot \n", + "1 9881 0 3 39.492393 multi-qa-MiniLM-L6-dot-v1-dot \n", + "2 5502 0 1 45.553276 multi-qa-MiniLM-L6-dot-v1-dot \n", + "3 1502 0 2 45.179565 multi-qa-MiniLM-L6-dot-v1-dot \n", + "4 9859 0 9 36.344490 multi-qa-MiniLM-L6-dot-v1-dot \n", + "5 4871 0 5 38.651424 multi-qa-MiniLM-L6-dot-v1-dot \n", + "6 4817 0 4 38.860371 multi-qa-MiniLM-L6-dot-v1-dot \n", + "7 8276 0 6 37.756004 multi-qa-MiniLM-L6-dot-v1-dot \n", + "8 7234 0 8 37.546501 multi-qa-MiniLM-L6-dot-v1-dot \n", + "9 2850 0 4 38.820084 multi-qa-MiniLM-L6-dot-v1-dot \n", + "10 3781 0 5 36.782967 multi-qa-MiniLM-L6-dot-v1-dot \n", + "11 7113 0 3 39.168648 multi-qa-MiniLM-L6-dot-v1-dot \n", + "12 5124 0 9 30.557392 multi-qa-MiniLM-L6-dot-v1-dot \n", + "13 5012 0 6 36.533386 multi-qa-MiniLM-L6-dot-v1-dot \n", + "14 2284 0 7 34.100830 multi-qa-MiniLM-L6-dot-v1-dot \n", + "15 8253 0 2 39.920330 multi-qa-MiniLM-L6-dot-v1-dot \n", + "16 2218 0 1 41.553677 multi-qa-MiniLM-L6-dot-v1-dot \n", + "17 8803 0 8 33.829384 multi-qa-MiniLM-L6-dot-v1-dot " + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "advanced_baseline = submitted_baseline %10 >> tira.pt.reranker(\n", + " 'ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)',\n", + " irds_id='vaswani'\n", + ")\n", + "\n", + "advanced_baseline(topics)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/reproducibility-experiments/re-rank-reproducibility.ipynb b/reproducibility-experiments/re-rank-reproducibility.ipynb new file mode 100644 index 
0000000..5dc942c --- /dev/null +++ b/reproducibility-experiments/re-rank-reproducibility.ipynb @@ -0,0 +1,388 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "08ce106f-b0ee-476b-9a46-710cd09f954f", + "metadata": {}, + "source": [ + "# Tutorial with Re-Rankers\n", + "\n", + "This notebook shows how to conduct post-hoc experiments with the IR Experiment Platform.\n", + "\n", + "To start the notebook, please clone the archived shared task repository:\n", + "\n", + "```\n", + "git@github.com:tira-io/ir-experiment-platform-benchmarks.git\n", + "```\n", + "\n", + "Inside the cloned repository, start JupyterLab with the following command, which automatically installs a minimal virtual environment:\n", + "```\n", + "make jupyterlab\n", + "```\n", + "\n", + "The notebook covers how to run re-rankers submitted to TIRA in reproducibility/replicability experiments on the same or new data.\n", + "\n", + "For each software submitted to TIRA, the `tira` integration for PyTerrier loads the submitted Docker image and executes it in PyTerrier pipelines (i.e., the first execution may take slightly longer).\n" + ] + }, + { + "cell_type": "markdown", + "id": "3c36da51-6d12-444a-8385-a37f42af7781", + "metadata": {}, + "source": [ + "## Import Dependencies" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "ce1afbb2-01e9-4ed3-af69-fb1848d634ac", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "import pandas as pd\n", + "pd.set_option('display.max_colwidth', 0)\n", + "\n", + "from tira.local_client import Client\n", + "tira = Client()\n", + "\n", + "import pyterrier as pt\n", + "if not pt.started():\n", + " pt.init()\n" + ] + }, + { + "cell_type": "markdown", + "id": "31d1134a-2c01-4549-9b3b-74fe987ab304", + "metadata": {}, + "source": [ + "### Initialize A Re-Ranker\n", + "\n", + "We create a PyTerrier re-ranker called `advanced_pipeline` that re-ranks BM25 results with a re-ranker submitted to a shared task in TIRA.\n", + "The 
re-ranker is identified by the name `ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)`, i.e., a software `SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)` submitted to `ir-benchmarks` by the team `tira-ir-starter` (which hosts baselines).\n", + "This software consists of a single stage that re-ranks with a dense retrieval approach implemented in BEIR.\n", + "\n", + "With this API, any re-rank approach submitted to TIRA can be executed and re-executed, e.g., on new data.\n", + "\n", + "We can run the re-ranker on any dataset integrated in `ir_datasets` or on any dataframe.\n", + "Here, we use a small artificial re-ranking dataset `data_to_rerank` to show the overall functionality with a fast example." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "c7c75b21-3dca-4251-a703-acff1feb6e67", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\"><th></th><th>docno</th><th>body</th><th>qid</th><th>query</th></tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr><th>0</th><td>d1</td><td>this is the first document of many documents</td><td>1</td><td>first document</td></tr>\n",
+ "    <tr><th>1</th><td>d2</td><td>this is another document</td><td>1</td><td>first document</td></tr>\n",
+ "    <tr><th>2</th><td>d3</td><td>the topic of this document is unknown</td><td>1</td><td>first document</td></tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>
" + ], + "text/plain": [ + " docno body qid query\n", + "0 d1 this is the first document of many documents 1 first document\n", + "1 d2 this is another document 1 first document\n", + "2 d3 the topic of this document is unknown 1 first document" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "data_to_rerank = pd.DataFrame([\n", + " [\"d1\", \"this is the first document of many documents\", \"1\", \"first document\"],\n", + " [\"d2\", \"this is another document\", \"1\", \"first document\"],\n", + " [\"d3\", \"the topic of this document is unknown\", \"1\", \"first document\"]\n", + " ], columns=[\"docno\", \"body\", \"qid\", \"query\"])\n", + "\n", + "data_to_rerank" + ] + }, + { + "cell_type": "markdown", + "id": "6d523fe7-adc0-4dbd-9574-cc33e2900b4e", + "metadata": {}, + "source": [ + "First, we re-rank this via BM25." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "8e19fe7a-63fe-4b1f-8bc2-40cea6cedc26", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table>\n", + "<thead>\n", + "<tr><th></th><th>docno</th><th>body</th><th>qid</th><th>rank</th><th>score</th><th>query</th></tr>\n", + "</thead>\n", + "<tbody>\n", + "<tr><th>0</th><td>d1</td><td>this is the first document of many documents</td><td>1</td><td>0</td><td>5.560003e-01</td><td>first document</td></tr>\n", + "<tr><th>1</th><td>d2</td><td>this is another document</td><td>1</td><td>2</td><td>-3.085859e-10</td><td>first document</td></tr>\n", + "<tr><th>2</th><td>d3</td><td>the topic of this document is unknown</td><td>1</td><td>1</td><td>5.681316e-02</td><td>first document</td></tr>\n", + "</tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " docno body qid rank score \\\n", + "0 d1 this is the first document of many documents 1 0 5.560003e-01 \n", + "1 d2 this is another document 1 2 -3.085859e-10 \n", + "2 d3 the topic of this document is unknown 1 1 5.681316e-02 \n", + "\n", + " query \n", + "0 first document \n", + "1 first document \n", + "2 first document " + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "bm25 = pt.text.scorer(wmodel='BM25')\n", + "\n", + "bm25(data_to_rerank)" + ] + }, + { + "cell_type": "markdown", + "id": "05943808-1fc0-4098-9822-1de3b4db4867", + "metadata": {}, + "source": [ + "Next, we use the re-ranker `ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)` as a second-stage re-ranker after BM25.\n", + "\n", + "This shows that re-rankers in TIRA are interoperable with other re-rankers.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "3666ba24-21ce-43aa-a9f1-17bfc4502a7b", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "data": { + "text/html": [ + "
<div>\n", + "<table>\n", + "<thead>\n", + "<tr><th></th><th>docno</th><th>body</th><th>qid</th><th>query</th><th>q0</th><th>rank</th><th>score</th><th>system</th></tr>\n", + "</thead>\n", + "<tbody>\n", + "<tr><th>0</th><td>d1</td><td>this is the first document of many documents</td><td>1</td><td>first document</td><td>0</td><td>1</td><td>46.084885</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n", + "<tr><th>1</th><td>d2</td><td>this is another document</td><td>1</td><td>first document</td><td>0</td><td>2</td><td>40.802025</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n", + "<tr><th>2</th><td>d3</td><td>the topic of this document is unknown</td><td>1</td><td>first document</td><td>0</td><td>3</td><td>37.294750</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n", + "</tbody>\n", + "</table>\n", + "</div>
" + ], + "text/plain": [ + " docno body qid query q0 \\\n", + "0 d1 this is the first document of many documents 1 first document 0 \n", + "1 d2 this is another document 1 first document 0 \n", + "2 d3 the topic of this document is unknown 1 first document 0 \n", + "\n", + " rank score system \n", + "0 1 46.084885 multi-qa-MiniLM-L6-dot-v1-dot \n", + "1 2 40.802025 multi-qa-MiniLM-L6-dot-v1-dot \n", + "2 3 37.294750 multi-qa-MiniLM-L6-dot-v1-dot " + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "advanced_pipeline = bm25 >> tira.pt.reranker('ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)')\n", + "\n", + "advanced_pipeline(data_to_rerank)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.9" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}