diff --git a/reproducibility-experiments/full-rank-retriever-reproducibility.ipynb b/reproducibility-experiments/full-rank-retriever-reproducibility.ipynb
new file mode 100644
index 0000000..b998c13
--- /dev/null
+++ b/reproducibility-experiments/full-rank-retriever-reproducibility.ipynb
@@ -0,0 +1,344 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "74212152-7d6a-42af-a431-2b972f30ed54",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Tutorial with Full-Rank Retrievers\n",
+ "\n",
+ "This notebook shows how to conduct post-hoc experiments with the IR Experiment Platform.\n",
+ "\n",
+ "To start the notebook, please clone the archived shared task repository:\n",
+ "\n",
+ "```\n",
+ "git clone git@github.com:tira-io/ir-experiment-platform-benchmarks.git\n",
+ "```\n",
+ "\n",
+ "Inside the cloned repository, start Jupyter with the following command, which automatically installs a minimal virtual environment:\n",
+ "```\n",
+ "make jupyterlab\n",
+ "```\n",
+ "\n",
+ "The notebook covers how to run full-rank approaches submitted to TIRA in reproducibility/replicability experiments on the same or new data.\n",
+ "\n",
+ "For each software submitted to TIRA, the `tira` integration for PyTerrier loads the submitted Docker image and executes it in PyTerrier pipelines (so the first execution can take slightly longer).\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6c4c7d74-ae9f-44e6-9d76-970425673879",
+ "metadata": {},
+ "source": [
+ "## Import Dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "6fe2ebee-9626-4858-bd0b-246a66b286e6",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "pd.set_option('display.max_colwidth', 0)\n",
+ "\n",
+ "from tira.local_client import Client\n",
+ "tira = Client()\n",
+ "\n",
+ "import pyterrier as pt\n",
+ "if not pt.started():\n",
+ " pt.init()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "60d5e570-a7a6-4461-b796-b0f2505dada0",
+ "metadata": {},
+ "source": [
+ "### Initialize A Full-Rank Retriever\n",
+ "\n",
+ "We create a PyTerrier retriever called `submitted_baseline` that wraps an approach submitted to a shared task in TIRA.\n",
+ "The approach is identified by the name `ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)`, i.e., the software `BM25 (tira-ir-starter-pyterrier)` submitted to `ir-benchmarks` by the team `tira-ir-starter` (which hosts the baselines).\n",
+ "This software consists of two stages: the first component builds a PyTerrier index, and the second performs the actual retrieval with BM25.\n",
+ "\n",
+ "With this API, any full-rank approach submitted to TIRA can be executed and re-executed, e.g., on new data.\n",
+ "\n",
+ "We can run the retriever on any dataset integrated in `ir_datasets`.\n",
+ "Here, we use `vaswani` to show the overall functionality with a fast example."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "c6784eea-1c8a-4b53-89f1-4ccfdb2e91a4",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "submitted_baseline = tira.pt.retriever(\n",
+ " 'ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)',\n",
+ " dataset='vaswani',\n",
+ ")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "fa1e37f3-6341-4ecf-9e96-c5f92ea53d9b",
+ "metadata": {},
+ "source": [
+ "Next, we run the actual retrieval, here on two topics to keep the result set small."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "b107a2de-f0c0-4807-a172-5f91487dcf35",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>qid</th><th>query</th><th>q0</th><th>docno</th><th>rank</th><th>score</th><th>system</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>8172</td><td>1</td><td>24.566031</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9881</td><td>2</td><td>22.110514</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>2</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>5502</td><td>3</td><td>21.717148</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>3</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>1502</td><td>4</td><td>19.478355</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>4</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9859</td><td>5</td><td>18.626342</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
+ "<tr><th>1995</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>4833</td><td>996</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1996</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>3529</td><td>997</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1997</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>271</td><td>998</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1998</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>2429</td><td>999</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1999</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>17</td><td>1000</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "<p>2000 rows × 7 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " qid \\\n",
+ "0 1 \n",
+ "1 1 \n",
+ "2 1 \n",
+ "3 1 \n",
+ "4 1 \n",
+ "... .. \n",
+ "1995 2 \n",
+ "1996 2 \n",
+ "1997 2 \n",
+ "1998 2 \n",
+ "1999 2 \n",
+ "\n",
+ " query \\\n",
+ "0 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "1 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "2 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "3 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "4 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "... ... \n",
+ "1995 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1996 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1997 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1998 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1999 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "\n",
+ " q0 docno rank score \\\n",
+ "0 Q0 8172 1 24.566031 \n",
+ "1 Q0 9881 2 22.110514 \n",
+ "2 Q0 5502 3 21.717148 \n",
+ "3 Q0 1502 4 19.478355 \n",
+ "4 Q0 9859 5 18.626342 \n",
+ "... .. ... .. ... \n",
+ "1995 Q0 4833 996 5.161525 \n",
+ "1996 Q0 3529 997 5.161525 \n",
+ "1997 Q0 271 998 5.161525 \n",
+ "1998 Q0 2429 999 5.161525 \n",
+ "1999 Q0 17 1000 5.161525 \n",
+ "\n",
+ " system \n",
+ "0 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "2 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "3 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "4 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "... ... \n",
+ "1995 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1996 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1997 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1998 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1999 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "\n",
+ "[2000 rows x 7 columns]"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "topics = pd.DataFrame([\n",
+ " {'qid': 1, 'query': 'MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES'},\n",
+ " {'qid': 2, 'query': 'MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS'},\n",
+ "])\n",
+ "\n",
+ "submitted_baseline(topics)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.9"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/reproducibility-experiments/interoparability-tutorial.ipynb b/reproducibility-experiments/interoparability-tutorial.ipynb
new file mode 100644
index 0000000..43834a2
--- /dev/null
+++ b/reproducibility-experiments/interoparability-tutorial.ipynb
@@ -0,0 +1,670 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "cae8986f-c23c-4d45-8a77-232488d76244",
+ "metadata": {
+ "tags": []
+ },
+ "source": [
+ "# Tutorial Showing the Interoperability of Full-Rank-Retrievers and Re-Rankers Submitted to TIRA\n",
+ "\n",
+ "This notebook shows how to conduct post-hoc experiments with the IR Experiment Platform.\n",
+ "\n",
+ "To start the notebook, please clone the archived shared task repository:\n",
+ "\n",
+ "```\n",
+ "git clone git@github.com:tira-io/ir-experiment-platform-benchmarks.git\n",
+ "```\n",
+ "\n",
+ "Inside the cloned repository, start Jupyter with the following command, which automatically installs a minimal virtual environment:\n",
+ "```\n",
+ "make jupyterlab\n",
+ "```\n",
+ "\n",
+ "The notebook covers how to combine full-rank approaches submitted to TIRA with re-rank approaches submitted to TIRA in reproducibility/replicability experiments on the same or new data.\n",
+ "\n",
+ "For each software submitted to TIRA, the `tira` integration for PyTerrier loads the submitted Docker image and executes it in PyTerrier pipelines (so the first execution can take slightly longer).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "3d3ddb8a-3dc6-4655-8cc2-a4835a49f13b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "pd.set_option('display.max_colwidth', 0)\n",
+ "\n",
+ "from tira.local_client import Client\n",
+ "tira = Client()\n",
+ "\n",
+ "import pyterrier as pt\n",
+ "if not pt.started():\n",
+ " pt.init()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5189ec00-b516-4af8-b1c8-45d510ca6c7d",
+ "metadata": {},
+ "source": [
+ "### Initialize A Full-Rank Retriever\n",
+ "\n",
+ "We create a PyTerrier retriever called `submitted_baseline` that wraps an approach submitted to a shared task in TIRA.\n",
+ "The approach is identified by the name `ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)`, i.e., the software `BM25 (tira-ir-starter-pyterrier)` submitted to `ir-benchmarks` by the team `tira-ir-starter` (which hosts the baselines).\n",
+ "This software consists of two stages: the first component builds a PyTerrier index, and the second performs the actual retrieval with BM25.\n",
+ "\n",
+ "With this API, any full-rank approach submitted to TIRA can be executed and re-executed, e.g., on new data.\n",
+ "\n",
+ "We can run the retriever on any dataset integrated in `ir_datasets`.\n",
+ "Here, we use `vaswani` to show the overall functionality with a fast example."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "8fc74a1d-1a6a-404f-80c3-d5023fd5058a",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "submitted_baseline = tira.pt.retriever(\n",
+ " 'ir-benchmarks/tira-ir-starter/BM25 (tira-ir-starter-pyterrier)',\n",
+ " dataset='vaswani',\n",
+ ")\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "74b53692-6cda-4e8f-a79b-e5a6fc87000d",
+ "metadata": {},
+ "source": [
+ "Next, we run the actual retrieval, here on two topics to keep the result set small."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "43b45070-825f-42d4-aad9-a995f71ac3e2",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>qid</th><th>query</th><th>q0</th><th>docno</th><th>rank</th><th>score</th><th>system</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>8172</td><td>1</td><td>24.566031</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9881</td><td>2</td><td>22.110514</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>2</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>5502</td><td>3</td><td>21.717148</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>3</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>1502</td><td>4</td><td>19.478355</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>4</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>Q0</td><td>9859</td><td>5</td><td>18.626342</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
+ "<tr><th>1995</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>4833</td><td>996</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1996</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>3529</td><td>997</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1997</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>271</td><td>998</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1998</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>2429</td><td>999</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "<tr><th>1999</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>Q0</td><td>17</td><td>1000</td><td>5.161525</td><td>pyterrier.default_pipelines.wmodel_batch_retrieve</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "<p>2000 rows × 7 columns</p>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " qid \\\n",
+ "0 1 \n",
+ "1 1 \n",
+ "2 1 \n",
+ "3 1 \n",
+ "4 1 \n",
+ "... .. \n",
+ "1995 2 \n",
+ "1996 2 \n",
+ "1997 2 \n",
+ "1998 2 \n",
+ "1999 2 \n",
+ "\n",
+ " query \\\n",
+ "0 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "1 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "2 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "3 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "4 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "... ... \n",
+ "1995 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1996 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1997 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1998 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "1999 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "\n",
+ " q0 docno rank score \\\n",
+ "0 Q0 8172 1 24.566031 \n",
+ "1 Q0 9881 2 22.110514 \n",
+ "2 Q0 5502 3 21.717148 \n",
+ "3 Q0 1502 4 19.478355 \n",
+ "4 Q0 9859 5 18.626342 \n",
+ "... .. ... .. ... \n",
+ "1995 Q0 4833 996 5.161525 \n",
+ "1996 Q0 3529 997 5.161525 \n",
+ "1997 Q0 271 998 5.161525 \n",
+ "1998 Q0 2429 999 5.161525 \n",
+ "1999 Q0 17 1000 5.161525 \n",
+ "\n",
+ " system \n",
+ "0 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "2 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "3 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "4 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "... ... \n",
+ "1995 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1996 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1997 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1998 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "1999 pyterrier.default_pipelines.wmodel_batch_retrieve \n",
+ "\n",
+ "[2000 rows x 7 columns]"
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "topics = pd.DataFrame([\n",
+ " {'qid': 1, 'query': 'MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES'},\n",
+ " {'qid': 2, 'query': 'MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS'},\n",
+ "])\n",
+ "\n",
+ "submitted_baseline(topics)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a518e5ad-cec7-4ac3-a327-9927ce974c79",
+ "metadata": {},
+ "source": [
+ "Next, we create an `advanced_baseline` that re-ranks the top 10 results of `submitted_baseline` with a re-ranker that was also submitted to TIRA, namely `ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)`.\n",
+ "\n",
+ "Any full-rank approach submitted to TIRA can serve as the first stage for any second-stage re-ranker (or for longer chains).\n",
+ "The ir_datasets integration guarantees that all submitted software is interoperable.\n",
+ "In this case, the ir_datasets integration runs automatically between the two stages.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "id": "5c1ff476-3b48-455b-880e-11e5eaa037c9",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Task: Re-Rank -> create files: \n",
+ " rerank.jsonl \n",
+ " qrels.txt \n",
+ " at /output/\n",
+ "Get Documents\n",
+ "Produce rerank data.\n",
+ "Write rerank data.\n",
+ "Done rerank data was written.\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Get Docs: 100%|██████████| 18/18 [00:00<00:00, 3821.69it/s]\n",
+ "Produce Rerank File.: 18it [00:00, 12110.60it/s]\n"
+ ]
+ },
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>qid</th><th>query</th><th>docno</th><th>q0</th><th>rank</th><th>score</th><th>system</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>8172</td><td>0</td><td>7</td><td>37.618500</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>1</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>9881</td><td>0</td><td>3</td><td>39.492393</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>2</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>5502</td><td>0</td><td>1</td><td>45.553276</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>3</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>1502</td><td>0</td><td>2</td><td>45.179565</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>4</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>9859</td><td>0</td><td>9</td><td>36.344490</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>5</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>4871</td><td>0</td><td>5</td><td>38.651424</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>6</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>4817</td><td>0</td><td>4</td><td>38.860371</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>7</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>8276</td><td>0</td><td>6</td><td>37.756004</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>8</th><td>1</td><td>MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES</td><td>7234</td><td>0</td><td>8</td><td>37.546501</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>9</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>2850</td><td>0</td><td>4</td><td>38.820084</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>10</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>3781</td><td>0</td><td>5</td><td>36.782967</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>11</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>7113</td><td>0</td><td>3</td><td>39.168648</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>12</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>5124</td><td>0</td><td>9</td><td>30.557392</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>13</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>5012</td><td>0</td><td>6</td><td>36.533386</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>14</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>2284</td><td>0</td><td>7</td><td>34.100830</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>15</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>8253</td><td>0</td><td>2</td><td>39.920330</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>16</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>2218</td><td>0</td><td>1</td><td>41.553677</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "<tr><th>17</th><td>2</td><td>MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS</td><td>8803</td><td>0</td><td>8</td><td>33.829384</td><td>multi-qa-MiniLM-L6-dot-v1-dot</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " qid \\\n",
+ "0 1 \n",
+ "1 1 \n",
+ "2 1 \n",
+ "3 1 \n",
+ "4 1 \n",
+ "5 1 \n",
+ "6 1 \n",
+ "7 1 \n",
+ "8 1 \n",
+ "9 2 \n",
+ "10 2 \n",
+ "11 2 \n",
+ "12 2 \n",
+ "13 2 \n",
+ "14 2 \n",
+ "15 2 \n",
+ "16 2 \n",
+ "17 2 \n",
+ "\n",
+ " query \\\n",
+ "0 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "1 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "2 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "3 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "4 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "5 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "6 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "7 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "8 MEASUREMENT OF DIELECTRIC CONSTANT OF LIQUIDS BY THE USE OF MICROWAVE TECHNIQUES \n",
+ "9 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "10 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "11 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "12 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "13 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "14 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "15 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "16 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "17 MATHEMATICAL ANALYSIS AND DESIGN DETAILS OF WAVEGUIDE FED MICROWAVE RADIATIONS \n",
+ "\n",
+ " docno q0 rank score system \n",
+ "0 8172 0 7 37.618500 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "1 9881 0 3 39.492393 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "2 5502 0 1 45.553276 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "3 1502 0 2 45.179565 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "4 9859 0 9 36.344490 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "5 4871 0 5 38.651424 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "6 4817 0 4 38.860371 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "7 8276 0 6 37.756004 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "8 7234 0 8 37.546501 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "9 2850 0 4 38.820084 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "10 3781 0 5 36.782967 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "11 7113 0 3 39.168648 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "12 5124 0 9 30.557392 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "13 5012 0 6 36.533386 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "14 2284 0 7 34.100830 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "15 8253 0 2 39.920330 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "16 2218 0 1 41.553677 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "17 8803 0 8 33.829384 multi-qa-MiniLM-L6-dot-v1-dot "
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "advanced_baseline = submitted_baseline % 10 >> tira.pt.reranker(\n",
+ " 'ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)',\n",
+ " irds_id='vaswani'\n",
+ ")\n",
+ "\n",
+ "advanced_baseline(topics)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.9"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
diff --git a/reproducibility-experiments/re-rank-reproducibility.ipynb b/reproducibility-experiments/re-rank-reproducibility.ipynb
new file mode 100644
index 0000000..5dc942c
--- /dev/null
+++ b/reproducibility-experiments/re-rank-reproducibility.ipynb
@@ -0,0 +1,388 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "08ce106f-b0ee-476b-9a46-710cd09f954f",
+ "metadata": {},
+ "source": [
+ "# Tutorial with Re-Rankers\n",
+ "\n",
+ "This notebook shows how to conduct post-hoc experiments with the IR Experiment Platform.\n",
+ "\n",
+ "To start the notebook, please clone the archived shared task repository:\n",
+ "\n",
+ "```\n",
+ "git clone git@github.com:tira-io/ir-experiment-platform-benchmarks.git\n",
+ "```\n",
+ "\n",
+ "Inside the cloned repository, start Jupyter with the following command, which automatically installs a minimal virtual environment:\n",
+ "```\n",
+ "make jupyterlab\n",
+ "```\n",
+ "\n",
+ "The notebook covers how to run re-rankers submitted to TIRA in reproducibility/replicability experiments on the same or new data.\n",
+ "\n",
+ "For each software submitted to TIRA, the `tira` integration for PyTerrier loads the submitted Docker image and executes it in PyTerrier pipelines (so the first execution can take slightly longer).\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3c36da51-6d12-444a-8385-a37f42af7781",
+ "metadata": {},
+ "source": [
+ "## Import Dependencies"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "ce1afbb2-01e9-4ed3-af69-fb1848d634ac",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "import pandas as pd\n",
+ "pd.set_option('display.max_colwidth', 0)\n",
+ "\n",
+ "from tira.local_client import Client\n",
+ "tira = Client()\n",
+ "\n",
+ "import pyterrier as pt\n",
+ "if not pt.started():\n",
+ " pt.init()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "31d1134a-2c01-4549-9b3b-74fe987ab304",
+ "metadata": {},
+ "source": [
+ "### Initialize A Re-Ranker\n",
+ "\n",
+ "We create a PyTerrier re-ranker called `advanced_pipeline` that re-ranks BM25 results with a re-ranker submitted to a shared task in TIRA.\n",
+ "The re-ranker is identified by the name `ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)`, i.e., the software `SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)` submitted to `ir-benchmarks` by the team `tira-ir-starter` (which hosts the baselines).\n",
+ "This software consists of a single stage that re-ranks with a dense retrieval approach implemented in BEIR.\n",
+ "\n",
+ "With this API, any re-rank approach submitted to TIRA can be executed and re-executed, e.g., on new data.\n",
+ "\n",
+ "We can run the re-ranker on any dataset integrated in `ir_datasets` or on any DataFrame.\n",
+ "Here, we use a small artificial re-ranking dataset `data_to_rerank` to show the overall functionality with a fast example."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "c7c75b21-3dca-4251-a703-acff1feb6e67",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "<thead>\n",
+ "<tr style=\"text-align: right;\"><th></th><th>docno</th><th>body</th><th>qid</th><th>query</th></tr>\n",
+ "</thead>\n",
+ "<tbody>\n",
+ "<tr><th>0</th><td>d1</td><td>this is the first document of many documents</td><td>1</td><td>first document</td></tr>\n",
+ "<tr><th>1</th><td>d2</td><td>this is another document</td><td>1</td><td>first document</td></tr>\n",
+ "<tr><th>2</th><td>d3</td><td>the topic of this document is unknown</td><td>1</td><td>first document</td></tr>\n",
+ "</tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " docno body qid query\n",
+ "0 d1 this is the first document of many documents 1 first document\n",
+ "1 d2 this is another document 1 first document\n",
+ "2 d3 the topic of this document is unknown 1 first document"
+ ]
+ },
+ "execution_count": 3,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "data_to_rerank = pd.DataFrame([\n",
+ " [\"d1\", \"this is the first document of many documents\", \"1\", \"first document\"],\n",
+ " [\"d2\", \"this is another document\", \"1\", \"first document\"],\n",
+ " [\"d3\", \"the topic of this document is unknown\", \"1\", \"first document\"]\n",
+ " ], columns=[\"docno\", \"body\", \"qid\", \"query\"])\n",
+ "\n",
+ "data_to_rerank"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "6d523fe7-adc0-4dbd-9574-cc33e2900b4e",
+ "metadata": {},
+ "source": [
+ "First, we re-rank this data with BM25."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "8e19fe7a-63fe-4b1f-8bc2-40cea6cedc26",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>docno</th>\n",
+ "      <th>body</th>\n",
+ "      <th>qid</th>\n",
+ "      <th>rank</th>\n",
+ "      <th>score</th>\n",
+ "      <th>query</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>0</th>\n",
+ "      <td>d1</td>\n",
+ "      <td>this is the first document of many documents</td>\n",
+ "      <td>1</td>\n",
+ "      <td>0</td>\n",
+ "      <td>5.560003e-01</td>\n",
+ "      <td>first document</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1</th>\n",
+ "      <td>d2</td>\n",
+ "      <td>this is another document</td>\n",
+ "      <td>1</td>\n",
+ "      <td>2</td>\n",
+ "      <td>-3.085859e-10</td>\n",
+ "      <td>first document</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2</th>\n",
+ "      <td>d3</td>\n",
+ "      <td>the topic of this document is unknown</td>\n",
+ "      <td>1</td>\n",
+ "      <td>1</td>\n",
+ "      <td>5.681316e-02</td>\n",
+ "      <td>first document</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " docno body qid rank score \\\n",
+ "0 d1 this is the first document of many documents 1 0 5.560003e-01 \n",
+ "1 d2 this is another document 1 2 -3.085859e-10 \n",
+ "2 d3 the topic of this document is unknown 1 1 5.681316e-02 \n",
+ "\n",
+ " query \n",
+ "0 first document \n",
+ "1 first document \n",
+ "2 first document "
+ ]
+ },
+ "execution_count": 4,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Score each document's text against its query with BM25\n",
+ "bm25 = pt.text.scorer(wmmodel='bm25')\n",
+ "\n",
+ "bm25(data_to_rerank)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "05943808-1fc0-4098-9822-1de3b4db4867",
+ "metadata": {},
+ "source": [
+ "Next, we use the re-ranker `ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)` as a second-stage re-ranker after BM25.\n",
+ "\n",
+ "This shows that re-rankers in TIRA are interoperable with other re-rankers.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "3666ba24-21ce-43aa-a9f1-17bfc4502a7b",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "data": {
+ "text/html": [
+ "<div>\n",
+ "<table border=\"1\" class=\"dataframe\">\n",
+ "  <thead>\n",
+ "    <tr style=\"text-align: right;\">\n",
+ "      <th></th>\n",
+ "      <th>docno</th>\n",
+ "      <th>body</th>\n",
+ "      <th>qid</th>\n",
+ "      <th>query</th>\n",
+ "      <th>q0</th>\n",
+ "      <th>rank</th>\n",
+ "      <th>score</th>\n",
+ "      <th>system</th>\n",
+ "    </tr>\n",
+ "  </thead>\n",
+ "  <tbody>\n",
+ "    <tr>\n",
+ "      <th>0</th>\n",
+ "      <td>d1</td>\n",
+ "      <td>this is the first document of many documents</td>\n",
+ "      <td>1</td>\n",
+ "      <td>first document</td>\n",
+ "      <td>0</td>\n",
+ "      <td>1</td>\n",
+ "      <td>46.084885</td>\n",
+ "      <td>multi-qa-MiniLM-L6-dot-v1-dot</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>1</th>\n",
+ "      <td>d2</td>\n",
+ "      <td>this is another document</td>\n",
+ "      <td>1</td>\n",
+ "      <td>first document</td>\n",
+ "      <td>0</td>\n",
+ "      <td>2</td>\n",
+ "      <td>40.802025</td>\n",
+ "      <td>multi-qa-MiniLM-L6-dot-v1-dot</td>\n",
+ "    </tr>\n",
+ "    <tr>\n",
+ "      <th>2</th>\n",
+ "      <td>d3</td>\n",
+ "      <td>the topic of this document is unknown</td>\n",
+ "      <td>1</td>\n",
+ "      <td>first document</td>\n",
+ "      <td>0</td>\n",
+ "      <td>3</td>\n",
+ "      <td>37.294750</td>\n",
+ "      <td>multi-qa-MiniLM-L6-dot-v1-dot</td>\n",
+ "    </tr>\n",
+ "  </tbody>\n",
+ "</table>\n",
+ "</div>"
+ ],
+ "text/plain": [
+ " docno body qid query q0 \\\n",
+ "0 d1 this is the first document of many documents 1 first document 0 \n",
+ "1 d2 this is another document 1 first document 0 \n",
+ "2 d3 the topic of this document is unknown 1 first document 0 \n",
+ "\n",
+ " rank score system \n",
+ "0 1 46.084885 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "1 2 40.802025 multi-qa-MiniLM-L6-dot-v1-dot \n",
+ "2 3 37.294750 multi-qa-MiniLM-L6-dot-v1-dot "
+ ]
+ },
+ "execution_count": 5,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Chain BM25 with the TIRA re-ranker: `>>` passes the BM25 output to the second stage\n",
+ "advanced_pipeline = bm25 >> tira.pt.reranker('ir-benchmarks/tira-ir-starter/SBERT multi-qa-MiniLM-L6-dot-v1 (tira-ir-starter-beir)')\n",
+ "\n",
+ "advanced_pipeline(data_to_rerank)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.9"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}