GitHub - gipplab/math-content-similarity: Experiments with LLMs for finding math content similarity

About

This repository contains the resources used for the RecSys 2024 submission "Towards Better STEM Recommendations: A Gold-Standard Dataset with Math content".

Install Dependencies

Install the dependencies needed by the scripts with the command below. Creating a virtual environment first is recommended.

pip install -r requirements.txt

Dataset

The documents are obtained via the zbMATH Open API: https://api.zbmath.org/v1/. The document IDs can be obtained from the Zenodo record: https://zenodo.org/records/5062959

To obtain the metadata of all documents (title, abstract, keywords, MSCs, citations, author names, etc.), run

python src/hybrid/feature_simil/getDataset.py
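For orientation, the kind of request involved in collecting metadata can be sketched as follows. This is a minimal illustration and not the code in getDataset.py. Only the base URL comes from this README; the `document/<id>` path, the helper names, and the field names passed to `pick_fields` are assumptions that should be checked against the zbMATH Open API documentation.

```python
import json
import urllib.request

API_ROOT = "https://api.zbmath.org/v1/"  # base URL given in this README


def document_url(doc_id, api_root=API_ROOT):
    """Build the URL for one document record.

    Assumption: records are served under `document/<id>`; verify this
    path in the zbMATH Open API documentation before relying on it.
    """
    return f"{api_root.rstrip('/')}/document/{doc_id}"


def fetch_document(doc_id, timeout=30):
    """Download and decode the JSON response for one document ID."""
    with urllib.request.urlopen(document_url(doc_id), timeout=timeout) as resp:
        return json.load(resp)


def pick_fields(record, wanted=("title", "keywords", "msc")):
    """Keep selected top-level keys; the key names here are hypothetical."""
    return {key: record.get(key) for key in wanted}
```

Calling `fetch_document` with an ID taken from the Zenodo record would return the decoded JSON response, which may nest the metadata under further keys; inspect one response before choosing which fields to keep.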

Evaluation results

Baseline models (Table 2)

To get the evaluation results of the baseline models, follow the steps described in:

reproducing_results/Baseline/

Initial Ranker candidates (Table 3)

To get the evaluation results of the Initial Ranker, follow the steps described in:

reproducing_results/InitialRanker/

Re-ranker candidates (Table 4)

To get the evaluation results of the Re-ranker, follow the steps described in:

reproducing_results/Re-ranker/

User study (HyMathRec Evaluation, Table 5 and Table 6)

The annotations from the four annotators are available in the folder

src/hybrid/originalAnno/

To calculate Kappa scores and evaluation scores (Precision, Recall, F1, MRR, nDCG), run

python src/hybrid/userAnnoEval.py
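As a reference for what these scores measure, here is a minimal sketch of the standard definitions. It is not the code in userAnnoEval.py, which may use different cutoffs or variants (for example, a multi-rater kappa for the four annotators). All function names, the cutoff `k`, and the choice of Cohen's kappa for a single pair of annotators are assumptions made for illustration.

```python
import math
from collections import Counter


def precision_recall_f1(ranked, relevant, k):
    """Precision, recall and F1 over the top-k items of one ranked list."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item; MRR is the mean over queries."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0


def ndcg(ranked, gains, k):
    """nDCG@k, where `gains` maps each document to its graded relevance."""
    dcg = sum(gains.get(doc, 0) / math.log2(i + 1)
              for i, doc in enumerate(ranked[:k], start=1))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 1) for i, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if not labels_a or len(labels_a) != len(labels_b):
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both pick the same label at random.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 1.0
```

Averaging `reciprocal_rank` over all seed documents gives MRR, and averaging `cohen_kappa` over annotator pairs is one simple way to summarize agreement among more than two annotators.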
