BioASQ 2020 Challenge

This repository contains our team's solution for the BioASQ 2020 challenge, a large-scale biomedical semantic indexing and question answering competition.

Challenge Overview

BioASQ is a challenge that aims to push the frontiers of large-scale biomedical semantic indexing and question answering systems. It consists of several tasks; our team participated in Task 8b, which covers biomedical question answering.

Repository Structure

The repository is organized into several main directories:

data/: Contains the dataset files used for training and testing.
task_factoid/: Code and notebooks for the factoid question answering task.
task_list/: Code and notebooks for the list question answering task.
task_yesno/: Code and notebooks for the yes/no question answering task.
utils/: Utility functions and helper scripts used across different tasks.
transformers_models/: Pre-trained models used in the project.

Tasks

We addressed three main types of questions in the BioASQ challenge:

Factoid Questions: Questions answered by a specific fact or short phrase.
List Questions: Questions answered by a list of items.
Yes/No Questions: Binary questions answered with either "yes" or "no".
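
As an illustration of how the three question types can be separated before training, here is a minimal sketch. It assumes the public BioASQ Task b JSON layout (a top-level "questions" list whose items carry "id", "type", and "body" fields); the sample questions and the group_by_type helper are made up for this example and are not taken from the repository.

```python
import json
from collections import defaultdict

# Assumption: BioASQ Task b JSON layout with a top-level "questions" list.
# The questions below are invented purely for illustration.
SAMPLE = """
{"questions": [
  {"id": "q1", "type": "factoid", "body": "Which gene is mutated in cystic fibrosis?"},
  {"id": "q2", "type": "list", "body": "Which drugs are proton pump inhibitors?"},
  {"id": "q3", "type": "yesno", "body": "Is aspirin an NSAID?"},
  {"id": "q4", "type": "summary", "body": "What is CRISPR?"}
]}
"""


def group_by_type(dataset, wanted=("factoid", "list", "yesno")):
    """Return {question type: [question dicts]}, keeping only the wanted types."""
    groups = defaultdict(list)
    for question in dataset["questions"]:
        if question["type"] in wanted:
            groups[question["type"]].append(question)
    return dict(groups)


groups = group_by_type(json.loads(SAMPLE))
for qtype, questions in groups.items():
    print(qtype, [q["id"] for q in questions])
```

Each group could then be passed to the code in the matching task directory (task_factoid/, task_list/, task_yesno/).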

Approach

Our approach to solving these tasks involved:

Data Preprocessing: We used various NLP techniques to clean and prepare the biomedical text data.
Embeddings: We utilized pre-trained biomedical embeddings, including BioBERT and ELMo, to represent the text data.
Model Architecture: We implemented deep learning models using TensorFlow and Keras, with BERT-based architectures for factoid and list questions, and a custom neural network for yes/no questions.
Training: We fine-tuned our models on the BioASQ dataset, using techniques like early stopping to prevent overfitting.
Evaluation: We used various metrics to evaluate our models, including strict accuracy, lenient accuracy, and mean reciprocal rank for factoid and list questions, and standard classification metrics for yes/no questions.
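
To make the ranking metrics named in the Evaluation step concrete, here is a minimal sketch of strict accuracy, lenient accuracy, and mean reciprocal rank for factoid-style answers. It is not the code in utils/evaluate.py: the function name, the case-insensitive matching, and the top_k cutoff of 5 candidates are assumptions made for illustration.

```python
def factoid_metrics(predictions, gold, top_k=5):
    """Compute strict accuracy, lenient accuracy, and MRR.

    predictions: one ranked list of candidate answers per question.
    gold: one set of acceptable answers (including synonyms) per question.
    top_k: only the first top_k candidates are considered (assumed cutoff).
    """
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and aligned")

    strict = lenient = reciprocal_rank_sum = 0.0
    for ranked, answers in zip(predictions, gold):
        accepted = {a.strip().lower() for a in answers}
        # 1-based rank of the first correct candidate, or None if there is none.
        rank = next(
            (i for i, candidate in enumerate(ranked[:top_k], start=1)
             if candidate.strip().lower() in accepted),
            None,
        )
        if rank is not None:
            lenient += 1                      # correct answer anywhere in the list
            reciprocal_rank_sum += 1.0 / rank
            if rank == 1:
                strict += 1                   # correct answer ranked first

    n = len(gold)
    return {
        "strict_acc": strict / n,
        "lenient_acc": lenient / n,
        "mrr": reciprocal_rank_sum / n,
    }


# Toy example: question 1 is correct at rank 1, question 2 at rank 2, question 3 is missed.
metrics = factoid_metrics(
    predictions=[["CFTR", "ABC"], ["x", "aspirin"], ["foo"]],
    gold=[{"CFTR"}, {"Aspirin"}, {"bar"}],
)
print(metrics)
```

Strict accuracy only rewards a correct top-ranked candidate, lenient accuracy rewards a correct candidate anywhere in the list, and MRR sits between the two by weighting each hit by the inverse of its rank.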

Key Files

task_factoid/train_factoid.py: Main script for training the factoid question answering model.
task_list/functions_list.py: Contains functions for the list question answering task.
task_yesno/notebooks/simple_embeddings.ipynb: Notebook demonstrating the use of embeddings for yes/no questions.
utils/data.py: Utility functions for data loading and preprocessing.
utils/evaluate.py: Functions for evaluating model performance.

Requirements

The project requires several Python libraries, which are listed in the requirements.txt file. Key dependencies include:

TensorFlow
PyTorch
Transformers
Flair
NLTK
NumPy
Pandas
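
Before training, it can help to confirm that the libraries above are importable in the active environment. The following is a small sketch using only the standard library; the missing_modules helper is hypothetical, and the names in REQUIRED_MODULES are the usual import names of these packages (PyTorch is imported as torch).

```python
import importlib.util

# Import names corresponding to the key dependencies listed above.
REQUIRED_MODULES = [
    "tensorflow", "torch", "transformers", "flair", "nltk", "numpy", "pandas",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All key dependencies are importable.")
```

This only checks that each module can be located; it does not verify that the installed versions match requirements.txt.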

Usage

To run the models:

Install the required dependencies: pip install -r requirements.txt
Prepare the data by running the data preprocessing scripts in the utils/ directory.
Train the models using the respective training scripts in each task directory.
Evaluate the models using the provided evaluation functions.
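
For the final evaluation step on yes/no questions, the sketch below shows one possible pair of classification metrics, accuracy and macro-averaged F1. The README only says "standard classification metrics", so this particular choice, as well as the yesno_metrics function itself, is an assumption and does not reproduce the functions in utils/evaluate.py.

```python
def yesno_metrics(predicted, gold):
    """Return accuracy and macro-averaged F1 over the labels "yes" and "no"."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and aligned")

    pairs = list(zip(predicted, gold))
    accuracy = sum(p == g for p, g in pairs) / len(pairs)

    f1_scores = []
    for label in ("yes", "no"):
        tp = sum(p == label and g == label for p, g in pairs)
        fp = sum(p == label and g != label for p, g in pairs)
        fn = sum(p != label and g == label for p, g in pairs)
        denominator = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        f1_scores.append(2 * tp / denominator if denominator else 0.0)

    return {"accuracy": accuracy, "macro_f1": sum(f1_scores) / len(f1_scores)}


# Toy example with four questions, one of them answered incorrectly.
scores = yesno_metrics(
    predicted=["yes", "yes", "no", "yes"],
    gold=["yes", "no", "no", "yes"],
)
print(scores)
```

Macro-averaging gives the "yes" and "no" classes equal weight, which matters when one answer is much more frequent than the other in the data.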

Results

Our models achieved competitive results on the BioASQ 2020 challenge. Detailed performance metrics and analysis can be found in the PDF summary in the main folder (BioASQ_2020__a_biomedical_Question_Answering_System.pdf).

Acknowledgements

We would like to thank the BioASQ organizers for providing this challenging and important competition in the field of biomedical text mining and question answering.

fsabiu/BioASQ2020
