Skip to content

Main repository for "Antimicrobial drug recommendation from MALDI-TOF MS with statistical guarantees using conformal prediction"

Notifications You must be signed in to change notification settings

BorgwardtLab/ConformalAMR

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ConformalAMR

This repository contains the code used to conduct experiments presented in the paper Antimicrobial drug recommendation from MALDI-TOF mass spectrometry with statistical guarantees using conformal prediction, by Nina Corvelo Benz, Lucas Miranda, Dexiong Chen, Janko Sattler, and Karsten Borgwardt.

Abstract

Antimicrobial resistance (AMR) is a global health challenge, complicating the treatment of bacterial infections and leading to higher patient morbidity and mortality. Rapid and reliable identification of resistant pathogens is crucial to guide early and effective therapeutic interventions. However, traditional culture-based methods are time-consuming, highlighting the need for faster predictive approaches. Machine learning models trained on MALDI-TOF mass spectrometry data, readily collected in most clinics for fast species identification, offer promise but face limitations in clinical applicability, particularly due to their lack of comprehensive, statistically valid uncertainty estimates. Here, we introduce a novel AMR prediction framework that addresses this gap with a novel knowledge graphenhanced conformal predictor. Conformal prediction (CP) constructs prediction sets with statistical coverage guarantees, ensuring that bacterial resistance to a certain antibiotic is flagged with a specified error rate.

Conformal framework

Our proposed conformal predictor constructs improved prediction sets over standard CP approaches by using a knowledge graph capturing the interdependencies in antibiotic resistance patterns.

Conformal performance

In addition, we introduce a novel classifier framework that improves upon previous multimodal models by incorporating multigraph-based antibiotic representations using state-of-the-art self-supervised methods. Besides increasing resistance detection for most tested species-drug combinations, the presented architecture, termed ResMLP-GNN, overcomes the limitations of previous efforts and supports multi-drug antibiotics that are highly relevant in clinical practice. We successfully evaluated our approach on a set of highly-relevant antibiotics, commonly used in clinics to treat infections with Klebsiella pneumoniae and Escherichia coli

ROC and PR curves

Installation

We provide pyproject.toml and poetry.lock files for reproducible installation of the environment used in our experiments using poetry. We recommend doing this within a conda/mamba isolated environment, running Python^=3.11. To install the environment, just run the following commands (assuming you have poetry up and running):

mamba create -n ConformalAMR python=3.11
poetry install

This should install both the required dependencies, and the conformal_amr package as such.

Training base ResAMR-GNN models

To train the base ResAMR-GNN models, you can run the following command:

poetry run python scripts/train_ResAMR_classifier.py --driams_dataset A --driams_long_table data/DRIAMS_combined_long_table_multidrug.csv --drugs_df data/DRIAMS_Mole-BERT_drug_embeddings.csv --spectra_matrix path/to/spectra/DRIAMS-A/spectra_binned_6000_all_multidrug.npy --output_folder results

Where A can be replaced by B or C or D to train on the respective DRIAMS datasets, and the spectra matrix is generated using the notebooks/Process_DRIAMS_data.ipynb.py notebook, upon downloading the DRIAMS dataset.

Finetuning ResMLP-GNN models

To finetune a base ResMLP-GNN model on a fixed species-drug combination, you can run the following command:

poetry run python scripts/finetune_ResAMR_classifier.py --pretrained_checkpoints_folder /path/to/pretrained/base/models --driams_dataset A --driams_long_table data/DRIAMS_combined_long_table_multidrug.csv --drugs_df data/DRIAMS_Mole-BERT_drug_embeddings.csv --spectra_matrix path/to/spectra/DRIAMS-A/spectra_binned_6000_all_multidrug.npy --splits_file path/to/splits/file --n_epochs 150 --output_folder results --species_drug_combination 'Escherichia coli_Ceftriaxone' --freeze_drug_emb
    

Where A can be replaced by B or C or D to train on the respective DRIAMS datasets, the pretrained checkpoints and the splits files are generated after training the base models, and 'Escherichia coli_Ceftriaxone' can be replaced by any other species-drug combination of interest.

Evaluating the conformal predictor described in the paper

To evaluate knowledge-graph enhanced conformal predictor with the hyperparameters described in the paper, you can simply run:

poetry run python scripts/evaluate_conformal.py

About

Main repository for "Antimicrobial drug recommendation from MALDI-TOF MS with statistical guarantees using conformal prediction"

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published