fanglab/LongTrack

LongTrack

Description

LongTrack is a novel framework that uses long-read metagenomic assemblies and reliable informatics tailored for strain tracking after fecal microbiota transplantation (FMT). The core idea of LongTrack has three parts: (1) using long-read metagenomic sequencing data from the donors' and recipients' pre-FMT samples to construct de novo metagenome-assembled genomes (long-read MAGs), (2) selecting strain-specific unique k-mers from the long-read MAGs, and (3) using the unique k-mers together with short-read metagenomic data for precise strain tracking.
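To make step (2) concrete, here is a minimal sketch of what "strain-specific unique k-mers" means: the k-mers that occur in one genome and in none of the others. This is an illustration only, not LongTrack's implementation. The demo data indicate that LongTrack builds its k-mer databases with KMC at k=31, whereas this toy uses in-memory Python sets, a small k, and made-up sequences. The function names and the use of canonical k-mers (the lexicographically smaller of a k-mer and its reverse complement) are assumptions made for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def canonical_kmers(seq, k):
    """Collect canonical k-mers, skipping any window with non-ACGT characters."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


def unique_kmers(genomes, k):
    """Map each genome name to the k-mers found in it and in no other genome."""
    kmer_sets = {name: canonical_kmers(seq, k) for name, seq in genomes.items()}
    unique = {}
    for name, kmers in kmer_sets.items():
        others = set()
        for other_name, other_kmers in kmer_sets.items():
            if other_name != name:
                others |= other_kmers
        unique[name] = kmers - others
    return unique


# Hypothetical toy sequences that differ by a single base.
toy_genomes = {
    "strain_A": "ACGTTGCAAGGC",
    "strain_B": "ACGTTGCATGGC",
}
result = unique_kmers(toy_genomes, k=5)
for name in sorted(result):
    print(name, sorted(result[name]))
```

Only k-mers overlapping the differing base can be unique, and even some of those are shared once reverse complements are taken into account. A short k is used here for readability; a longer k such as 31 makes chance matches between unrelated genomes far less likely.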

Installation

Fundamental dependencies

  1. Python version 2.7.16
  2. Bowtie 2 version 2.2.8

Python packages

  1. numpy >= 1.7.1
  2. HTSeq >= 0.5.3p9
  3. matplotlib >= 1.0.0
  4. seaborn >= 0.5.0
  5. pandas >= 0.7.3

Tool showcase

To showcase the toolbox, we provide a demonstration (about 5 minutes in total) that combines two major steps: 1) an illustrative run that performs strain tracking for 5 long-read MAGs across 3 post-FMT samples; 2) a summary of the strain tracking results.

Running the LongTrack demo

STEP 1: Download example data from Zenodo

The example data, LongTrack_test_data.zip, can be downloaded from Zenodo at https://zenodo.org/records/14765650

STEP 2: Running the LongTrack demo

Prepare for LongTrack_demo.sh

unzip LongTrack_test_data.zip
mv Data LongTrack/

Add the directories containing python2 and bowtie2 to your $PATH environment variable

module load python/2.7.16 bowtie2

or

export PATH=$PATH:[bowtie2_path]

Running LongTrack_demo.sh

cd LongTrack/code
sh LongTrack_demo.sh

Explanation of inputs in LongTrack_demo.sh

Inputs

  1. MAG: This folder contains the long-read MAGs (.fna) assembled de novo from the donors, together with the k-mer (k=31) database for each MAG (_kmcdb), generated with KMC v3.1.0.
Akkermansia_muciniphila_D1.fna
Akkermansia_muciniphila_D1_kmcdb_dump
Akkermansia_muciniphila_D1_kmcdb.kmc_pre
Akkermansia_muciniphila_D1_kmcdb.kmc_suf
…
  2. metagenome: This folder contains the short-read metagenomic data from post-FMT recipients at 3 time points, plus unrelated samples used as negative controls (NC1 and NC2). The data are paired-end: *_sample_PE1.fasta and *_sample_PE2.fasta.
NC1_sample_PE1.fasta
NC1_sample_PE2.fasta
postFMT1W4_sample_PE1.fasta
postFMT1W4_sample_PE2.fasta
…
  3. unique_kmer: This folder contains the unique k-mers selected from each long-read MAG.
Akkermansia_muciniphila_D1_kmcdb_dump_withpos
…
  4. conflict_table: This file lists, for each sample, the samples it conflicts with (samples with no relationship to it). For example, the negative controls conflict with every other sample. These no-relationship samples are used to calculate confidence scores.
postFMT1W4  	NC1,NC2
postFMT1W8  	NC1,NC2
postFMT1Y5  	NC1,NC2
NC1 	NC2,postFMT1W4,postFMT1W8,postFMT1Y5
NC2 	NC1,postFMT1W4,postFMT1W8,postFMT1Y5
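The conflict table layout shown above (a sample name, whitespace, then a comma-separated list of conflicting samples) can be read with a few lines of Python. The sketch below is a standalone illustration, not code from LongTrack. The function names are hypothetical, and the symmetry check reflects an assumption that if sample A lists B as a conflict, B should list A too. The demo table satisfies this, but the source does not state it as a requirement.

```python
def parse_conflict_table(lines):
    """Parse 'sample <whitespace> conflict1,conflict2,...' lines into a dict."""
    conflicts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        sample, partners = line.split(None, 1)
        conflicts[sample] = [p.strip() for p in partners.split(",") if p.strip()]
    return conflicts


def asymmetric_pairs(conflicts):
    """Return (a, b) pairs where a lists b as a conflict but b does not list a."""
    missing = []
    for sample, partners in conflicts.items():
        for partner in partners:
            if sample not in conflicts.get(partner, []):
                missing.append((sample, partner))
    return missing


# The demo conflict table, copied from the listing above.
demo_lines = [
    "postFMT1W4  \tNC1,NC2",
    "postFMT1W8  \tNC1,NC2",
    "postFMT1Y5  \tNC1,NC2",
    "NC1 \tNC2,postFMT1W4,postFMT1W8,postFMT1Y5",
    "NC2 \tNC1,postFMT1W4,postFMT1W8,postFMT1Y5",
]
conflicts = parse_conflict_table(demo_lines)
print(conflicts["postFMT1W4"])
print(asymmetric_pairs(conflicts))
```

Running a check like this before the demo can catch a misspelled sample name in a hand-edited conflict table, since the misspelled entry would show up as an asymmetric pair.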

Outputs

Once the script above completes, the following files and figures are generated in the folders described below.

  1. Strain tracking table: Tracking_results/results_readdistribution_actualreads_confidencescores. Presence (1) or absence (0) of each long-read MAG in the post-FMT samples collected at different time points and in the negative controls.
strain	NC1	NC2	postFMT1W4	postFMT1W8	postFMT1Y5
Akkermansia_muciniphila_D1  	0	0	1	1	1
Alistipes_onderdonkii_D1    	0	0	1	1	1
Bifidobacterium_longum.D1.str1    	0	0	1	1	1
Bifidobacterium_longum.D1.str2   	0	0	1	1	1
Gemmiger_formicilis_D1  	0	0	1	1	1
  2. Strain tracking heatmap: Tracking_results/Strain_tracking_results.png. Presence (green) or absence (gray) of each strain in the post-FMT recipients, as determined by strain-specific unique k-mers from the long-read MAGs.
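For downstream analysis, the tab-separated strain tracking table can be loaded and queried directly. The sketch below is a hedged, standalone example and not part of LongTrack. It embeds the demo table shown above as a string, and it assumes that negative controls can be recognized by the "NC" prefix used in the demo sample names. Both function names are hypothetical.

```python
def parse_tracking_table(text):
    """Return (sample_names, {strain: {sample: 0 or 1}}) from tab-separated text."""
    rows = [line.split("\t") for line in text.splitlines() if line.strip()]
    samples = [cell.strip() for cell in rows[0][1:]]
    calls = {}
    for row in rows[1:]:
        strain = row[0].strip()  # demo rows carry trailing spaces before the tab
        calls[strain] = dict(zip(samples, (int(v) for v in row[1:])))
    return samples, calls


def consistently_tracked(samples, calls, control_prefix="NC"):
    """Strains present in every non-control sample and absent from every control."""
    tracked = []
    for strain, row in calls.items():
        in_controls = [row[s] for s in samples if s.startswith(control_prefix)]
        in_recipients = [row[s] for s in samples if not s.startswith(control_prefix)]
        if all(v == 1 for v in in_recipients) and all(v == 0 for v in in_controls):
            tracked.append(strain)
    return sorted(tracked)


# The demo output table, copied from the listing above.
DEMO_TABLE = (
    "strain\tNC1\tNC2\tpostFMT1W4\tpostFMT1W8\tpostFMT1Y5\n"
    "Akkermansia_muciniphila_D1  \t0\t0\t1\t1\t1\n"
    "Alistipes_onderdonkii_D1    \t0\t0\t1\t1\t1\n"
    "Bifidobacterium_longum.D1.str1    \t0\t0\t1\t1\t1\n"
    "Bifidobacterium_longum.D1.str2   \t0\t0\t1\t1\t1\n"
    "Gemmiger_formicilis_D1  \t0\t0\t1\t1\t1\n"
)
samples, calls = parse_tracking_table(DEMO_TABLE)
tracked = consistently_tracked(samples, calls)
print(len(tracked), "of", len(calls), "strains detected at every post-FMT time point")
```

To apply this to a real run, read the text from Tracking_results/results_readdistribution_actualreads_confidencescores instead of the embedded string, and adjust `control_prefix` if your negative controls are named differently.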
