Immune Epitope analyses

This repository contains scripts and processes for the identification of genes whose protein products contains epitope sequences identified by Immune Epitope Database and Analysis Resource (IEDB).

The analysis begin with processing of the epitopes taking a tab file containing the epitopes and reference proteome. We divide the peptides into 3 categories: very small peptides which require separate preprocessing of the reference genome, peptides annotated as matching the input taxon which we will allow up to one mismatch in the alignment, and the rest which we will do exact matching. We also retrieve the original protein sequences from genbank for those peptides which match the input taxon. Full peptide sequence are also matched against the reference proteome.

iedb PEPMatch (https://github.com/IEDB/PEPMatch) is used for peptide matching.

Get Started

To run the work the following dependencies need to be downloaded and installed.

Docker

https://docs.docker.com/engine/install/

Nextflow

curl https://get.nextflow.io | bash

Pull the GitHub with the command below

git pull https://github.com/VEuPathDB/antismash-nextflow.git
Then do > nextflow run main.nf

Alternatively the workflow can be run directly from the online which pull the repo and run it.

nextflow run VEuPathDB/antismash-nextflow -with-trace -c <config_file> -r main The -c <config_file> is the nextflow.config file, an example can found in the repo directory

Input Data

Example of input data can be found in the data directory with following required files

An epitope tab file containing the short epitope sequence from IEDB. See example in data/epitopes.tab
A file containing sequences for peptide source proteins. See example in data/peroepitopesGenpept.fasta

Output Data

The output of the workflow is an indexfile containing the location of the match peptide, indicating if they are exact matches, match the taxon, and the reference source protein. An example can be found in the output folder.

Nextflow workflow diagram

flowchart TB
    subgraph " "
    v0["taxonID"]
    v2["peptideTabfile"]
    v8["refFasta"]
    v9["peptideTabfile"]
    v11["fasta"]
    v16["fasta"]
    v21["fasta"]
    end
    subgraph epitopeMapping
    v1([fetchTaxon])
    v3([peptideProteinAccessionsFilteredByTaxa])
    v5([fetchProtein])
    v10([iedbExactMatches])
    subgraph smallExactPepMatch
    v12([preprocess])
    v13([match])
    v14([filterResults])
    end
    subgraph exactPepMatch
    v17([preprocess])
    v18([match])
    v19([filterResults])
    end
    subgraph inexactForTaxaPeptidesPepMatch
    v22([preprocess])
    v23([match])
    v24([filterResults])
    end
    v28([mergeResultsFiles])
    v30([indexResults])
    v4(( ))
    v6(( ))
    v15(( ))
    v20(( ))
    v25(( ))
    
    end
    v0 --> v1
    v1 --> v3
    v1 --> v10
    v2 --> v3
    v3 --> v13
    v3 --> v4
    v3 --> v15
    v3 --> v20
    v4 --> v5
    v5 --> v6
    v8 --> v10
    v9 --> v10
    v6 --> v10
    v10 --> v28
    v11 --> v12
    v12 --> v13
    v13 --> v14
    v14 --> v25
    v16 --> v17
    v17 --> v18
    v15 --> v18
    v18 --> v19
    v19 --> v25
    v21 --> v22
    v22 --> v23
    v20 --> v23
    v23 --> v24
    v24 --> v25
    v25 --> v28
    v28 --> v30

Name		Name	Last commit message	Last commit date
Latest commit History 157 Commits
bin		bin
conf		conf
data		data
output		output
subworkflows/local		subworkflows/local
workflows		workflows
Dockerfile		Dockerfile
Jenkinsfile		Jenkinsfile
LICENSE		LICENSE
README.md		README.md
main.nf		main.nf
nextflow.config		nextflow.config

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Immune Epitope analyses

About

Releases

Packages

Contributors 4

Languages

License

VEuPathDB/iedb-epitope-mapping-nextflow

Folders and files

Latest commit

History

Repository files navigation

Immune Epitope analyses

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages