PLUGINS.md

SYNOPSIS

SneakerNet is a set of plugins. Each plugin runs a distinct analysis on a MiSeq run and accepts specific flags when it is run. Therefore, any given plugin can run independently of the others (aside from any prerequisite files, e.g., genome assemblies from a previous plugin).

Workflows

SneakerNet workflows define the order in which plugins run. They help resolve dependencies, for example by ensuring that genome assemblies are present before they are analyzed, or that a report is generated only after all other plugins have created their outputs.

Workflows are defined in plugins.conf.

The exact order of plugins for all workflows can be found by running the command SneakerNet.checkdeps.pl --list.

To make your own custom workflow, edit config/plugins.conf. Plugins are run in the order specified for any given workflow. For example:

default = pluginA.pl, pluginB.pl, pluginZ.pl

In this example of the default workflow, if pluginB depends on pluginZ, you would want to change the order so that pluginB runs after pluginZ:

default = pluginA.pl, pluginZ.pl, pluginB.pl
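
The ordering rule above can be checked mechanically. The following is a minimal sketch, assuming only the simple `name = plugin, plugin, ...` line format shown in the example; the helpers `parse_workflows` and `order_violations` and the `requires` mapping are hypothetical and are not part of SneakerNet.

```python
def parse_workflows(text):
    """Parse 'name = pluginA.pl, pluginB.pl' lines into {name: [plugins]}."""
    workflows = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not 'name = ...'
        name, _, plugins = line.partition("=")
        workflows[name.strip()] = [p.strip() for p in plugins.split(",") if p.strip()]
    return workflows


def order_violations(plugins, requires):
    """Return (plugin, prerequisite) pairs whose prerequisite does not run earlier."""
    position = {plugin: i for i, plugin in enumerate(plugins)}
    violations = []
    for plugin, prereqs in requires.items():
        if plugin not in position:
            continue
        for prereq in prereqs:
            # A prerequisite that is missing or scheduled later is a violation.
            if prereq not in position or position[prereq] > position[plugin]:
                violations.append((plugin, prereq))
    return violations


# Hypothetical dependency for illustration: pluginB.pl needs output from pluginZ.pl.
requires = {"pluginB.pl": ["pluginZ.pl"]}

before = parse_workflows("default = pluginA.pl, pluginB.pl, pluginZ.pl")
after = parse_workflows("default = pluginA.pl, pluginZ.pl, pluginB.pl")

print(order_violations(before["default"], requires))  # [('pluginB.pl', 'pluginZ.pl')]
print(order_violations(after["default"], requires))   # []
```

The first ordering reports a violation because pluginZ.pl is scheduled after the plugin that needs it; the reordered line reports none.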

Default

This workflow runs most plugins and assumes that you have some flavor of Illumina (MiSeq, HiSeq, MiniSeq).

Ion Torrent

This workflow runs plugins designed for Ion Torrent data.

Metagenomics

For metagenomics runs.

Assembly

For assembly-only runs (i.e., the folder contains only assemblies and no raw reads).

sarscov2

For running the SARS-CoV-2 workflow. Its plugins are prefixed with sn_sarscov2_ (e.g., sn_sarscov2_assembleAll.pl in the catalog below).

Workflows and their plugins

Workflows in SneakerNet are defined by which plugins are run and in which order. You can run SneakerNet.checkdeps.pl --list to see which plugins are run and in which order, for any workflow. This script pulls from config/plugins.conf. Below is the output for SneakerNet version 0.11.2. By default, only default will be run in a SneakerNet analysis, but other workflows are available. The pseudo-workflow all is an alphabetical listing of all available plugins.

$ SneakerNet.checkdeps.pl --list
    all
    addReadMetrics.pl, assembleAll.pl, baseBalance.pl, emailWhoever.pl, sn_assemblyWorkflow_init.pl, sn_crypto_assembleAll.pl, sn_crypto_gp60.pl, sn_detectContamination-kraken.pl, sn_detectContamination-mlst.pl, sn_detectContamination.pl, sn_helloWorld.pl, sn_helloWorld.py, sn_helloWorld.sh, sn_immediateStatus.pl, sn_iontorrent_assembleAll.pl, sn_kraken-metagenomics.pl, sn_kraken.pl, sn_mlst.pl, sn_mlst-wg.pl, sn_parseSampleSheet.pl, sn_passfail.pl, sn_report.pl, sn_SalmID.pl, sn_saveFailedGenomes.pl, sn_staramr.pl, transferFilesToRemoteComputers.pl

    assembly
    sn_assemblyWorkflow_init.pl, sn_mlst.pl, sn_staramr.pl, sn_passfail.pl, sn_kraken.pl, sn_detectContamination-kraken.pl, sn_report.pl, emailWhoever.pl

    cryptosporidium
    sn_parseSampleSheet.pl, addReadMetrics.pl, sn_crypto_assembleAll.pl, sn_mlst.pl, sn_kraken.pl, sn_detectContamination-kraken.pl, sn_passfail.pl, transferFilesToRemoteComputers.pl, emailWhoever.pl

    default
    sn_parseSampleSheet.pl, addReadMetrics.pl, assembleAll.pl, sn_mlst.pl, sn_kraken.pl, sn_detectContamination-kraken.pl, sn_detectContamination-mlst.pl, baseBalance.pl, sn_staramr.pl, sn_passfail.pl, transferFilesToRemoteComputers.pl, sn_report.pl, emailWhoever.pl

    iontorrent
    addReadMetrics.pl, sn_iontorrent_assembleAll.pl, sn_mlst.pl, sn_kraken.pl, sn_detectContamination-kraken.pl, sn_passfail.pl, sn_staramr.pl, transferFilesToRemoteComputers.pl, emailWhoever.pl

    metagenomics
    sn_parseSampleSheet.pl, addReadMetrics.pl, sn_kraken.pl, sn_kraken-metagenomics.pl, sn_passfail.pl, sn_report.pl
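
If you want to work with this listing in a script, it can be turned into a dictionary. This is a rough sketch that assumes the output looks exactly like the listing above (a workflow name on one line, its comma-separated plugins on the next); `parse_list_output` is a hypothetical helper, and the sample text is copied from the output shown here rather than produced by running the command.

```python
def parse_list_output(text):
    """Map each workflow name to its ordered list of plugins."""
    lines = [line.strip() for line in text.splitlines()]
    # Drop blank lines and the echoed shell prompt line.
    lines = [line for line in lines if line and not line.startswith("$")]
    workflows = {}
    # Remaining lines alternate: workflow name, then its plugin list.
    for name, plugins in zip(lines[0::2], lines[1::2]):
        workflows[name] = [p.strip() for p in plugins.split(",")]
    return workflows


sample = """\
$ SneakerNet.checkdeps.pl --list
    assembly
    sn_assemblyWorkflow_init.pl, sn_mlst.pl, sn_staramr.pl, sn_passfail.pl, sn_kraken.pl, sn_detectContamination-kraken.pl, sn_report.pl, emailWhoever.pl

    metagenomics
    sn_parseSampleSheet.pl, addReadMetrics.pl, sn_kraken.pl, sn_kraken-metagenomics.pl, sn_passfail.pl, sn_report.pl
"""

workflows = parse_list_output(sample)
for name, plugins in workflows.items():
    print(f"{name}: {len(plugins)} plugins, last is {plugins[-1]}")
# assembly: 8 plugins, last is emailWhoever.pl
# metagenomics: 6 plugins, last is sn_report.pl
```

In practice the text would come from capturing the command's standard output; the format may differ between SneakerNet versions, so treat the parsing rules as an assumption.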

Diagram of default workflow

graph TD
    A(reads) -->|sn_parseSampleSheet.pl| B(raw reads, samples.tsv)
    B -->|addReadMetrics.pl| C(read metrics)
    B -->|assembleAll.pl| D(assembly, assembly metrics)
    D -->|sn_mlst.pl| E(7-gene MLST results)
    B -->|sn_kraken.pl| F[Kraken results]
    F -->|sn_detectContamination-kraken.pl| G(Kraken contamination report)
    B -->|sn_detectContamination-mlst.pl| H(MLST contamination report)
    B -->|baseBalance.pl| I(base balance report)
    D -->|sn_staramr.pl| J(AMR results)
    G -->|sn_passfail.pl| K(pass or fail results)
    C -->K
    D -->K
    K -->|transferFilesToRemoteComputers.pl| L(transfer report)
    K -->|sn_report.pl| M(sneakernet report)
    M -->|emailWhoever.pl| N(email with SN report)
    B -->M 
    C -->M
    D -->M
    E -->M
    F -->M 
    G -->M 
    H -->M 
    I -->M 
    J -->M 

Diagram of assembly workflow

graph TD
    A(assemblies) --> |sn_assemblyWorkflow_init.pl| B(Copies of assemblies)
    A --> |sn_assemblyWorkflow_init.pl| C(Gene predictions)
    B --> I(genome metrics)
    C --> I
    B --> |sn_mlst.pl| D(Sequence types)
    B --> |sn_staramr.pl| E(AMR profiles)
    B --> |sn_kraken.pl| F(Kraken results)
    F --> |sn_detectContamination-kraken.pl| G(Contamination results)
    G --> |sn_passfail.pl| H(passfail)
    I --> |sn_passfail.pl| H
    H --> |sn_report.pl| J(SN report)
    F --> J
    G --> J
    D --> J
    E --> J
    I --> J

Command line

Each plugin can accept the following options. The first positional parameter must be the SneakerNet run.

| Flag | Default value | Description |
|------|---------------|-------------|
| --help | | Generate a help menu |
| --numcpus | 1 | Parallelization |
| --debug | | Generate more messages or any other debugging |
| --tempdir | automatically generated, e.g., with File::Temp or mktemp | Where temporary files are located |
| --force | | This is loosely defined but can be used for many things like overwriting output files |
| --version | | Print a version in the format of X.Y or X.Y.Z |
| --citation | | Print a citation statement. |
| --check-dependencies | | Check all executable dependencies. Print executable dependencies to stdout and version information to stderr. Run SneakerNet.checkdeps.pl to check dependencies on all plugins. |
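
As an illustration of this interface, here is a minimal Python sketch of a plugin's argument handling using the standard `argparse` module. It is not the code of sn_helloWorld.py or any other shipped plugin; the names `parse_plugin_args` and `run_dir`, the description string, and the version string are placeholders chosen for the example.

```python
import argparse
import tempfile


def parse_plugin_args(argv):
    """Parse the common plugin flags; the first positional is the run."""
    # argparse supplies --help automatically.
    parser = argparse.ArgumentParser(description="Hypothetical SneakerNet-style plugin")
    parser.add_argument("run_dir", help="the SneakerNet run (first positional parameter)")
    parser.add_argument("--numcpus", type=int, default=1, help="parallelization")
    parser.add_argument("--debug", action="store_true", help="print more messages")
    parser.add_argument("--tempdir", default=None, help="where temporary files are located")
    parser.add_argument("--force", action="store_true", help="e.g., overwrite output files")
    # Placeholder version string in X.Y format.
    parser.add_argument("--version", action="version", version="0.1")
    parser.add_argument("--citation", action="store_true", help="print a citation statement")
    parser.add_argument("--check-dependencies", action="store_true",
                        help="print executable dependencies")
    args = parser.parse_args(argv)
    if args.tempdir is None:
        # Mirror the 'automatically generated' default with a fresh temp directory.
        args.tempdir = tempfile.mkdtemp(prefix="sn_plugin_")
    return args


args = parse_plugin_args(["my_run", "--numcpus", "4", "--tempdir", "/tmp/sn_example"])
print(args.run_dir, args.numcpus, args.force, args.check_dependencies)
# my_run 4 False False
```

Note that `argparse` exposes `--check-dependencies` as `args.check_dependencies`. Whether flags such as `--citation` should work without a run argument is left open here; this sketch always requires the positional.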

Catalog

Except for the legacy plugins, all plugins are prefixed with sn_. The plugins are not specific to any one language, although the majority are in Perl.

Contributions are welcome for the following plugin documents.

| Plugin | Description |
|--------|-------------|
| sn_SalmID.pl | Salmonella subspecies identification |
| sn_staramr.pl | staramr antimicrobial resistance determinant analysis |
| sn_passfail.pl | Table of pass/fail for each sample |
| sn_iontorrent_assembleAll.pl | Assembly for Ion Torrent data |
| addReadMetrics.pl | Raw read metrics |
| sn_helloWorld.pl | Example plugin in Perl |
| sn_helloWorld.sh | Example plugin in Bash |
| sn_helloWorld.py | Example plugin in Python |
| baseBalance.pl | Divides the count of As by Ts and of Cs by Gs to check that each ratio is close to 1 |
| sn_mlst.pl | Runs 7-gene MLST on assemblies |
| sn_mlst-wg.pl | Runs whole-genome MLST on assemblies |
| transferFilesToRemoteComputers.pl | Transfers files to a remote computer |
| sn_detectContamination.pl | Detects potential contamination by kmer counting |
| emailWhoever.pl | Emails all results |
| sn_detectContamination-mlst.pl | Runs 7-gene MLST on raw reads, checking for an abnormal number of alleles |
| sn_iontorrent_parseSampleSheet.pl | Turns the sample sheet for Ion Torrent into SneakerNet format |
| sn_immediateStatus.pl | Emails an immediate report |
| guessTaxon.pl | Runs metagenomics classifier to guess the taxon for a sample |
| sn_kraken.pl | Runs metagenomics classifier on raw reads, or on assemblies if reads are not present. This plugin performs no secondary analysis itself; that is left to other plugins, e.g., sn_detectContamination-kraken.pl. |
| sn_detectContamination-kraken.pl | Runs metagenomics classifier to guess the taxon for a sample and list at most a single major contaminant |
| sn_kraken-metagenomics.pl | Analyzes kraken results for a metagenomics sample |
| assembleAll.pl | Assembles Illumina data |
| sn_assemblyWorkflow_init.pl | For workflows that only have assembly data. Initializes the workflow so that other plugins can function properly. |
| sn_crypto_assembleAll.pl | Assembles Illumina data for Cryptosporidium |
| sn_crypto_gp60.pl | Provides the gp60 profile for Cryptosporidium |
| sn_parseSampleSheet.pl | Turns the sample sheet for Illumina into SneakerNet format |
| sn_report.pl | Creates an HTML report from all other plugins |
| sn_sarscov2_assembleAll.pl | Runs assembly for SARS-CoV-2 amplicon-based genomes |
| sn_assembleAll_reference.pl | Runs reference assembly |
| sn_saveFailedGenomes.pl | Saves genomes into the destination folder, into a QC_Fails subfolder |
| sn_cleanup.pl | Cleans intermediate files in a SN directory |
| sn_genotype.pl | Runs genotyping on samples, e.g., serogrouping |
| sn_genotype_escherichia.pl | Interprets genotyping on samples for Escherichia |
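
To make the baseBalance.pl description concrete, the A/T and C/G ratios can be sketched as below. This is only an illustration of the arithmetic, not the plugin's implementation: `base_balance` is a hypothetical helper that takes plain sequence strings and does no FASTQ parsing or quality handling.

```python
from collections import Counter


def base_balance(sequences):
    """Return the A/T and C/G count ratios across all given sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())

    def ratio(numerator, denominator):
        # Avoid division by zero when a base never occurs.
        if counts[denominator] == 0:
            return float("nan")
        return counts[numerator] / counts[denominator]

    return {"A/T": ratio("A", "T"), "C/G": ratio("C", "G")}


balanced = base_balance(["ACGTACGT", "AATTCCGG", "ATATGC"])
skewed = base_balance(["AAAT", "CG"])

print(balanced)  # {'A/T': 1.0, 'C/G': 1.0}
print(skewed)    # {'A/T': 3.0, 'C/G': 1.0}
```

A ratio far from 1, as in the second call, is the kind of imbalance the base balance report is meant to surface.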