GitHub - ASAGlab/MOI--An-integrated-solution-for-omics-analyses

Introduction

multiOmicsIntegrator (MOI) is a bioinformatics best-practice pipeline for the analysis of multi-omics data.

The pipeline is built using Nextflow version 23.04.2.5870 (IMPORTANT), a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It uses Docker/Singularity containers making installation trivial and results highly reproducible. The Nextflow DSL2 implementation of this pipeline uses one container per process which makes it much easier to maintain and update software dependencies. Where possible, these processes have been submitted to and installed from nf-core/modules in order to make them available to all nf-core pipelines, and to everyone within the Nextflow community!

Pipeline summary

- RNAseq analysis at the level of:
  - mRNAs
  - miRNAs
  - isoforms
- Functional annotation of transcripts
- Metabolomics analysis
- Proteomics analysis
- Integration of multi-omics data

Supplementary materials for this pipeline can be found in this Zenodo repository:

https://zenodo.org/records/10813721

General inputs and outputs

The MOI pipeline is organized into individual modules, each responsible for a specific step in the analysis workflow. This modular design makes it straightforward to incorporate new analysis techniques or custom implementations, and keeps the pipeline easy to maintain and scale.

MOI's behavior is controlled through the params yml files (for example params_genes.yml), each named after the analysis segment it governs. In these files the user specifies the input and output parameters and can optionally fine-tune settings such as algorithm selection and algorithm configuration.

The pipeline takes a single CSV file as input. This file contains either a single column of SRA codes or the directory where the fastq files are located, along with any other metadata for the samples. If the analysis starts from count matrices, the user can instead specify the directory of the feature matrix along with a phenotype file.
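As a rough illustration of the first input style, the sketch below writes a one-column CSV of SRA codes with Python's standard csv module. The header name `sample`, the function name, and any accession strings you pass in are placeholders chosen for this example, not the pipeline's actual schema; check the usage.md files under docs for the columns MOI expects.

```python
import csv
from pathlib import Path


def write_sra_samplesheet(path, sra_codes, header="sample"):
    """Write a single-column CSV: one header row, then one SRA code per row.

    NOTE: `header` is a hypothetical column name used only for illustration;
    it is not taken from the MOI documentation.
    """
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([header])
        for code in sra_codes:
            writer.writerow([code])
    return path
```

Calling `write_sra_samplesheet("samplesheet.csv", my_codes)` with your own list of SRA codes produces a file that can then be adapted to the real column layout.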

MOI produces extensive outputs for each module, including informative plots and intermediate results as text files and RData objects, so that users can inspect the results in detail or reuse them in further analyses. Outputs are organized hierarchically according to the user’s parameterization; for example, the pathway enrichment analysis of genes is located under the directory “/user_defined_output_directory/genes/biotranslator/”.

Most important tools

| Omics | Functionality | Tools |
|---|---|---|
| Genes, miRNA, isoforms | SRA download | SRA Toolkit |
| Genes, miRNA, isoforms | Quality control | FastQC, Trim Galore |
| Genes, miRNA, isoforms | Alignment and assembly | Salmon, samtools, STAR, HISAT2, StringTie2 |
| Genes, miRNA, isoforms, proteins, lipids | Data preprocessing | R packages: edgeR, limma, sva, ggplot2, ComplexHeatmap |
| Proteins, lipids | Specific for proteins and lipids | R package: preprocessCore; MSTUS normalization |
| Lipids | Specific for lipids | R package: lipidr |
| Genes, miRNA, isoforms, proteins, lipids | Differential expression analysis | R packages: DESeq2, edgeR, RankProd, ggplot2, ComplexHeatmap |
| Genes, miRNA, isoforms, proteins, lipids | Correlation analysis | R package: stats |
| Genes, miRNA, isoforms, proteins, lipids | Pathway enrichment analysis | clusterProfiler, Biotranslator |
| Lipids | Lipid-specific pathway enrichment analysis | Custom tool: Lipidb |
| Genes, miRNA, isoforms, proteins | RIDDER (module to identify IRE1 substrates) | gRIDD, RNAeval, FIMO |
| Genes, miRNA, isoforms | Functional annotation | CPAT, SignalP, Pfam |
| Genes, miRNA, isoforms, proteins | Secondary structure prediction | RNAfold, RNAeval |
| Genes, miRNA, isoforms, proteins | Motif finding | FIMO |
| Isoforms | Genome-wide isoform analysis | IsoformSwitchAnalyzeR |

Quick Start

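Install Nextflow (>=22.10.1). Note that the pipeline was built with version 23.04.2, which the run command below pins via NXF_VER.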
Install Docker.

Download the pipeline and rename it:

git clone https://github.com/ASAGlab/MOI--An-integrated-solution-for-omics-analyses.git &&
mv MOI--An-integrated-solution-for-omics-analyses multiomicsintegrator

In the params_mcia.yml file, set the following parameters to the location where you want your outputs written:

outdir: yourDir
pathmcia: /path/to/yourDir/mcia
biotrans_all_path : /path/to/yourDir/prepareforbio

The pathmcia and biotrans_all_path values must be complete (full) paths and follow this format:
```
   $outdir/mcia
   $outdir/prepareforbio
```
See format in params_mcia.yml and change accordingly.
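Because a mismatch between these three entries is easy to miss, a small pre-flight check can help. The following is a minimal sketch, not part of MOI: it assumes the params file uses flat `key: value` lines as in the excerpt above, and it interprets "follow this format" as "a full path whose last component is mcia or prepareforbio and whose parent directory has the same name as outdir". The function names are hypothetical, and the tiny parser is a stand-in for a real YAML library (it does not handle inline comments or nested structures).

```python
from pathlib import PurePosixPath


def read_flat_params(text):
    """Treat every `key: value` line as a flat entry; skip blanks and comments."""
    params = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        key, _, value = stripped.partition(":")
        params[key.strip()] = value.strip().strip("'\"")
    return params


def check_mcia_paths(params):
    """Return a list of problems with pathmcia / biotrans_all_path (empty if none)."""
    expected = {"pathmcia": "mcia", "biotrans_all_path": "prepareforbio"}
    outdir = params.get("outdir", "")
    problems = []
    for key, subdir in expected.items():
        value = params.get(key)
        if value is None:
            problems.append(f"{key} is missing")
            continue
        path = PurePosixPath(value)
        if not path.is_absolute():
            problems.append(f"{key} should be a full path, got {value!r}")
        if path.name != subdir:
            problems.append(f"{key} should end in {subdir!r}")
        # Assumption: the parent directory carries the same name as outdir.
        if outdir and path.parent.name != PurePosixPath(outdir).name:
            problems.append(f"{key} should sit directly under outdir {outdir!r}")
    return problems
```

Reading params_mcia.yml into a string and passing it through `check_mcia_paths(read_flat_params(text))` returns an empty list when the three entries agree, and a list of human-readable messages otherwise.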

In addition, check and modify the resources (in params_mcia.yml) according to your system:
- max_memory : '8.GB'
- max_cpus : 7
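To pick values that fit the machine you are on, the limits can be derived from what the operating system reports. This is a sketch under stated assumptions: the function names are hypothetical, the defaults of 7 CPUs and 8 GB simply mirror the values shown above, and the memory probe relies on POSIX `sysconf`, so it falls back to the requested value on systems where that is unavailable.

```python
import os


def total_memory_gb():
    """Best-effort physical memory in GB via POSIX sysconf; None if unavailable."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None
    return pages * page_size / 1024**3


def suggest_resources(requested_cpus=7, requested_mem_gb=8):
    """Cap the requested CPUs and memory at what this machine actually has."""
    cpus = max(1, min(requested_cpus, os.cpu_count() or 1))
    detected = total_memory_gb()
    if detected is None:
        mem_gb = requested_mem_gb
    else:
        mem_gb = max(1, min(requested_mem_gb, int(detected)))
    # The '8.GB' string style follows the max_memory example in params_mcia.yml.
    return {"max_cpus": cpus, "max_memory": f"{mem_gb}.GB"}
```

The returned dictionary can be copied by hand into params_mcia.yml. If `max_cpus` comes back lower than requested, the machine has fewer cores than the requested value.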

Run the pipeline, providing the full path to the params file via the -params-file argument:
```
NXF_VER=23.04.2 nextflow run multiomicsintegrator -params-file /full/path/to/params_mcia.yml -profile docker 
```
Note that some form of configuration will be needed so that Nextflow knows how to fetch the required software. This is usually done in the form of a config profile (docker in the example command above). You can chain multiple config profiles in a comma-separated string.
Start running your own analysis!

The above example refers to a simplified version of an integrated analysis. Depending on which part of the pipeline you want to run and your starting point (raw data or count matrices), modify the respective parameter file:
- params_isoforms.yml
- params_genes.yml
- params_mirna.yml
- params_proteins.yml
- params_lipids.yml
- params_mcia.yml
- params_ridderalone
Common issues:

If an error regarding biomaRt appears:

```bash
 Error in h(simpleError(msg, call)) : 
 error in evaluating the argument 'conn' in selecting a method for function 'dbDisconnect': object 'info' not found
 Calls: useEnsembl ... .sql_disconnect -> dbDisconnect -> .handleSimpleError -> h
 Execution halted
```

or

  Ensembl site unresponsive, trying useast mirror
Ensembl site unresponsive, trying asia mirror
Error in .chooseEnsemblMirror(mirror = mirror, http_config = http_config) : 
 Unable to query any Ensembl site
Calls: useEnsembl -> .chooseEnsemblMirror
Execution halted

simply rerun the pipeline with -resume:

nextflow run multiomicsintegrator -params-file /full/path/to/params_mcia.yml -profile docker -resume
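For unattended runs, this retry can be scripted. The following is a minimal sketch rather than anything shipped with MOI: the function name, the default of three attempts, and the 60-second pause are assumptions made for illustration. The `runner` argument defaults to `subprocess.run` and exists only so the loop can be tried out without Nextflow installed.

```python
import subprocess
import time


def run_with_resume(command, max_attempts=3, wait_seconds=60, runner=subprocess.run):
    """Run `command`; after a failure, retry with `-resume` appended.

    Returns the number of attempts used, or raises RuntimeError if every
    attempt exits with a non-zero status.
    """
    for attempt in range(1, max_attempts + 1):
        argv = list(command)
        if attempt > 1:
            # Give a temporarily unresponsive Ensembl mirror time to recover.
            time.sleep(wait_seconds)
            if "-resume" not in argv:
                argv.append("-resume")
        result = runner(argv)
        if result.returncode == 0:
            return attempt
    raise RuntimeError(f"pipeline still failing after {max_attempts} attempts")
```

The `command` argument is the same argv list you would otherwise type on the command line, for example `["nextflow", "run", "multiomicsintegrator", "-params-file", "/full/path/to/params_mcia.yml", "-profile", "docker"]`.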

If the error persists, try deleting the bianca7/mompreprocess container (or all containers, if possible) and run again.
Comparative analysis, isoform analysis and MCIA need substantial resources (at least 7 CPUs).
Check your resources and your directories!

Documentation

The ASAGlab/moi pipeline is documented in several usage.md files under docs, together with example yml files that the user can modify directly as a starting point for custom configurations. Example outputs are also included under the docs folder in this repository.

Credits

ASAGlab/moi was originally written by Bianca Alexandra Pasat.

We thank the following people for their extensive assistance in the development of this pipeline:

Contributions and Support

If you would like to contribute to this pipeline, please see the contributing guidelines.

For further information or help, don't hesitate to get in touch on the Slack #MOM channel (you can join with this invite).

Citations

An extensive list of references for the tools used by the pipeline can be found in the CITATIONS.md file.

You can cite the nf-core publication as follows:

The nf-core framework for community-curated bioinformatics pipelines.

Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.

Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.
