Merged
2 changes: 1 addition & 1 deletion .nf-core.yml
@@ -34,4 +34,4 @@ template:
   name: quantmsdiann
   org: bigbio
   outdir: .
-  version: 1.8.0dev
+  version: 1.0.0
2 changes: 1 addition & 1 deletion AGENTS.md
@@ -16,7 +16,7 @@ This is **non-negotiable**. All code must pass formatting and style checks befor

 ## Project Overview

-**quantmsdiann** is an nf-core bioinformatics best-practice analysis pipeline for **DIA-NN-based quantitative mass spectrometry**. It is a standalone pipeline focused exclusively on **Data-Independent Acquisition (DIA)** workflows using the DIA-NN search engine.
+**quantmsdiann** is a [bigbio](https://github.com/bigbio) bioinformatics pipeline, built following [nf-core](https://nf-co.re/) guidelines, for **DIA-NN-based quantitative mass spectrometry**. It is a standalone pipeline focused exclusively on **Data-Independent Acquisition (DIA)** workflows using the DIA-NN search engine.

 **This pipeline does NOT support DDA, TMT, iTRAQ, LFQ-DDA, or any non-DIA workflows.** Those are handled by the parent `quantms` pipeline.

71 changes: 30 additions & 41 deletions README.md
@@ -10,30 +10,34 @@
 [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)
 [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)

-**quantmsdiann** is an [nf-core](https://nf-co.re/) bioinformatics pipeline for **Data-Independent Acquisition (DIA)** quantitative mass spectrometry analysis using [DIA-NN](https://github.com/vdemichev/DiaNN).
+## Introduction

-## Pipeline Overview
+**quantmsdiann** is a [bigbio](https://github.com/bigbio) bioinformatics pipeline, built following [nf-core](https://nf-co.re/) guidelines, for **Data-Independent Acquisition (DIA)** quantitative mass spectrometry analysis using [DIA-NN](https://github.com/vdemichev/DiaNN).

-The pipeline takes SDRF metadata and mass spectrometry data files as input, performs DIA-NN-based identification and quantification, and produces protein/peptide quantification matrices, MSstats-compatible output, and QC reports.
+The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a portable manner. It uses Docker/Singularity containers making results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process, making it easy to maintain and update software dependencies.

-### Workflow Diagram
+## Pipeline summary

 <p align="center">
-<img src="docs/images/quantmsdiann_workflow.svg" alt="quantmsdiann workflow" width="520">
+<img src="docs/images/quantmsdiann_workflow.svg" alt="quantmsdiann workflow" width="800">
 </p>

-### Supported Input Formats
+The pipeline takes [SDRF](https://github.com/bigbio/proteomics-metadata-standard) metadata and mass spectrometry data files (`.raw`, `.mzML`, `.d`, `.dia`) as input and performs:

-| Format | Description | Handling |
-| ------- | --------------------------- | --------------------------------------- |
-| `.raw` | Thermo RAW files | Converted to mzML (ThermoRawFileParser) |
-| `.mzML` | Open standard mzML | Optionally re-indexed |
-| `.d` | Bruker timsTOF directories | Native or converted to mzML |
-| `.dia` | DIA-NN native binary format | Passed through without conversion |
+1. **Input validation** — SDRF parsing and validation
+2. **File preparation** — RAW to mzML conversion (ThermoRawFileParser), indexing, Bruker `.d` handling
+3. **In-silico spectral library generation** — or use a user-provided library (`--diann_speclib`)
+4. **Preliminary analysis** — per-file calibration and mass accuracy estimation
+5. **Empirical library assembly** — consensus library from preliminary results
+6. **Individual analysis** — per-file search with the empirical library
+7. **Final quantification** — protein/peptide/gene group matrices
+8. **MSstats conversion** — DIA-NN report to MSstats-compatible format
+9. **Quality control** — interactive QC report via [pmultiqc](https://github.com/bigbio/pmultiqc)

-Compressed formats (`.gz`, `.tar`, `.tar.gz`, `.zip`) are supported for `.raw`, `.mzML`, and `.d`.
+## Quick start

-## Quick Start
+> [!NOTE]
+> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set up Nextflow.

 ```bash
 nextflow run bigbio/quantmsdiann \
@@ -43,34 +47,13 @@ nextflow run bigbio/quantmsdiann \
 -profile docker
 ```

-## Key Output Files
-
-| File | Description |
-| ----------------------------------------- | ----------------------------------- |
-| `quant_tables/diann_report.{tsv,parquet}` | Main DIA-NN peptide/protein report |
-| `quant_tables/diann_report.pg_matrix.tsv` | Protein group quantification matrix |
-| `quant_tables/diann_report.pr_matrix.tsv` | Precursor quantification matrix |
-| `quant_tables/diann_report.gg_matrix.tsv` | Gene group quantification matrix |
-| `quant_tables/out_msstats_in.csv` | MSstats-compatible quantification |
-| `pmultiqc/` | Interactive QC HTML report |
-
-## Test Profiles
-
-```bash
-# Quick DIA test
-nextflow run . -profile test_dia,docker --outdir results
-
-# DIA with Bruker .d files
-nextflow run . -profile test_dia_dotd,docker --outdir results
-
-# Latest DIA-NN (2.2.0)
-nextflow run . -profile test_latest_dia,docker --outdir results
-```
+> [!WARNING]
+> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), not for defining parameters.

 ## Documentation

-- [Usage](docs/usage.md) - How to run the pipeline
-- [Output](docs/output.md) - Description of output files
+- [Usage](docs/usage.md) How to run the pipeline, input formats, optional outputs, and custom configuration
+- [Output](docs/output.md) Description of all output files produced by the pipeline

 ## Credits

@@ -82,12 +65,18 @@ quantmsdiann is developed and maintained by:
 - [Vadim Demichev](https://github.com/vdemichev) (Charite Universitaetsmedizin Berlin)
 - [Qi-Xuan Yue](https://github.com/yueqixuan) (Chongqing University of Posts and Telecommunications)

-## License
+## Contributions and Support

-[MIT](LICENSE)
+If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).

 ## Citation

 If you use quantmsdiann in your research, please cite:

 > Dai et al. "quantms: a cloud-based pipeline for quantitative proteomics" (2024). DOI: [10.5281/zenodo.15573386](https://doi.org/10.5281/zenodo.15573386)
+
+An extensive list of references for the tools used by the pipeline can be found in the [CITATIONS.md](CITATIONS.md) file.
+
+## License
+
+[MIT](LICENSE)
2 changes: 1 addition & 1 deletion assets/email_template.html
@@ -4,7 +4,7 @@
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
 <meta name="viewport" content="width=device-width, initial-scale=1">

-<meta name="description" content="bigbio/quantmsdiann: DIA-NN quantitative mass spectrometry nf-core workflow">
+<meta name="description" content="bigbio/quantmsdiann: DIA-NN quantitative mass spectrometry workflow built following nf-core guidelines">
 <title>bigbio/quantmsdiann Pipeline Report</title>
 </head>
 <body>
2 changes: 1 addition & 1 deletion assets/methods_description_template.yml
@@ -6,7 +6,7 @@ plot_type: "html"
 ## You can inject any metadata from the Nextflow '${workflow}' object
 data: |
   <h4>Methods</h4>
-  <p>Data was processed using bigbio/quantms v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>
+  <p>Data was processed using bigbio/quantms v${workflow.manifest.version} ${doi_text} a bigbio pipeline built following nf-core guidelines (<a href="https://doi.org/10.1038/s41587-020-0439-x">Ewels <em>et al.</em>, 2020</a>), utilising reproducible software environments from the Bioconda (<a href="https://doi.org/10.1038/s41592-018-0046-7">Grüning <em>et al.</em>, 2018</a>) and Biocontainers (<a href="https://doi.org/10.1093/bioinformatics/btx192">da Veiga Leprevost <em>et al.</em>, 2017</a>) projects.</p>
   <p>The pipeline was executed with Nextflow v${workflow.nextflow.version} (<a href="https://doi.org/10.1038/nbt.3820">Di Tommaso <em>et al.</em>, 2017</a>) with the following command:</p>
   <pre><code>${workflow.commandLine}</code></pre>
   <p>${tool_citations}</p>
14 changes: 14 additions & 0 deletions conf/modules/shared.config
@@ -42,6 +42,20 @@ process {
         ]
     }

+    // Optional: publish TSV spectral library from in-silico generation.
+    // Enable via ext.publish_speclib_tsv in a custom config or via --save_speclib_tsv.
+    withName: '.*:INSILICO_LIBRARY_GENERATION' {
+        publishDir = [
+            path: { "${params.outdir}/library_generation" },
+            mode: 'copy',
+            saveAs: { filename ->
+                if (filename.equals('versions.yml')) return null
+                if (filename.endsWith('.tsv') && (task.ext.publish_speclib_tsv || params.save_speclib_tsv)) return filename
+                return null
+            }
+        ]
+    }
+
     // publishDir for all features tables
     withName: '.*:MZML_STATISTICS' {
         publishDir = [
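
Reviewer note: the `saveAs` closure added for `INSILICO_LIBRARY_GENERATION` publishes nothing by default and only lets `.tsv` files through when one of the two opt-in switches is set. The sketch below restates that decision rule in Python so the cases can be checked quickly. It is an illustrative model, not pipeline code: the function name `save_as`, its keyword arguments, and the example file names are assumptions, and Groovy truthiness is approximated with plain booleans.

```python
from typing import Optional


def save_as(
    filename: str,
    ext_publish_speclib_tsv: bool = False,  # stands in for task.ext.publish_speclib_tsv
    save_speclib_tsv: bool = False,         # stands in for params.save_speclib_tsv
) -> Optional[str]:
    """Return the name to publish under, or None when the file is not published."""
    # versions.yml is never published from this process.
    if filename == "versions.yml":
        return None
    # A .tsv file is published only when at least one opt-in switch is set.
    if filename.endswith(".tsv") and (ext_publish_speclib_tsv or save_speclib_tsv):
        return filename
    # Every other output stays in the work directory.
    return None


if __name__ == "__main__":
    # Hypothetical file names, used only to exercise each branch.
    for name in ("versions.yml", "library.speclib", "library.tsv"):
        default = save_as(name)
        opted_in = save_as(name, save_speclib_tsv=True)
        print(f"{name}: default={default!r}, with save_speclib_tsv={opted_in!r}")
```

If this model matches the closure, existing runs see no new files under `library_generation/` unless a user opts in through `--save_speclib_tsv` or sets `ext.publish_speclib_tsv` in a custom config.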