diff --git a/.nf-core.yml b/.nf-core.yml
index 1331d21..15d137a 100644
--- a/.nf-core.yml
+++ b/.nf-core.yml
@@ -34,4 +34,4 @@ template:
   name: quantmsdiann
   org: bigbio
   outdir: .
-  version: 1.8.0dev
+  version: 1.0.0
diff --git a/AGENTS.md b/AGENTS.md
index 77401fb..e4a235f 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -16,7 +16,7 @@ This is **non-negotiable**. All code must pass formatting and style checks befor

 ## Project Overview

-**quantmsdiann** is an nf-core bioinformatics best-practice analysis pipeline for **DIA-NN-based quantitative mass spectrometry**. It is a standalone pipeline focused exclusively on **Data-Independent Acquisition (DIA)** workflows using the DIA-NN search engine.
+**quantmsdiann** is a [bigbio](https://github.com/bigbio) bioinformatics pipeline, built following [nf-core](https://nf-co.re/) guidelines, for **DIA-NN-based quantitative mass spectrometry**. It is a standalone pipeline focused exclusively on **Data-Independent Acquisition (DIA)** workflows using the DIA-NN search engine.

 **This pipeline does NOT support DDA, TMT, iTRAQ, LFQ-DDA, or any non-DIA workflows.** Those are handled by the parent `quantms` pipeline.

diff --git a/README.md b/README.md
index 6338cb7..02d6f3c 100644
--- a/README.md
+++ b/README.md
@@ -10,30 +10,34 @@
 [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)
 [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)

-**quantmsdiann** is an [nf-core](https://nf-co.re/) bioinformatics pipeline for **Data-Independent Acquisition (DIA)** quantitative mass spectrometry analysis using [DIA-NN](https://github.com/vdemichev/DiaNN).
+## Introduction

-## Pipeline Overview
+**quantmsdiann** is a [bigbio](https://github.com/bigbio) bioinformatics pipeline, built following [nf-core](https://nf-co.re/) guidelines, for **Data-Independent Acquisition (DIA)** quantitative mass spectrometry analysis using [DIA-NN](https://github.com/vdemichev/DiaNN).

-The pipeline takes SDRF metadata and mass spectrometry data files as input, performs DIA-NN-based identification and quantification, and produces protein/peptide quantification matrices, MSstats-compatible output, and QC reports.
+The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a portable manner. It uses Docker/Singularity containers making results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process, making it easy to maintain and update software dependencies.

-### Workflow Diagram
+## Pipeline summary

-  [image: quantmsdiann workflow]
+  [image: quantmsdiann workflow]

-### Supported Input Formats
+The pipeline takes [SDRF](https://github.com/bigbio/proteomics-metadata-standard) metadata and mass spectrometry data files (`.raw`, `.mzML`, `.d`, `.dia`) as input and performs:

-| Format  | Description                 | Handling                                |
-| ------- | --------------------------- | --------------------------------------- |
-| `.raw`  | Thermo RAW files            | Converted to mzML (ThermoRawFileParser) |
-| `.mzML` | Open standard mzML          | Optionally re-indexed                   |
-| `.d`    | Bruker timsTOF directories  | Native or converted to mzML             |
-| `.dia`  | DIA-NN native binary format | Passed through without conversion       |
+1. **Input validation** — SDRF parsing and validation
+2. **File preparation** — RAW to mzML conversion (ThermoRawFileParser), indexing, Bruker `.d` handling
+3. **In-silico spectral library generation** — or use a user-provided library (`--diann_speclib`)
+4. **Preliminary analysis** — per-file calibration and mass accuracy estimation
+5. **Empirical library assembly** — consensus library from preliminary results
+6. **Individual analysis** — per-file search with the empirical library
+7. **Final quantification** — protein/peptide/gene group matrices
+8. **MSstats conversion** — DIA-NN report to MSstats-compatible format
+9. **Quality control** — interactive QC report via [pmultiqc](https://github.com/bigbio/pmultiqc)

-Compressed formats (`.gz`, `.tar`, `.tar.gz`, `.zip`) are supported for `.raw`, `.mzML`, and `.d`.
+## Quick start

-## Quick Start
+> [!NOTE]
+> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set up Nextflow.

 ```bash
 nextflow run bigbio/quantmsdiann \
@@ -43,34 +47,13 @@ nextflow run bigbio/quantmsdiann \
   -profile docker
 ```

-## Key Output Files
-
-| File                                      | Description                         |
-| ----------------------------------------- | ----------------------------------- |
-| `quant_tables/diann_report.{tsv,parquet}` | Main DIA-NN peptide/protein report  |
-| `quant_tables/diann_report.pg_matrix.tsv` | Protein group quantification matrix |
-| `quant_tables/diann_report.pr_matrix.tsv` | Precursor quantification matrix     |
-| `quant_tables/diann_report.gg_matrix.tsv` | Gene group quantification matrix    |
-| `quant_tables/out_msstats_in.csv`         | MSstats-compatible quantification   |
-| `pmultiqc/`                               | Interactive QC HTML report          |
-
-## Test Profiles
-
-```bash
-# Quick DIA test
-nextflow run . -profile test_dia,docker --outdir results
-
-# DIA with Bruker .d files
-nextflow run . -profile test_dia_dotd,docker --outdir results
-
-# Latest DIA-NN (2.2.0)
-nextflow run . -profile test_latest_dia,docker --outdir results
-```
+> [!WARNING]
+> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), not for defining parameters.
 ## Documentation

-- [Usage](docs/usage.md) - How to run the pipeline
-- [Output](docs/output.md) - Description of output files
+- [Usage](docs/usage.md) — How to run the pipeline, input formats, optional outputs, and custom configuration
+- [Output](docs/output.md) — Description of all output files produced by the pipeline

 ## Credits

@@ -82,12 +65,18 @@ quantmsdiann is developed and maintained by:
 - [Vadim Demichev](https://github.com/vdemichev) (Charite Universitaetsmedizin Berlin)
 - [Qi-Xuan Yue](https://github.com/yueqixuan) (Chongqing University of Posts and Telecommunications)

-## License
+## Contributions and Support

-[MIT](LICENSE)
+If you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).

 ## Citation

 If you use quantmsdiann in your research, please cite:

 > Dai et al. "quantms: a cloud-based pipeline for quantitative proteomics" (2024). DOI: [10.5281/zenodo.15573386](https://doi.org/10.5281/zenodo.15573386)
+
+An extensive list of references for the tools used by the pipeline can be found in the [CITATIONS.md](CITATIONS.md) file.
+
+## License
+
+[MIT](LICENSE)
diff --git a/assets/email_template.html b/assets/email_template.html
index 8ee58b8..5f5324c 100644
--- a/assets/email_template.html
+++ b/assets/email_template.html
@@ -4,7 +4,7 @@
-
+
 bigbio/quantmsdiann Pipeline Report
diff --git a/assets/methods_description_template.yml b/assets/methods_description_template.yml
index f8bc88b..3f2ccfc 100644
--- a/assets/methods_description_template.yml
+++ b/assets/methods_description_template.yml
@@ -6,7 +6,7 @@ plot_type: "html"
 ## You can inject any metadata from the Nextflow '${workflow}' object
 data: |
   Methods
-  Data was processed using bigbio/quantms v${workflow.manifest.version} ${doi_text} of the nf-core collection of workflows (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.
+  Data was processed using bigbio/quantms v${workflow.manifest.version} ${doi_text} a bigbio pipeline built following nf-core guidelines (Ewels et al., 2020), utilising reproducible software environments from the Bioconda (Grüning et al., 2018) and Biocontainers (da Veiga Leprevost et al., 2017) projects.
   The pipeline was executed with Nextflow v${workflow.nextflow.version} (Di Tommaso et al., 2017) with the following command:
   ${workflow.commandLine}
   ${tool_citations}
diff --git a/conf/modules/shared.config b/conf/modules/shared.config
index 49c4b99..8387b9c 100644
--- a/conf/modules/shared.config
+++ b/conf/modules/shared.config
@@ -42,6 +42,20 @@ process {
         ]
     }

+    // Optional: publish TSV spectral library from in-silico generation.
+    // Enable via ext.publish_speclib_tsv in a custom config or via --save_speclib_tsv.
+    withName: '.*:INSILICO_LIBRARY_GENERATION' {
+        publishDir = [
+            path: { "${params.outdir}/library_generation" },
+            mode: 'copy',
+            saveAs: { filename ->
+                if (filename.equals('versions.yml')) return null
+                if (filename.endsWith('.tsv') && (task.ext.publish_speclib_tsv || params.save_speclib_tsv)) return filename
+                return null
+            }
+        ]
+    }
+
     // publishDir for all features tables
     withName: '.*:MZML_STATISTICS' {
         publishDir = [
diff --git a/docs/images/quantmsdiann_workflow.svg b/docs/images/quantmsdiann_workflow.svg
index 1cb825b..93c114d 100644
--- a/docs/images/quantmsdiann_workflow.svg
+++ b/docs/images/quantmsdiann_workflow.svg
@@ -1,183 +1,217 @@
[SVG markup not preserved; only the diagram's text labels survive.
 Added version: a legend (optional / skippable, input handling, preprocessing, DIA-NN analysis, statistics & export, quality control; process / module vs. output file; per-file parallel vs. collective); inputs "SDRF + Raw Files (.raw / .mzML / .d / .dia)" and "FASTA Database"; process boxes INPUT_CHECK, FILE_PREPARATION, GENERATE_CFG, INSILICO_LIBRARY_GENERATION ("skip if --diann_speclib"), PRELIMINARY_ANALYSIS, ASSEMBLE_EMPIRICAL_LIBRARY, INDIVIDUAL_ANALYSIS, FINAL_QUANTIFICATION, DIANN_MSSTATS, PMULTIQC; a "skip if --skip_preliminary_analysis" note; outputs "Quant Tables (pg / pr / gg matrices)", "MSstats CSV (msstats_in.csv)", "QC Report (HTML interactive)".
 Removed version: title banner "quantmsdiann - DIA-NN Quantitative Mass Spectrometry Pipeline", numbered steps 1 to 7, a "skip if library provided" note, a "Supported DIA-NN Versions" panel (v1.8.1, v2.1.0, GPU ready), the footer "bigbio/quantmsdiann | nf-core compliant | Nextflow DSL2", and a colour legend (Input, DIA-NN, QC, Output, Optional).]
diff --git a/docs/output.md b/docs/output.md
index b9b54c5..9af5b47 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -70,6 +70,12 @@ results/
 - `quant_tables/diann_report.unique_genes_matrix.tsv` - Unique gene quantification matrix
 - `quant_tables/out_msstats_in.csv` - MSstats-compatible quantification table

+### Optional Output Files
+
+These files are not published by default. Enable them with `save_*` parameters or `ext.*` config properties (see [Usage: Optional outputs](usage.md#optional-outputs)).
+
+- `library_generation/*.tsv` - TSV spectral library from in-silico library generation (`--save_speclib_tsv`)
+
 ### Nextflow pipeline info

 [Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline.
diff --git a/docs/usage.md b/docs/usage.md
index 623b612..4464cac 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -88,6 +88,46 @@ nextflow run . -profile test_dia_dotd,docker --outdir results
 nextflow run . -profile test_latest_dia,docker --outdir results
 ```

+## Optional outputs
+
+By default, only final result files are published. Intermediate files can be exported using `save_*` parameters or via `ext.*` properties in a custom Nextflow config.
+
+| Parameter            | Default | Description                                                                                 |
+| -------------------- | ------- | ------------------------------------------------------------------------------------------- |
+| `--save_speclib_tsv` | `false` | Publish the TSV spectral library from in-silico library generation to `library_generation/` |
+
+**Using a parameter:**
+
+```bash
+nextflow run bigbio/quantmsdiann \
+  --input 'experiment.sdrf.tsv' \
+  --database 'proteins.fasta' \
+  --save_speclib_tsv \
+  --outdir './results' \
+  -profile docker
+```
+
+**Using a custom Nextflow config (ext properties):**
+
+```groovy
+// custom.config
+process {
+    withName: '.*:INSILICO_LIBRARY_GENERATION' {
+        ext.publish_speclib_tsv = true
+    }
+}
+```
+
+```bash
+nextflow run bigbio/quantmsdiann -c custom.config ...
+```
+
+For full verbose output of all intermediate files (useful for debugging), use the `verbose_modules` profile:
+
+```bash
+nextflow run bigbio/quantmsdiann -profile verbose_modules,docker ...
+```
+
 ## Custom configuration

 ### Resource requests
diff --git a/modules/local/diann/insilico_library_generation/main.nf b/modules/local/diann/insilico_library_generation/main.nf
index 6f759bd..d61fc63 100644
--- a/modules/local/diann/insilico_library_generation/main.nf
+++ b/modules/local/diann/insilico_library_generation/main.nf
@@ -14,6 +14,7 @@ process INSILICO_LIBRARY_GENERATION {
     output:
     path "versions.yml", emit: versions
     path "*.predicted.speclib", emit: predict_speclib
+    path "*.tsv", emit: speclib_tsv, optional: true
     path "silicolibrarygeneration.log", emit: log

     when:
diff --git a/nextflow.config b/nextflow.config
index acd72b6..982e37d 100644
--- a/nextflow.config
+++ b/nextflow.config
@@ -59,6 +59,9 @@ params {
     diann_speclib = null
     diann_extra_args = null

+    // Optional outputs — control which intermediate files are published
+    save_speclib_tsv = false // Save the TSV spectral library from in-silico generation
+
     // DIA-NN: PRELIMINARY_ANALYSIS — calibration & mass accuracy
     scan_window = 8
     scan_window_automatic = true
@@ -353,11 +356,11 @@ manifest {
         ]
     ]
     homePage = 'https://github.com/bigbio/quantmsdiann'
-    description = """DIA-NN quantitative mass spectrometry nf-core workflow"""
+    description = """DIA-NN quantitative mass spectrometry workflow built following nf-core guidelines"""
     mainScript = 'main.nf'
     defaultBranch = 'main'
     nextflowVersion = '!>=25.04.0'
-    version = '1.8.0dev'
+    version = '1.0.0'
     doi = '10.5281/zenodo.15573386'
 }

diff --git a/nextflow_schema.json b/nextflow_schema.json
index b02deab..87304e9 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -448,6 +448,13 @@
                     "fa_icon": "fas fa-terminal",
                     "hidden": false,
                     "help_text": "Pass additional DIA-NN command-line arguments that will be appended to all DIA-NN steps (INSILICO_LIBRARY_GENERATION, PRELIMINARY_ANALYSIS, ASSEMBLE_EMPIRICAL_LIBRARY, INDIVIDUAL_ANALYSIS, FINAL_QUANTIFICATION). Flags that conflict with a specific step are automatically stripped with a warning. For step-specific overrides, use custom Nextflow config files with ext.args."
+                },
+                "save_speclib_tsv": {
+                    "type": "boolean",
+                    "default": false,
+                    "description": "Save the TSV spectral library from the in-silico library generation step.",
+                    "fa_icon": "fas fa-save",
+                    "help_text": "When enabled, the human-readable TSV version of the spectral library produced by DIA-NN during the INSILICO_LIBRARY_GENERATION step is published to the output directory under `library_generation/`. By default this file is discarded as an intermediate."
} }, "fa_icon": "fas fa-braille" diff --git a/ro-crate-metadata.json b/ro-crate-metadata.json index d19d073..a3b0755 100644 --- a/ro-crate-metadata.json +++ b/ro-crate-metadata.json @@ -23,7 +23,7 @@ "@type": "Dataset", "creativeWorkStatus": "InProgress", "datePublished": "2026-02-20T15:36:51+00:00", - "description": "# quantmsdiann\n\n[![GitHub Actions CI Status](https://github.com/bigbio/quantmsdiann/actions/workflows/ci.yml/badge.svg)](https://github.com/bigbio/quantmsdiann/actions/workflows/ci.yml)\n[![GitHub Actions Linting Status](https://github.com/bigbio/quantmsdiann/actions/workflows/linting.yml/badge.svg)](https://github.com/bigbio/quantmsdiann/actions/workflows/linting.yml)\n[![Cite with Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.15573386.svg)](https://doi.org/10.5281/zenodo.15573386)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.04.0-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/)\n[![nf-core template version](https://img.shields.io/badge/nf--core_template-3.5.2-green?style=flat&logo=nfcore&logoColor=white&color=%2324B064&link=https%3A%2F%2Fnf-co.re)](https://github.com/nf-core/tools/releases/tag/3.5.2)\n[![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n\n**quantmsdiann** is an [nf-core](https://nf-co.re/) bioinformatics pipeline for **Data-Independent Acquisition (DIA)** quantitative mass spectrometry analysis using [DIA-NN](https://github.com/vdemichev/DiaNN).\n\n## Pipeline Overview\n\nThe pipeline takes SDRF metadata and mass spectrometry data files as input, performs DIA-NN-based identification and quantification, and produces 
protein/peptide quantification matrices, MSstats-compatible output, and QC reports.\n\n### Workflow Diagram\n\n

\n \"quantmsdiann\n

\n\n### Supported Input Formats\n\n| Format | Description | Handling |\n| ------- | --------------------------- | --------------------------------------- |\n| `.raw` | Thermo RAW files | Converted to mzML (ThermoRawFileParser) |\n| `.mzML` | Open standard mzML | Optionally re-indexed |\n| `.d` | Bruker timsTOF directories | Native or converted to mzML |\n| `.dia` | DIA-NN native binary format | Passed through without conversion |\n\nCompressed formats (`.gz`, `.tar`, `.tar.gz`, `.zip`) are supported for `.raw`, `.mzML`, and `.d`.\n\n## Quick Start\n\n```bash\nnextflow run bigbio/quantmsdiann \\\n --input 'experiment.sdrf.tsv' \\\n --database 'proteins.fasta' \\\n --outdir './results' \\\n -profile docker\n```\n\n## Key Output Files\n\n| File | Description |\n| ----------------------------------------- | ----------------------------------- |\n| `quant_tables/diann_report.{tsv,parquet}` | Main DIA-NN peptide/protein report |\n| `quant_tables/diann_report.pg_matrix.tsv` | Protein group quantification matrix |\n| `quant_tables/diann_report.pr_matrix.tsv` | Precursor quantification matrix |\n| `quant_tables/diann_report.gg_matrix.tsv` | Gene group quantification matrix |\n| `quant_tables/out_msstats_in.csv` | MSstats-compatible quantification |\n| `pmultiqc/` | Interactive QC HTML report |\n\n## Test Profiles\n\n```bash\n# Quick DIA test\nnextflow run . -profile test_dia,docker --outdir results\n\n# DIA with Bruker .d files\nnextflow run . -profile test_dia_dotd,docker --outdir results\n\n# Latest DIA-NN (2.2.0)\nnextflow run . 
-profile test_latest_dia,docker --outdir results\n```\n\n## Documentation\n\n- [Usage](docs/usage.md) - How to run the pipeline\n- [Output](docs/output.md) - Description of output files\n\n## Credits\n\nquantmsdiann is developed and maintained by:\n\n- [Yasset Perez-Riverol](https://github.com/ypriverol) (EMBL-EBI)\n- [Dai Chengxin](https://github.com/daichengxin) (Beijing Proteome Research Center)\n- [Julianus Pfeuffer](https://github.com/jpfeuffer) (Freie Universitat Berlin)\n- [Vadim Demichev](https://github.com/vdemichev) (Charite Universitaetsmedizin Berlin)\n- [Qi-Xuan Yue](https://github.com/yueqixuan) (Chongqing University of Posts and Telecommunications)\n\n## License\n\n[MIT](LICENSE)\n\n## Citation\n\nIf you use quantmsdiann in your research, please cite:\n\n> Dai et al. \"quantms: a cloud-based pipeline for quantitative proteomics\" (2024). DOI: [10.5281/zenodo.15573386](https://doi.org/10.5281/zenodo.15573386)\n", + "description": "# quantmsdiann\n\n[![GitHub Actions CI Status](https://github.com/bigbio/quantmsdiann/actions/workflows/ci.yml/badge.svg)](https://github.com/bigbio/quantmsdiann/actions/workflows/ci.yml)\n[![GitHub Actions Linting Status](https://github.com/bigbio/quantmsdiann/actions/workflows/linting.yml/badge.svg)](https://github.com/bigbio/quantmsdiann/actions/workflows/linting.yml)\n[![Cite with Zenodo](https://zenodo.org/badge/DOI/10.5281/zenodo.15573386.svg)](https://doi.org/10.5281/zenodo.15573386)\n[![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com)\n\n[![Nextflow](https://img.shields.io/badge/version-%E2%89%A525.04.0-green?style=flat&logo=nextflow&logoColor=white&color=%230DC09D&link=https%3A%2F%2Fnextflow.io)](https://www.nextflow.io/)\n[![nf-core template version](https://img.shields.io/badge/nf--core_template-3.5.2-green?style=flat&logo=nfcore&logoColor=white&color=%2324B064&link=https%3A%2F%2Fnf-co.re)](https://github.com/nf-core/tools/releases/tag/3.5.2)\n[![run with 
docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/)\n[![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/)\n\n## Introduction\n\n**quantmsdiann** is a [bigbio](https://github.com/bigbio) bioinformatics pipeline, built following [nf-core](https://nf-co.re/) guidelines, for **Data-Independent Acquisition (DIA)** quantitative mass spectrometry analysis using [DIA-NN](https://github.com/vdemichev/DiaNN).\n\nThe pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool to run tasks across multiple compute infrastructures in a portable manner. It uses Docker/Singularity containers making results highly reproducible. The [Nextflow DSL2](https://www.nextflow.io/docs/latest/dsl2.html) implementation of this pipeline uses one container per process, making it easy to maintain and update software dependencies.\n\n## Pipeline summary\n\n

\n \"quantmsdiann\n

\n\nThe pipeline takes [SDRF](https://github.com/bigbio/proteomics-metadata-standard) metadata and mass spectrometry data files (`.raw`, `.mzML`, `.d`, `.dia`) as input and performs:\n\n1. **Input validation** \u2014 SDRF parsing and validation\n2. **File preparation** \u2014 RAW to mzML conversion (ThermoRawFileParser), indexing, Bruker `.d` handling\n3. **In-silico spectral library generation** \u2014 or use a user-provided library (`--diann_speclib`)\n4. **Preliminary analysis** \u2014 per-file calibration and mass accuracy estimation\n5. **Empirical library assembly** \u2014 consensus library from preliminary results\n6. **Individual analysis** \u2014 per-file search with the empirical library\n7. **Final quantification** \u2014 protein/peptide/gene group matrices\n8. **MSstats conversion** \u2014 DIA-NN report to MSstats-compatible format\n9. **Quality control** \u2014 interactive QC report via [pmultiqc](https://github.com/bigbio/pmultiqc)\n\n## Quick start\n\n> [!NOTE]\n> If you are new to Nextflow and nf-core, please refer to [this page](https://nf-co.re/docs/usage/installation) on how to set up Nextflow.\n\n```bash\nnextflow run bigbio/quantmsdiann \\\n --input 'experiment.sdrf.tsv' \\\n --database 'proteins.fasta' \\\n --outdir './results' \\\n -profile docker\n```\n\n> [!WARNING]\n> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. 
Custom config files specified with `-c` must only be used for [tuning process resource specifications](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources), not for defining parameters.\n\n## Documentation\n\n- [Usage](docs/usage.md) \u2014 How to run the pipeline, input formats, optional outputs, and custom configuration\n- [Output](docs/output.md) \u2014 Description of all output files produced by the pipeline\n\n## Credits\n\nquantmsdiann is developed and maintained by:\n\n- [Yasset Perez-Riverol](https://github.com/ypriverol) (EMBL-EBI)\n- [Dai Chengxin](https://github.com/daichengxin) (Beijing Proteome Research Center)\n- [Julianus Pfeuffer](https://github.com/jpfeuffer) (Freie Universitat Berlin)\n- [Vadim Demichev](https://github.com/vdemichev) (Charite Universitaetsmedizin Berlin)\n- [Qi-Xuan Yue](https://github.com/yueqixuan) (Chongqing University of Posts and Telecommunications)\n\n## Contributions and Support\n\nIf you would like to contribute to this pipeline, please see the [contributing guidelines](.github/CONTRIBUTING.md).\n\n## Citation\n\nIf you use quantmsdiann in your research, please cite:\n\n> Dai et al. \"quantms: a cloud-based pipeline for quantitative proteomics\" (2024). DOI: [10.5281/zenodo.15573386](https://doi.org/10.5281/zenodo.15573386)\n\nAn extensive list of references for the tools used by the pipeline can be found in the [CITATIONS.md](CITATIONS.md) file.\n\n## License\n\n[MIT](LICENSE)\n", "hasPart": [ { "@id": "main.nf" @@ -169,7 +169,7 @@ "https://nf-co.re/bigbio/quantms/dev/" ], "version": [ - "1.8.0dev" + "1.0.0" ] }, {
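
Review note on the `saveAs` rule added in `conf/modules/shared.config`: the closure decides per file whether anything from `INSILICO_LIBRARY_GENERATION` is published to `library_generation/`. The decision table may be easier to check as a small model. The sketch below is a plain Python restatement of that closure, not pipeline code. The function name `save_as`, its keyword arguments, and the file name `lib.tsv` are hypothetical stand-ins for `task.ext.publish_speclib_tsv`, `params.save_speclib_tsv`, and the real DIA-NN output names.

```python
from typing import Optional


def save_as(
    filename: str,
    ext_publish_speclib_tsv: bool = False,  # stands in for task.ext.publish_speclib_tsv
    save_speclib_tsv: bool = False,         # stands in for params.save_speclib_tsv
) -> Optional[str]:
    """Return the name to publish under, or None to skip the file."""
    # versions.yml is never published by this rule.
    if filename == "versions.yml":
        return None
    # A .tsv file is published when either switch is on.
    if filename.endswith(".tsv") and (ext_publish_speclib_tsv or save_speclib_tsv):
        return filename
    # Everything else (for example the .predicted.speclib file) is skipped.
    return None


if __name__ == "__main__":
    for name in ("versions.yml", "lib.tsv", "lib.predicted.speclib"):
        print(name, "->", save_as(name, save_speclib_tsv=True))
```

Under this reading, either switch alone is enough to publish the TSV, and with both switches off the rule publishes nothing from this process. The `.predicted.speclib` file is never published by this rule, whichever switch is set.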