First release of quantmsdiann including new features previously not supported #27

Merged
ypriverol merged 127 commits into main from dev, Apr 6, 2026

ypriverol commented Apr 1, 2026

This pull request introduces foundational project management and contribution infrastructure for the bigbio/quantmsdiann pipeline. It adds comprehensive documentation for contributors, issue and PR templates, CI/CD and Dockstore configuration, and a detailed action plan and roadmap for the pipeline's development. These changes are essential for establishing best practices, improving collaboration, and guiding future development.

The most important changes are:

Project Management & Documentation

  • Added a detailed action plan and roadmap for the quantmsdiann pipeline in .claude/actions_plans.md, outlining completed and planned phases, a version-aware testing strategy, and a decision log.
  • Introduced comprehensive contributing guidelines in .github/CONTRIBUTING.md, covering workflow, testing, patching, conventions, and Codespaces support.

Issue and PR Templates

  • Added GitHub issue templates for bug reports, feature requests, and contact links, as well as a pull request template to standardize contributions and reviews.

CI/CD and Tooling Configuration

  • Added Dockstore configuration for workflow publishing in .github/.dockstore.yml.
  • Added a composite GitHub Action (.github/actions/get-shards/action.yml) to dynamically determine the number of test shards for CI jobs based on changed files and tags.

Repository Metadata

  • Updated .gitattributes to improve linguist language detection and mark nf-core generated files.

Summary by CodeRabbit

  • New Features

    • Complete DIA-NN-based proteomics pipeline with support for multiple data formats (.raw, .mzML, Bruker .d files)
    • Flexible DIA-NN version management (1.8.1, 2.1.0, 2.2.0)
    • Comprehensive quality control and reporting via pmultiqc
    • MSstats-compatible quantification outputs
    • Full pipeline documentation and contribution guidelines
  • Documentation

    • Added usage guide, output format documentation, and API guidelines
    • Established Code of Conduct and contribution workflow

ypriverol and others added 4 commits April 5, 2026 07:37
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ch_diann_cfg is already a single-emission channel from SDRF_PARSING;
  .first() was causing a Nextflow warning about a useless operator on
  value channels
- Remove stale TODO comments about ch_versions naming and nf-validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DIA-NN 1.8.1 doesn't recognize --direct-quant (introduced with QuantUMS
in 1.9.2). Direct quantification is the default in 1.8.1, so the flag
is unnecessary. Now only passed when version >= 1.9.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
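
The version gate described in the commit above can be sketched as follows. This is a minimal Python illustration, not the pipeline's Nextflow code: `parse_version` and `quant_args` are hypothetical helper names, and the 1.9 threshold is taken from the commit message.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '1.8.1' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def quant_args(diann_version: str, direct_quant: bool) -> list[str]:
    """Return the extra DIA-NN arguments for the requested quantification mode.

    Assumption: --direct-quant is only emitted for versions >= 1.9, because
    1.8.1 does not recognize the flag and already quantifies directly.
    """
    if direct_quant and parse_version(diann_version) >= (1, 9):
        return ["--direct-quant"]
    return []


# The flag is dropped for 1.8.1 and kept for newer releases.
print(quant_args("1.8.1", direct_quant=True))
print(quant_args("2.1.0", direct_quant=True))
```

Comparing integer tuples rather than raw strings avoids lexicographic surprises once a version component reaches two digits.
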
- Add DIA-NN 2.3.2 to version selection table in usage.md
- Document version-dependent features (QuantUMS >= 1.9.2, DDA >= 2.3.2,
  InfinDIA >= 2.3.0) with automatic compatibility handling
- Fix parameters.md: QuantUMS description now correctly explains
  --direct-quant behavior and version requirements

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
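
The version-dependent features documented in the commit above amount to a lookup of minimum versions. A rough Python sketch, assuming only the thresholds stated in the commit message; the `MIN_VERSION` table, its feature keys, and the `unsupported_features` helper are hypothetical and do not mirror the pipeline's actual implementation.

```python
# Minimum DIA-NN version per feature, as listed in the commit message.
# The dictionary keys are illustrative labels, not pipeline parameter names.
MIN_VERSION = {
    "quantums": (1, 9, 2),
    "dda": (2, 3, 2),
    "infindia": (2, 3, 0),
}


def unsupported_features(diann_version: str, requested: list[str]) -> list[str]:
    """Return the requested features that the given DIA-NN version is too old for."""
    current = tuple(int(part) for part in diann_version.split("."))
    return sorted(name for name in requested if current < MIN_VERSION[name])


# Report which requested features a 2.2.0 run would have to skip.
print(unsupported_features("2.2.0", ["quantums", "dda", "infindia"]))
```

Keeping the thresholds in one table means a new DIA-NN release only needs a new entry, and the check can either warn or silently drop the incompatible option.
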
@jpfeuffer

I think the number one priority for this should now be to not convert hundreds of gigabytes of .d to mzML just for a bit of QC. Do we even need indexed mzML for QC, or why do you have the indexing step? Use the new .d reading capabilities of pyOpenMS if they are sufficient; otherwise, there are many other libraries that can read .d these days.

yueqixuan and others added 4 commits April 5, 2026 21:00
- .nf-core.yml: set is_nfcore to false, remove non-existent files from
  lint exclusions (conf/test.config, conf/test_full.config,
  conf/modules.config, conf/igenomes_ignored.config)
- .dockstore.yml: use uppercase NFL for Dockstore subclass
- SKILL.md: fix default output format from csv to tsv for SDRF
- pmultiqc/meta.yml: align inputs/outputs with actual process definition
- sdrf_parsing/main.nf: use idiomatic .name instead of .Name
- parse_empirical_log_task: run on head node (executor local) instead
  of spawning a compute node for simple log grep
- CI actions: remove unused paths input from nf-test and get-shards

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the convert_dotd parameter and TDF2MZML module. DIA-NN handles
Bruker .d files natively, so converting hundreds of GB to mzML just for
QC statistics is wasteful (per jpfeuffer's review). Native .d QC support
will be tracked in a separate issue.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ypriverol and others added 8 commits April 5, 2026 16:30
Replace script+executor local with an exec block that runs Groovy code
directly on the head node. This is the idiomatic Nextflow approach for
lightweight tasks — no container, no compute node, no scheduler overhead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per jpfeuffer's review: users who skip preliminary analysis already
know their mass accuracy and scan window parameters — no need to pass
a log file. For the normal workflow, parse calibrated values directly
inside ASSEMBLE_EMPIRICAL_LIBRARY (same node, zero overhead) instead
of spawning a separate process.

Removes: PARSE_EMPIRICAL_LOG subworkflow, PARSE_EMPIRICAL_LOG_TASK
module, and --empirical_assembly_log parameter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The database parameter is already declared as required with exists
validation in nextflow_schema.json — nf-schema enforces this before
the workflow starts. Per daichengxin's review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add conf/modules.config, conf/test.config, conf/test_full.config,
conf/igenomes.config, and conf/igenomes_ignored.config back to
files_exist ignore list. Add .dockstore.yml to files_unchanged
to suppress template mismatch from NFL uppercase fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The env output directive fails at Nextflow compilation. Switch to
writing calibrated_params.txt and reading it via .map { f -> f.text }.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of writing a file or using env output, parse the calibrated
mass accuracy and scan window values directly from the assembly log
using a .map operation in the workflow. This runs as Groovy on the
head node with zero overhead — no extra process, no extra file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI left a comment


Pull request overview

Copilot reviewed 131 out of 136 changed files in this pull request and generated 11 comments.



ypriverol and others added 5 commits April 6, 2026 06:56
Remove non-existent files from .nf-core.yml lint checks, remove phantom
quantms_log input from pmultiqc meta.yml, fix tdf2mzml input description,
and add missing ms2_statistics/feature_statistics outputs to file_preparation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The files_exist list excludes files from the lint existence check.
Re-add nf-test.yml, modules.config, test.config, and test_full.config
so nf-core lint skips checking for them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update pipeline name in config comments, nextflow run examples, module
homepages, ro-crate metadata, version output filename, and custom config
path to consistently use bigbio/quantmsdiann.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The quantmsdiann.config does not exist in nf-core/configs yet, causing
nextflow config to fail with a file-not-found error during linting.
Remove the include until a pipeline-specific config is submitted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reuse the existing quantms institutional profiles from nf-core/configs
since quantmsdiann shares the same infrastructure requirements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ypriverol ypriverol merged commit 876342a into main Apr 6, 2026
39 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Apr 6, 2026
11 tasks