1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -17,3 +17,4 @@ null/
.cursor/rules/codacy.mdc
.codacy/
.github/instructions/codacy.instructions.md
docs/superpowers/
2 changes: 2 additions & 0 deletions conf/diann_versions/v1_8_1.config
@@ -2,6 +2,8 @@
* DIA-NN 1.8.1 container override (public biocontainers)
* Used by merge_ci.yml for version × feature matrix testing.
*/
params.diann_version = '1.8.1'

process {
withLabel: diann {
container = 'docker.io/biocontainers/diann:v1.8.1_cv1'
2 changes: 2 additions & 0 deletions conf/diann_versions/v2_1_0.config
@@ -2,6 +2,8 @@
* DIA-NN 2.1.0 container override (private ghcr.io)
* Used by merge_ci.yml for version × feature matrix testing.
*/
params.diann_version = '2.1.0'

process {
withLabel: diann {
container = 'ghcr.io/bigbio/diann:2.1.0'
2 changes: 2 additions & 0 deletions conf/diann_versions/v2_2_0.config
@@ -2,6 +2,8 @@
* DIA-NN 2.2.0 container override (private ghcr.io)
* Used by merge_ci.yml for version × feature matrix testing.
*/
params.diann_version = '2.2.0'

process {
withLabel: diann {
container = 'ghcr.io/bigbio/diann:2.2.0'
18 changes: 18 additions & 0 deletions conf/tests/test_dia_local.config
@@ -0,0 +1,18 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Local container overrides for testing with dev builds of sdrf-pipelines and quantms-utils.
Uses docker.io/ prefix to prevent quay.io registry from being prepended.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*/

process {
withName: 'SDRF_PARSING' {
container = 'docker.io/local/sdrf-pipelines:dev'
}
withName: 'SAMPLESHEET_CHECK' {
container = 'docker.io/local/quantms-utils:dev'
}
withName: 'DIANN_MSSTATS' {
container = 'docker.io/local/quantms-utils:dev'
}
}
142 changes: 142 additions & 0 deletions docs/usage.md
@@ -88,6 +88,95 @@ nextflow run . -profile test_dia_dotd,docker --outdir results
nextflow run . -profile test_latest_dia,docker --outdir results
```

## DIA-NN parameters

The pipeline passes parameters to DIA-NN at different steps. Some come from the SDRF metadata (per file), some from `nextflow.config` defaults, and some from the command line. The tables below document each parameter, its source, and the pipeline steps that use it.

### Parameter sources

Parameters are resolved in this priority order:

1. **SDRF metadata** (per-file, from `convert-diann` design file) — highest priority
2. **Pipeline parameters** (`--param_name` on command line or params file)
3. **Nextflow defaults** (`nextflow.config`) — lowest priority
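
The priority order above can be sketched as a small shell function. This is only an illustration of the rule, not the pipeline's actual implementation: `resolve_param` is a hypothetical helper, and an empty string stands in for "value not provided".

```bash
# Hypothetical helper (not part of the pipeline): return the first
# non-empty value in priority order.
# Arguments: SDRF value, pipeline parameter value, Nextflow default.
resolve_param() {
    local sdrf_value="$1" pipeline_value="$2" default_value="$3"
    if [ -n "$sdrf_value" ]; then
        printf '%s\n' "$sdrf_value"
    elif [ -n "$pipeline_value" ]; then
        printf '%s\n' "$pipeline_value"
    else
        printf '%s\n' "$default_value"
    fi
}

resolve_param "" "10" "8"     # SDRF value missing: prints 10
resolve_param "20" "10" "8"   # SDRF value present: prints 20
resolve_param "" "" "8"       # nothing set: prints 8
```

The three calls show that a per-file SDRF value always wins and that the `nextflow.config` default is used only when nothing else is set.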

### Pipeline steps

| Step | Description |
| ------------------------------- | ------------------------------------------------------------------- |
| **INSILICO_LIBRARY_GENERATION** | Predicts a spectral library from FASTA using DIA-NN's deep learning |
| **PRELIMINARY_ANALYSIS** | Per-file calibration and mass accuracy estimation (first pass) |
| **ASSEMBLE_EMPIRICAL_LIBRARY** | Builds consensus empirical library from preliminary results |
| **INDIVIDUAL_ANALYSIS** | Per-file quantification with the empirical library (second pass) |
| **FINAL_QUANTIFICATION** | Aggregates all files into protein/peptide matrices |

### Per-file parameters from SDRF

These parameters are extracted per-file from the SDRF via `convert-diann` and stored in `diann_design.tsv`:

| DIA-NN flag | SDRF column | Design column | Steps | Notes |
| ---------------- | -------------------------------------------------- | ------------------------ | ----------------------- | ----------------------------------------------- |
| `--mass-acc-ms1` | `comment[precursor mass tolerance]` | `PrecursorMassTolerance` | PRELIMINARY, INDIVIDUAL | Falls back to auto-detect if missing or not ppm |
| `--mass-acc` | `comment[fragment mass tolerance]` | `FragmentMassTolerance` | PRELIMINARY, INDIVIDUAL | Falls back to auto-detect if missing or not ppm |
| `--min-pr-mz` | `comment[ms1 scan range]` or `comment[ms min mz]` | `MS1MinMz` | PRELIMINARY, INDIVIDUAL | Per-file for GPF; global broadest for INSILICO |
| `--max-pr-mz` | `comment[ms1 scan range]` or `comment[ms max mz]` | `MS1MaxMz` | PRELIMINARY, INDIVIDUAL | Per-file for GPF; global broadest for INSILICO |
| `--min-fr-mz` | `comment[ms2 scan range]` or `comment[ms2 min mz]` | `MS2MinMz` | PRELIMINARY, INDIVIDUAL | Per-file for GPF; global broadest for INSILICO |
| `--max-fr-mz` | `comment[ms2 scan range]` or `comment[ms2 max mz]` | `MS2MaxMz` | PRELIMINARY, INDIVIDUAL | Per-file for GPF; global broadest for INSILICO |
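
The "falls back to auto-detect if missing or not ppm" behaviour in the Notes column could look roughly like the sketch below. It assumes the tolerance arrives as a string such as `10 ppm`; the function name `mass_acc_flag` and that input format are assumptions for illustration, not something the pipeline is confirmed to use.

```bash
# Hypothetical helper: print "<flag> <value>" only for a ppm tolerance.
# Prints nothing otherwise, so no flag is passed and DIA-NN is left to
# auto-detect the mass accuracy.
mass_acc_flag() {
    local flag="$1" tolerance="$2"
    local value="${tolerance%% *}"   # text before the first space
    local unit="${tolerance##* }"    # text after the last space
    case "$unit" in
        ppm) ;;
        *) return 0 ;;               # missing unit or not ppm
    esac
    case "$value" in
        ''|*[!0-9.]*) return 0 ;;    # not a plain number
    esac
    printf '%s %s\n' "$flag" "$value"
}

mass_acc_flag --mass-acc-ms1 "10 ppm"   # prints: --mass-acc-ms1 10
mass_acc_flag --mass-acc "0.02 Da"      # prints nothing
mass_acc_flag --mass-acc ""             # prints nothing
```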

### Global parameters from config

These parameters apply globally across all files. They are set in `diann_config.cfg` (from SDRF) or as pipeline parameters:

| DIA-NN flag | Pipeline parameter | Default | Steps | Notes |
| --------------------------------------------- | -------------------------------------------------- | ----------------------------------------------- | ---------------------------------------- | --------------------------------------------------------------- |
| `--cut` | (from SDRF enzyme) | — | ALL | Enzyme cut rule, derived from `comment[cleavage agent details]` |
| `--fixed-mod` | (from SDRF) | — | ALL | Fixed modifications from `comment[modification parameters]` |
| `--var-mod` | (from SDRF) | — | ALL | Variable modifications from `comment[modification parameters]` |
| `--monitor-mod` | `--enable_mod_localization` + `--mod_localization` | `false` / `Phospho (S),Phospho (T),Phospho (Y)` | PRELIMINARY, ASSEMBLE, INDIVIDUAL, FINAL | PTM site localization scoring (DIA-NN 1.8.x only) |
| `--window` | `--scan_window` | `8` | PRELIMINARY, ASSEMBLE, INDIVIDUAL | Scan window; auto-detected when `--scan_window_automatic=true` |
| `--quick-mass-acc` | `--quick_mass_acc` | `true` | PRELIMINARY | Fast mass accuracy calibration |
| `--min-corr 2 --corr-diff 1 --time-corr-only` | `--performance_mode` | `true` | PRELIMINARY | High-speed, low-RAM mode |
| `--pg-level` | `--pg_level` | `2` | INDIVIDUAL, FINAL | Protein grouping level |
| `--species-genes` | `--species_genes` | `false` | FINAL | Use species-specific gene names |
| `--no-norm` | `--diann_normalize` | `true` | FINAL | Disable normalization when `false` |
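
Two rows in the table map a boolean pipeline parameter to DIA-NN flags rather than passing a value through. A minimal sketch of that mapping is shown below; the helper names are hypothetical, and the real logic lives in the Nextflow modules.

```bash
# Hypothetical helper for FINAL_QUANTIFICATION: normalization stays on
# unless --diann_normalize is false, in which case --no-norm is added.
normalization_flag() {
    if [ "$1" = "false" ]; then
        printf '%s\n' '--no-norm'
    fi
}

# Hypothetical helper for PRELIMINARY_ANALYSIS: --performance_mode expands
# to the three DIA-NN flags listed in the table.
performance_flags() {
    if [ "$1" = "true" ]; then
        printf '%s\n' '--min-corr 2 --corr-diff 1 --time-corr-only'
    fi
}

normalization_flag false   # prints: --no-norm
performance_flags true     # prints: --min-corr 2 --corr-diff 1 --time-corr-only
```

Note the inverted sense of `--diann_normalize`: the default `true` adds no flag, and only `false` produces `--no-norm`.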

### PTM site localization (`--monitor-mod`)

DIA-NN supports PTM site localization scoring via `--monitor-mod`. When enabled, DIA-NN reports `PTM.Site.Confidence` and `PTM.Q.Value` columns for the specified modifications.

**Important**: `--monitor-mod` is applied to all DIA-NN steps **except INSILICO_LIBRARY_GENERATION** (where it has no effect). It is particularly important for:

- **PRELIMINARY_ANALYSIS**: Affects PTM-aware scoring during calibration.
- **ASSEMBLE_EMPIRICAL_LIBRARY**: Strongly affects empirical library generation for PTM peptides.
- **INDIVIDUAL_ANALYSIS** and **FINAL_QUANTIFICATION**: Enables PTM site confidence scoring.
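
In the ASSEMBLE_EMPIRICAL_LIBRARY and FINAL_QUANTIFICATION modules, the modification flags are pulled out of `diann_config.cfg` with `grep` before DIA-NN is called. The sketch below reproduces that idea on a made-up config string. The config content is hypothetical, and the pattern uses POSIX ERE (`grep -oE`) instead of the module's `grep -oP` for portability.

```bash
# Hypothetical config content; a real diann_config.cfg is generated from the SDRF.
cfg='--cut K*,R* --fixed-mod UniMod:4,57.021464,C --var-mod UniMod:35,15.994915,M --monitor-mod UniMod:21'

# Keep only the modification flags and their values, joined by spaces.
extract_mod_flags() {
    printf '%s\n' "$1" \
        | grep -oE -e '--(var-mod|fixed-mod|monitor-mod)[[:space:]]+[^[:space:]]+' \
        | tr '\n' ' '
}

mod_flags=$(extract_mod_flags "$cfg")
printf '%s\n' "$mod_flags"
```

As in the modules, each flag is expected to carry a single whitespace-free value, so a value containing spaces would be cut at the first space.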

Note: For DIA-NN 2.0+, `--monitor-mod` is no longer needed — PTM localization is handled automatically by `--var-mod`. The flag is only used for DIA-NN 1.8.x.

To enable PTM site localization:

```bash
nextflow run bigbio/quantmsdiann \
--enable_mod_localization \
--mod_localization 'Phospho (S),Phospho (T),Phospho (Y)' \
...
```

The parameter accepts two formats:

- **Modification names** (quantms-compatible): `Phospho (S),Phospho (T),Phospho (Y)` — site info in parentheses is stripped, the base name is mapped to UniMod
- **UniMod accessions** (direct): `UniMod:21,UniMod:1`

Supported modification name mappings:

| Name | UniMod ID | Example |
| ----------- | ------------ | ------------------------------------- |
| Phospho | `UniMod:21` | `Phospho (S),Phospho (T),Phospho (Y)` |
| GlyGly | `UniMod:121` | `GlyGly (K)` |
| Acetyl | `UniMod:1` | `Acetyl (Protein N-term)` |
| Oxidation | `UniMod:35` | `Oxidation (M)` |
| Deamidated | `UniMod:7` | `Deamidated (N),Deamidated (Q)` |
| Methylation | `UniMod:34` | `Methylation (K),Methylation (R)` |
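
The name-to-accession mapping can be illustrated with a short shell function. This is a sketch of the rule described above (strip the site in parentheses, then look up the base name), using only the mappings from the table; `mod_to_unimod` is a hypothetical name, and the actual conversion happens inside the pipeline tooling.

```bash
# Hypothetical helper: map one modification entry to its UniMod accession.
mod_to_unimod() {
    local entry="$1"
    # UniMod accessions are passed through unchanged.
    case "$entry" in
        UniMod:*) printf '%s\n' "$entry"; return 0 ;;
    esac
    # Strip the site info in parentheses, e.g. "Phospho (S)" -> "Phospho".
    local base="${entry%% \(*}"
    case "$base" in
        Phospho)     printf 'UniMod:21\n' ;;
        GlyGly)      printf 'UniMod:121\n' ;;
        Acetyl)      printf 'UniMod:1\n' ;;
        Oxidation)   printf 'UniMod:35\n' ;;
        Deamidated)  printf 'UniMod:7\n' ;;
        Methylation) printf 'UniMod:34\n' ;;
        *) printf 'unknown modification: %s\n' "$entry" >&2; return 1 ;;
    esac
}

mod_to_unimod 'Phospho (S)'               # prints UniMod:21
mod_to_unimod 'Acetyl (Protein N-term)'   # prints UniMod:1
mod_to_unimod 'UniMod:121'                # prints UniMod:121
```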

## Optional outputs

By default, only final result files are published. Intermediate files can be exported using `save_*` parameters or via `ext.*` properties in a custom Nextflow config.
@@ -154,6 +243,59 @@ Use `screen`, `tmux`, or the Nextflow `-bg` flag to run the pipeline in the back
nextflow run bigbio/quantmsdiann -profile docker --input sdrf.tsv --database db.fasta --outdir results -bg
```

## Developer testing with local containers

When developing changes to `sdrf-pipelines` or `quantms-utils`, you can build local Docker containers and test them with the pipeline without publishing to a registry.

### 1. Build local dev containers

```bash
# From sdrf-pipelines repo
cd /path/to/sdrf-pipelines
docker build -f Dockerfile.dev -t local/sdrf-pipelines:dev .

# From quantms-utils repo
cd /path/to/quantms-utils
docker build -f Dockerfile.dev -t local/quantms-utils:dev .
```

### 2. Run the pipeline with local containers

Pass `conf/tests/test_dia_local.config` with `-c` to override the container references:

```bash
nextflow run main.nf \
-profile test_dia,docker \
-c conf/tests/test_dia_local.config \
--outdir results
```

This config (`conf/tests/test_dia_local.config`) overrides:

- `SDRF_PARSING` → `local/sdrf-pipelines:dev`
- `SAMPLESHEET_CHECK` → `local/quantms-utils:dev`
- `DIANN_MSSTATS` → `local/quantms-utils:dev`

### 3. Using pre-converted mzML files

To skip ThermoRawFileParser (useful on macOS/ARM where Mono crashes):

```bash
# Convert raw files with ThermoRawFileParser v2.0+
docker run --rm --platform=linux/amd64 \
-v /path/to/raw:/data -v /path/to/mzml:/out \
quay.io/biocontainers/thermorawfileparser:2.0.0.dev--h9ee0642_0 \
ThermoRawFileParser -d /data -o /out -f 2

# Run pipeline with pre-converted files
nextflow run main.nf \
-profile test_dia,docker \
-c conf/tests/test_dia_local.config \
--root_folder /path/to/mzml \
--local_input_type mzML \
--outdir results
```

## Nextflow memory requirements

Add the following to your environment to limit Java memory:
7 changes: 4 additions & 3 deletions modules/local/diann/assemble_empirical_library/main.nf
@@ -30,7 +30,8 @@ process ASSEMBLE_EMPIRICAL_LIBRARY {
'--temp', '--threads', '--verbose', '--lib', '--f', '--fasta',
'--mass-acc', '--mass-acc-ms1', '--window',
'--individual-mass-acc', '--individual-windows',
-'--out-lib', '--use-quant', '--gen-spec-lib', '--rt-profiling']
+'--out-lib', '--use-quant', '--gen-spec-lib', '--rt-profiling',
+'--monitor-mod', '--var-mod', '--fixed-mod']
// Sort by length descending so longer flags (e.g. --mass-acc-ms1) are matched before shorter prefixes (--mass-acc)
blocked.sort { a -> -a.length() }.each { flag ->
def flagPattern = '(?<=^|\\s)' + java.util.regex.Pattern.quote(flag) + '(?=\\s|\$)(\\s+(?!-{1,2}[a-zA-Z])\\S+)*'
@@ -60,8 +61,8 @@ process ASSEMBLE_EMPIRICAL_LIBRARY {

ls -lcth

-# Extract --var-mod and --fixed-mod flags from diann_config.cfg (DIA-NN best practice)
-mod_flags=\$(cat ${diann_config} | grep -oP '(--var-mod\\s+\\S+|--fixed-mod\\s+\\S+)' | tr '\\n' ' ')
+# Extract --var-mod, --fixed-mod, and --monitor-mod flags from diann_config.cfg
+mod_flags=\$(cat ${diann_config} | grep -oP '(--var-mod\\s+\\S+|--fixed-mod\\s+\\S+|--monitor-mod\\s+\\S+)' | tr '\\n' ' ')

diann --f ${(ms_files as List).join(' --f ')} \\
--lib ${lib} \\
4 changes: 2 additions & 2 deletions modules/local/diann/diann_msstats/main.nf
@@ -3,8 +3,8 @@ process DIANN_MSSTATS {
label 'process_medium'

container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/quantms-utils:0.0.25--pyh106432d_0' :
-'biocontainers/quantms-utils:0.0.25--pyh106432d_0' }"
+'https://depot.galaxyproject.org/singularity/quantms-utils:0.0.28--pyh106432d_0' :
+'biocontainers/quantms-utils:0.0.28--pyh106432d_0' }"

input:
path(report)
7 changes: 4 additions & 3 deletions modules/local/diann/final_quantification/main.nf
@@ -46,7 +46,8 @@ process FINAL_QUANTIFICATION {
'--temp', '--threads', '--verbose', '--lib', '--f', '--fasta',
'--use-quant', '--matrices', '--out', '--relaxed-prot-inf', '--pg-level',
'--qvalue', '--window', '--individual-windows',
-'--species-genes', '--report-decoys', '--xic', '--no-norm']
+'--species-genes', '--report-decoys', '--xic', '--no-norm',
+'--monitor-mod', '--var-mod', '--fixed-mod']
// Sort by length descending so longer flags (e.g. --individual-windows) are matched before shorter prefixes (--window)
blocked.sort { a -> -a.length() }.each { flag ->
def flagPattern = '(?<=^|\\s)' + java.util.regex.Pattern.quote(flag) + '(?=\\s|\$)(\\s+(?!-{1,2}[a-zA-Z])\\S+)*'
@@ -72,8 +73,8 @@ process FINAL_QUANTIFICATION {
# Note: if .quant files are passed, the mzML/.d files are not accessed, so their names must be passed but the files
# do not need to be present.

-# Extract --var-mod and --fixed-mod flags from diann_config.cfg (DIA-NN best practice)
-mod_flags=\$(cat ${diann_config} | grep -oP '(--var-mod\\s+\\S+|--fixed-mod\\s+\\S+)' | tr '\\n' ' ')
+# Extract --var-mod, --fixed-mod, and --monitor-mod flags from diann_config.cfg
+mod_flags=\$(cat ${diann_config} | grep -oP '(--var-mod\\s+\\S+|--fixed-mod\\s+\\S+|--monitor-mod\\s+\\S+)' | tr '\\n' ' ')

diann --lib ${empirical_library} \\
--fasta ${fasta} \\
4 changes: 2 additions & 2 deletions modules/local/diann/generate_cfg/main.nf
@@ -3,8 +3,8 @@ process GENERATE_CFG {
label 'process_tiny'

container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
-'https://depot.galaxyproject.org/singularity/quantms-utils:0.0.25--pyh106432d_0' :
-'biocontainers/quantms-utils:0.0.25--pyh106432d_0' }"
+'https://depot.galaxyproject.org/singularity/quantms-utils:0.0.28--pyh106432d_0' :
+'biocontainers/quantms-utils:0.0.28--pyh106432d_0' }"

input:
val(meta)