Add option to restrict analysis to specific contigs #644

Merged
merged 11 commits into from
Nov 11, 2024
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### `Added`

- A new analysis option `mito` to call and annotate only mitochondrial variants [#608](https://github.com/nf-core/raredisease/pull/608)
- An option to restrict analysis to specific contigs [#644](https://github.com/nf-core/raredisease/pull/644)

### `Changed`

@@ -28,6 +29,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### Parameters

| Old parameter | New parameter |
| ------------- | ------------------- |
| | extract_alignments |
| | restrict_to_contigs |

### Tool updates

| Tool | Old version | New version |
5 changes: 5 additions & 0 deletions conf/modules/align_bwa_bwamem2_bwameme.config
@@ -50,6 +50,11 @@ process {
ext.prefix = { "${meta.id}_sorted_merged" }
}

withName: '.*ALIGN:ALIGN_BWA_BWAMEM2_BWAMEME:EXTRACT_ALIGNMENTS' {
ext.prefix = { "${meta.id}_sorted_merged_extracted" }
ext.args2 = { params.restrict_to_contigs }
}

withName: '.*ALIGN:ALIGN_BWA_BWAMEM2_BWAMEME:MARKDUPLICATES' {
ext.args = "--TMP_DIR ."
ext.prefix = { "${meta.id}_sorted_md" }
5 changes: 5 additions & 0 deletions conf/modules/align_sentieon.config
@@ -30,6 +30,11 @@ process {
ext.prefix = { "${meta.id}_merged.bam" }
}

withName: '.*ALIGN:ALIGN_SENTIEON:EXTRACT_ALIGNMENTS' {
ext.prefix = { "${meta.id}_merged_extracted" }
ext.args2 = { params.restrict_to_contigs }
}

withName: '.*ALIGN:ALIGN_SENTIEON:SENTIEON_DEDUP' {
ext.args4 = { $params.rmdup ? "--rmdup" : '' }
ext.prefix = { "${meta.id}_dedup.bam" }
21 changes: 12 additions & 9 deletions docs/usage.md
@@ -168,22 +168,25 @@ The mandatory and optional parameters for each category are tabulated below.

##### 1. Alignment

| Mandatory | Optional |
| ------------------------------ | ------------------------------ |
| aligner<sup>1</sup> | fasta_fai<sup>4</sup> |
| fasta<sup>2</sup> | bwamem2<sup>4</sup> |
| platform | bwa<sup>4</sup> |
| mito_name/mt_fasta<sup>3</sup> | bwameme<sup>4</sup> |
| | known_dbsnp<sup>5</sup> |
| | known_dbsnp_tbi<sup>5</sup> |
| | min_trimmed_length<sup>6</sup> |
| Mandatory | Optional |
| ------------------------------ | ------------------------------- |
| aligner<sup>1</sup> | fasta_fai<sup>4</sup> |
| fasta<sup>2</sup> | bwamem2<sup>4</sup> |
| platform | bwa<sup>4</sup> |
| mito_name/mt_fasta<sup>3</sup> | bwameme<sup>4</sup> |
| | known_dbsnp<sup>5</sup> |
| | known_dbsnp_tbi<sup>5</sup> |
| | min_trimmed_length<sup>6</sup> |
| | extract_alignments |
| | restrict_to_contigs<sup>7</sup> |

<sup>1</sup>Default value is bwamem2. Other alternatives are bwa, bwameme and sentieon (requires a valid Sentieon license).<br />
<sup>2</sup>Analysis set reference genome in fasta format, first 25 contigs need to be chromosome 1-22, X, Y and the mitochondria.<br />
<sup>3</sup>If mito_name is provided, mt_fasta can be generated by the pipeline.<br />
<sup>4</sup>fasta_fai, bwa, bwamem2 and bwameme, if not provided by the user, will be generated by the pipeline when necessary.<br />
<sup>5</sup>Used only by Sentieon.<br />
<sup>6</sup>Default value is 40. Used only by fastp.<br />
<sup>7</sup>Used to limit your analysis to specific contigs. Can be used to remove alignments to unplaced contigs to minimize potential errors. This parameter should be used in conjunction with the `extract_alignments` parameter.<br />
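
As a minimal sketch of how `extract_alignments` and `restrict_to_contigs` might be combined, the snippet below shows a custom config that could be supplied to the pipeline with `-c`. The file name `custom.config` and the contig names are assumptions for illustration only; the names must match the contigs in your own reference (for example `1` rather than `chr1`).

```groovy
// custom.config (hypothetical file name), passed to the run with `-c custom.config`
params {
    // Turn on the extraction step that runs after alignment and merging
    extract_alignments  = true
    // Space-separated contigs or regions; these names are placeholders
    restrict_to_contigs = "chr1 chr2 chrM"
}
```

With `extract_alignments` left at its default of `false`, the extraction step is skipped and `restrict_to_contigs` has no effect.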

##### 2. QC stats from the alignment files

2 changes: 2 additions & 0 deletions nextflow.config
@@ -24,6 +24,8 @@ params {
analysis_type = 'wgs'
bwa_as_fallback = false
bait_padding = 100
extract_alignments = false
restrict_to_contigs = null
run_mt_for_wes = false
run_rtgvcfeval = false
save_mapped_as_cram = false
12 changes: 12 additions & 0 deletions nextflow_schema.json
@@ -503,6 +503,13 @@
"help_text": "errorStrategy needs to be set to ignore for the bwamem2 process for the fallback to work. Turned off by default.",
"fa_icon": "fas fa-toggle-on"
},
"extract_alignments": {
"type": "boolean",
"default": false,
"description": "After aligning the reads to a reference, extract alignments from specific regions/contigs and restrict the analysis to those regions/contigs.",
"help_text": "Set this to true, and specify the contig(s) using the `restrict_to_contigs` parameter.",
"fa_icon": "fas fa-toggle-on"
},
"platform": {
"type": "string",
"default": "illumina",
@@ -516,6 +523,11 @@
"fa_icon": "fas fa-align-center",
"enum": ["xy", "hetx", "sry"]
},
"restrict_to_contigs": {
"type": "string",
"description": "Can be specified as RNAME[:STARTPOS[-ENDPOS]]. Multiple regions should be separated by spaces.",
"fa_icon": "fas fa-align-center"
},
"run_mt_for_wes": {
"type": "boolean",
"description": "Specifies whether to run mitochondrial analysis for wes samples",
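
The `RNAME[:STARTPOS[-ENDPOS]]` format described for `restrict_to_contigs` can be checked before launching a run. The following is a rough Groovy sketch and not part of the pipeline: `regionPattern` and `isValidRegionList` are hypothetical names, and the pattern is a simplification that rejects contig names containing colons (as found in some HLA contigs).

```groovy
// Hypothetical helper, NOT part of nf-core/raredisease.
// Matches one region: a contig name, optionally followed by :START or :START-END.
// The ==~ operator below requires the whole string to match, so no anchors are needed.
def regionPattern = ~/[^\s:]+(:\d+(-\d+)?)?/

// Returns true when every space-separated entry looks like a valid region.
def isValidRegionList = { String value ->
    value?.trim() && value.trim().split(/\s+/).every { it ==~ regionPattern }
}

// Example values are placeholders, not contigs taken from the pipeline.
assert isValidRegionList('chr1 chr2:100-2000 chrM')
assert isValidRegionList('chrM:500')
assert !isValidRegionList('chr1:abc')
assert !isValidRegionList('')
```

A check like this only validates the shape of the string; whether the named contigs exist in the reference is still decided at run time by the extraction step.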
2 changes: 1 addition & 1 deletion subworkflows/local/align.nf
@@ -70,7 +70,7 @@ workflow ALIGN {
ch_bwamem2_bai = ALIGN_BWA_BWAMEM2_BWAMEME.out.marked_bai
ch_versions = ch_versions.mix(ALIGN_BWA_BWAMEM2_BWAMEME.out.versions)
} else if (params.aligner.equals("sentieon")) {
ALIGN_SENTIEON ( // Triggered when params.aligner is set as sentieon
ALIGN_SENTIEON ( // Triggered when params.aligner is set as sentieon
ch_reads,
ch_genome_fasta,
ch_genome_fai,
10 changes: 10 additions & 0 deletions subworkflows/local/alignment/align_bwa_bwamem2_bwameme.nf
@@ -7,9 +7,11 @@ include { BWA_MEM as BWAMEM_FALLBACK } from '../../../modules/nf-c
include { BWAMEM2_MEM } from '../../../modules/nf-core/bwamem2/mem/main'
include { BWAMEME_MEM } from '../../../modules/nf-core/bwameme/mem/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_ALIGN } from '../../../modules/nf-core/samtools/index/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_EXTRACT } from '../../../modules/nf-core/samtools/index/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_MARKDUP } from '../../../modules/nf-core/samtools/index/main'
include { SAMTOOLS_STATS } from '../../../modules/nf-core/samtools/stats/main'
include { SAMTOOLS_MERGE } from '../../../modules/nf-core/samtools/merge/main'
include { SAMTOOLS_VIEW as EXTRACT_ALIGNMENTS } from '../../../modules/nf-core/samtools/view/main'
include { PICARD_MARKDUPLICATES as MARKDUPLICATES } from '../../../modules/nf-core/picard/markduplicates/main'


@@ -82,6 +84,14 @@ workflow ALIGN_BWA_BWAMEM2_BWAMEME {
SAMTOOLS_MERGE ( bams.multiple, ch_genome_fasta, ch_genome_fai )
prepared_bam = bams.single.mix(SAMTOOLS_MERGE.out.bam)

// GET ALIGNMENT FROM SELECTED CONTIGS
if (params.extract_alignments) {
SAMTOOLS_INDEX_EXTRACT ( prepared_bam )
extract_bam_sorted_indexed = prepared_bam.join(SAMTOOLS_INDEX_EXTRACT.out.bai, failOnMismatch:true, failOnDuplicate:true)
EXTRACT_ALIGNMENTS( extract_bam_sorted_indexed, ch_genome_fasta, [])
prepared_bam = EXTRACT_ALIGNMENTS.out.bam
}

// Marking duplicates
MARKDUPLICATES ( prepared_bam , ch_genome_fasta, ch_genome_fai )
SAMTOOLS_INDEX_MARKDUP ( MARKDUPLICATES.out.bam )
19 changes: 15 additions & 4 deletions subworkflows/local/alignment/align_sentieon.nf
@@ -2,10 +2,13 @@
// A subworkflow to align reads using Sentieon.
//

include { SENTIEON_BWAMEM } from '../../../modules/nf-core/sentieon/bwamem/main'
include { SENTIEON_DATAMETRICS } from '../../../modules/nf-core/sentieon/datametrics/main'
include { SENTIEON_DEDUP } from '../../../modules/nf-core/sentieon/dedup/main'
include { SENTIEON_READWRITER } from '../../../modules/nf-core/sentieon/readwriter/main'
include { SENTIEON_BWAMEM } from '../../../modules/nf-core/sentieon/bwamem/main'
include { SENTIEON_DATAMETRICS } from '../../../modules/nf-core/sentieon/datametrics/main'
include { SENTIEON_DEDUP } from '../../../modules/nf-core/sentieon/dedup/main'
include { SENTIEON_READWRITER } from '../../../modules/nf-core/sentieon/readwriter/main'
include { SAMTOOLS_VIEW as EXTRACT_ALIGNMENTS } from '../../../modules/nf-core/samtools/view/main'
include { SAMTOOLS_INDEX as SAMTOOLS_INDEX_EXTRACT } from '../../../modules/nf-core/samtools/index/main'

workflow ALIGN_SENTIEON {
take:
ch_reads_input // channel: [mandatory] [ val(meta), path(reads_input) ]
@@ -36,6 +39,14 @@ workflow ALIGN_SENTIEON {
SENTIEON_READWRITER ( merge_bams_in.multiple, ch_genome_fasta, ch_genome_fai )
ch_bam_bai = merge_bams_in.single.mix(SENTIEON_READWRITER.out.output_index)

// GET ALIGNMENT FROM SELECTED CONTIGS
if (params.extract_alignments) {
EXTRACT_ALIGNMENTS( ch_bam_bai, ch_genome_fasta, [])
ch_bam_bai = EXTRACT_ALIGNMENTS.out.bam
SAMTOOLS_INDEX_EXTRACT ( EXTRACT_ALIGNMENTS.out.bam )
ch_bam_bai = EXTRACT_ALIGNMENTS.out.bam.join(SAMTOOLS_INDEX_EXTRACT.out.bai, failOnMismatch:true, failOnDuplicate:true)
}

SENTIEON_DATAMETRICS ( ch_bam_bai, ch_genome_fasta, ch_genome_fai, false )

SENTIEON_DEDUP ( ch_bam_bai, ch_genome_fasta, ch_genome_fai )