[DEV] Adding a gatk_jar downloader #58

Closed
wants to merge 5 commits into from
Changes from 4 commits
2 changes: 2 additions & 0 deletions conda_envs/mtbseq-nf-env.yml
@@ -7,3 +7,5 @@ dependencies:
- bioconda/linux-64::gatk=3.8.0
- bioconda::fastqc=0.11.9
- bioconda::multiqc=1.9
- conda-forge::wget=1.2
- conda-forge::tar=1.34
abhi18av marked this conversation as resolved.
4 changes: 4 additions & 0 deletions conf/docker.config
@@ -16,4 +16,8 @@ process {
container = 'quay.io/biocontainers/multiqc:1.9--pyh9f0ad1d_0'
}

withName: 'DOWNLOAD_GATK_JAR' {
container = 'alpine:3.14'
}
Member

@Mxrcon, perhaps we don't need an extra container here?

In cases like this we could probably reuse the mtbseq container, since it's derived from Debian and would already be downloaded (and available) on the underlying machine anyhow.

What do you think?

Member Author

That's true. Even though alpine is a minimal container, we can use the mtbseq container for this instead.
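
A rough sketch of what that change might look like in conf/docker.config. `params.mtbseq_container` is a hypothetical parameter standing in for whatever image the pipeline already uses for MTBseq (the real image name is not shown in this diff), and it assumes that image ships `wget`, `tar` and `bzip2`, which would need checking:

```nextflow
process {
    // Hypothetical: params.mtbseq_container is a placeholder for the
    // mtbseq image already pulled for the other processes.
    // Assumes that image provides wget, tar and bzip2.
    withName: 'DOWNLOAD_GATK_JAR' {
        container = params.mtbseq_container
    }
}
```

This would avoid pulling a second image just for the download step, at the cost of tying the step to the tools available in the mtbseq image.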

}
2 changes: 2 additions & 0 deletions conf/global_params.config
@@ -28,6 +28,8 @@ cohort_tsv = "${params.project}_cohort.tsv"

gatk38_jar = "${projectDir}/resources/GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar"

gatk_jar_link = "https://storage.googleapis.com/gatk-software/package-archive/gatk/GenomeAnalysisTK-3.8-0-ge9d806836.tar.bz2"

Comment on lines 29 to +32
Member

After some testing, we could ideally remove the need for gatk38_jar completely 🤔

Member Author

Agreed, but we can keep it until we have properly tested the downloader.

//NOTE: Setting this OPTION will skip all filtering steps and report the calculated information for all positions in the input file.
// The all_vars only needs to be activated in MTBseq. But in mtbseq-nf we'll specify it as false
all_vars = false
28 changes: 28 additions & 0 deletions modules/utils/download_gatk_jar.nf
@@ -0,0 +1,28 @@
nextflow.enable.dsl = 2



params.gatk_jar_link = "https://storage.googleapis.com/gatk-software/package-archive/gatk/GenomeAnalysisTK-3.8-0-ge9d806836.tar.bz2"

process DOWNLOAD_GATK_JAR {
publishDir params.results_dir, mode: params.save_mode, enabled: params.should_publish

output:
path('**/GenomeAnalysisTK.jar')


script:

"""
wget ${params.gatk_jar_link}
# -f must be the last flag so that it takes the archive name; -j selects bzip2
tar -xjvf *.tar.bz2
"""

stub:

"""
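echo "wget ${params.gatk_jar_link}"

# Create the jar inside a directory so that it matches the '**/GenomeAnalysisTK.jar' output glob
mkdir -p GenomeAnalysisTK-3.8-0-ge9d806836
touch GenomeAnalysisTK-3.8-0-ge9d806836/GenomeAnalysisTK.jar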
"""
}