Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

protocol option not interpreted ? #351

Closed
nservant opened this issue May 7, 2024 · 4 comments
Closed

protocol option not interpreted ? #351

nservant opened this issue May 7, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@nservant
Copy link

nservant commented May 7, 2024

Description of the bug

I'm wondering if the protocol option is well interpreted for the adapter trimming.
If I use --protocol illumina or --protocol qiaseq, there is no difference in the fastp command which is always run with ;

fastp \
    --in1 XXX.fastq.gz \
    --out1  XXX.fastp.fastq.gz \
    --thread 6 \
    --json XXX.fastp.json \
    --html XXX.fastp.html \
    --adapter_fasta known_adapters.fa \
     \
    -l 17 --max_len1 100 --adapter_sequence AGATCGGAAGAGCACACGTCTGAACTCCAGTCA \
    2> >(tee XXXX.fastp.log >&2)

I do not think that it has a strong impact on the results, because all adapter sequences are available in the known_adapters.fa file, but based on the code ;

code: subworkflows/local/utils_nfcore_smrnaseq_pipeline/main.nf

def formatProtocol(params,log) {

    switch(params.protocol){
        case 'illumina':
            params.putIfAbsent("clip_r1", 0);
            params.putIfAbsent("three_prime_clip_r1",0);
            params.putIfAbsent("three_prime_adapter", "TGGAATTCTCGGGTGCCAAGG");
            break
        case 'nextflex':
            params.putIfAbsent("clip_r1", 4);
            params.putIfAbsent("three_prime_clip_r1", 4);
            params.putIfAbsent("three_prime_adapter", "TGGAATTCTCGGGTGCCAAGG");
            break
	case 'qiaseq':
            params.putIfAbsent("clip_r1",0);
            params.putIfAbsent("three_prime_clip_r1",0);
            params.putIfAbsent("three_prime_adapter","AACTGTAGGCACCATCAAT");
            break
        case 'cats':
            params.putIfAbsent("clip_r1",3);
            params.putIfAbsent("three_prime_clip_r1", 0);
            params.putIfAbsent("three_prime_adapter", "AAAAAAAA");
            break
        case 'custom':
            params.putIfAbsent("clip_r1", params.clip_r1)
            params.putIfAbsent("three_prime_clip_r1", params.three_prime_clip_r1)
        default:
            log.warn "Please make sure to specify all required clipping and trimming parameters, otherwise only adapter detection will be performed."
        }

        log.warn "Running with Protocol ${params.protocol}"
        log.warn "Therefore using Adapter: ${params.three_prime_adapter}"
        log.warn "Clipping ${params.clip_r1} bases from R1"
        log.warn "And clipping ${params.three_prime_clip_r1} bases from 3' end"
    }

and conf/modules.config

    withName: '.*:FASTQ_FASTQC_UMITOOLS_FASTP:FASTP' {
        ext.args = [ "",
            params.trim_fastq                           ? "" : "--disable_adapter_trimming",
            params.clip_r1 > 0                          ? "--trim_front1 ${params.clip_r1}" : "", // Remove bp from the 5' end of read 1.                                                                                                                              
            params.three_prime_clip_r1 > 0              ? "--trim_tail1 ${params.three_prime_clip_r1}" : "", // Remove bp from the 3' end of read 1 AFTER adapter/quality trimming has been performed.                                                                 
            params.fastp_min_length > 0                 ? "-l ${params.fastp_min_length}" : "",
            params.fastp_max_length > 0                 ? "--max_len1 ${params.fastp_max_length}" : "",
            params.three_prime_adapter == "auto-detect" ?  "" : "--adapter_sequence ${params.three_prime_adapter}"
        ].join(" ").trim()
}

I would expect the parameter --adapter_sequence to be updated according to the ${params.three_prime_adapter} which is itself defined based on the protocol ?

Command used and terminal output

Apps/nextflow-23.10.1/nextflow run smrnaseq-2.3.1 \
  --input ${PROJ_DIR}/data/samplesheet.csv \
  --protocol 'illumina' \
  --mirtrace_species 'rno' \
  --genome 'Rnor_6.0' \
  --outdir ${PROJ_DIR}/results \
  -w ${PROJ_DIR}/results/work \
  -profile singularity \
  -c ${PROJ_DIR}/curie.config \
  -resume "

Relevant files

No response

System information

nextflow-23.10.1
smrnaseq-2.3.1

@nservant nservant added the bug Something isn't working label May 7, 2024
@christopher-mohr
Copy link
Contributor

Hi @nservant, thanks for reporting. You're definietly right, that the protocol option currently does not seem to have the expected effect. As you also mentioned, usually there should be no strong impact since the miRNA adapters are provided in the fasta file. However, if clipping is needed this might have some impact.

We are also running some additional tests internally to confirm this.

@lpantano
Copy link
Contributor

@christopher-mohr , let me know if you have some insight, I would be happy to work on this code if you figure out something else.

@lpantano
Copy link
Contributor

as well protocol is use specifically for mirtrace, independently of the other tools.

@lpantano lpantano moved this to Ready in smrnaseq Jul 11, 2024
@apeltzer
Copy link
Member

Fixed in #364, please give dev a try 👍🏻

@github-project-automation github-project-automation bot moved this from Ready to Done in smrnaseq Jul 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

No branches or pull requests

4 participants