Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subworkflow for organelles #18

Merged
merged 35 commits into from
Oct 31, 2023
Merged

Subworkflow for organelles #18

merged 35 commits into from
Oct 31, 2023

Conversation

ksenia-krasheninnikova
Copy link
Contributor

Implements assembling mito from raw reads and from assembly

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the sanger-tol/genomeassembly branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@github-actions
Copy link

github-actions bot commented Jun 29, 2023

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 10093d9

+| ✅ 131 tests passed       |+
#| ❔  18 tests were ignored |#
!| ❗  14 tests had warnings |!

❗ Test warnings:

  • files_exist - File not found: conf/igenomes.config
  • readme - README contains the placeholder zenodo.XXXXXXX. This should be replaced with the zenodo doi (after the first release).
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your prefered methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in WorkflowMain.groovy: Add Zenodo DOI for pipeline after first release
  • pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
  • pipeline_todos - TODO string in base.config: Check the defaults for all processes
  • pipeline_todos - TODO string in base.config: Customise requirements for specific processes.
  • pipeline_todos - TODO string in output.md: Write this documentation describing your workflow's output
  • pipeline_todos - TODO string in usage.md: Add documentation about anything specific to running your pipeline. For general topics, please point to (and add to) the main nf-core website.
  • pipeline_todos - TODO string in genomeassembly.nf: Add all file path parameters for the pipeline to the list below
  • system_exit - System.exit in WorkflowSanger-tol-genomeassembly.groovy: System.exit(1) [line 51]
  • schema_description - No description provided in schema for parameter: timestamp
  • schema_description - No description provided in schema for parameter: hifiasm
  • schema_description - No description provided in schema for parameter: hifiasmhic

❔ Tests ignored:

  • files_exist - File is ignored: assets/nf-core-genomeassembly_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-genomeassembly_logo_light.png
  • files_exist - File is ignored: docs/images/nf-core-genomeassembly_logo_dark.png
  • files_exist - File is ignored: .github/ISSUE_TEMPLATE/config.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • nextflow_config - Config variable ignored: manifest.name
  • nextflow_config - Config variable ignored: manifest.homePage
  • files_unchanged - File ignored due to lint config: LICENSE or LICENSE.md or LICENCE or LICENCE.md
  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/ISSUE_TEMPLATE/bug_report.yml
  • files_unchanged - File ignored due to lint config: .github/workflows/linting.yml
  • files_unchanged - File ignored due to lint config: assets/sendmail_template.txt
  • files_unchanged - File does not exist: assets/nf-core-genomeassembly_logo_light.png
  • files_unchanged - File does not exist: docs/images/nf-core-genomeassembly_logo_light.png
  • files_unchanged - File does not exist: docs/images/nf-core-genomeassembly_logo_dark.png
  • files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore or pyproject.toml

✅ Tests passed:

Run details

  • nf-core/tools version 2.8
  • Run at 2023-10-30 10:20:45

Copy link
Contributor

@priyanka-surana priyanka-surana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's test this after merging the other two PRs and then good to move forward. You might want to add associated documentation to this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the module you are trying to add to nf-core?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the one

code)
}

emit:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can some of the channel engineering be moved above for a cleaner emit section? It is very hard to follow right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems it looks better now.

Copy link
Contributor

@priyanka-surana priyanka-surana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to make the changes and merge this PR? And do the changes for CI testing in a separate PR?

subworkflows/local/organelles.nf Outdated Show resolved Hide resolved
@ksenia-krasheninnikova
Copy link
Contributor Author

Removed all redundant files from the emit section. I think it can be merged in now. Thanks @priyanka-surana

Copy link
Contributor

@priyanka-surana priyanka-surana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple minor changes requested.

nextflow.config Outdated
@@ -19,6 +19,7 @@ params {

// Assembly options
hifiasm_hic_on = false
organelles_on = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are recommended to keep the default as false and then turn on via command line flags or the config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok


emit:

versions = ch_versions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now there are no output files. Is this intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we need this files in the final output, we don't need them down the workflow. Initially I thought the emit section has to do something with the files being published, but actually it works fine the way it is now.

@muffato
Copy link
Member

muffato commented Oct 17, 2023

I find this emit section confusing and difficult to navigate. Could be just me since I am not familiar with MitoHifi. Looking for a more informed view @muffato since you are familiar with the tool.

I'm not that familiar with MitoHifi :) !

I can see you cleaned up the entire emit section. It simplifies things a lot ! For the record, I had some questions about that structure:

  1. Could you make the MitoHifi module take a single input like this:
    1. Define it as tuple val(meta), path(input) rather than the mutually-exclusive reads and contigs
    2. Use $args $input in the MitoHifi command-line rather than ${run_type}
    3. First check that $args does contain either -c or -r (one of them is mandatory)
  2. Then, do you need this one subworkflow to choose between the assembly and read mode and duplicate the input and output channels ? Could you imagine having a single subworkflow that is agnostic of the mode (just one input channel and one et of output channels), and in the parent workflow you would import it twice: once as ORGANELLES_READS and once as ORGANELLES_ASM

@ksenia-krasheninnikova
Copy link
Contributor Author

ksenia-krasheninnikova commented Oct 20, 2023

  1. Could you make the MitoHifi module take a single input like this:

    1. Define it as tuple val(meta), path(input) rather than the mutually-exclusive reads and contigs
    2. Use $args $input in the MitoHifi command-line rather than ${run_type}
    3. First check that $args does contain either -c or -r (one of them is mandatory)
  2. Then, do you need this one subworkflow to choose between the assembly and read mode and duplicate the input and output channels ? Could you imagine having a single subworkflow that is agnostic of the mode (just one input channel and one et of output channels), and in the parent workflow you would import it twice: once as ORGANELLES_READS and once as ORGANELLES_ASM

Thank you @muffato for suggestions. I've updated the code, could you have a look if it's what you mean please? I had to put -r/-c into ext.args2 because the input file should follow the corresponding option immediately:

usage: MitoHiFi [-h] (-r <reads>.fasta | -c <contigs>.fasta) -f
                <relatedMito>.fasta -g <relatedMito>.gbk -t <THREADS> [-d]
                [-a {animal,plant,fungi}] [-p <PERC>] [-m <BLOOM FILTER>]
                [--max-read-len MAX_READ_LEN] [--mitos]
                [--circular-size CIRCULAR_SIZE]
                [--circular-offset CIRCULAR_OFFSET] [-winSize WINSIZE]
                [-covMap COVMAP] [-v] [-o <GENETIC CODE>]

The purpose of the original implementation of the organelles workflow was to separate the whole organelles logic into the separate workflow to lighten up the main workflow. My concern though is that the main workflow getting rather long. Is it a standard practice to keep it that way? It will become more readable when I add documentation.

Copy link
Member

@muffato muffato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. I find it much more straightforward !
Can you propose your updated mitohifi/mitohifi module to nf-core (can do that after this PR) and suggest mahesh-panchal and scorreard for review ? They both work on genome-assembly pipelines.

modules/local/get_calcuts_params.nf Outdated Show resolved Hide resolved
@ksenia-krasheninnikova
Copy link
Contributor Author

@muffato
Merged it updated mitohifi module from nf-core. All the review items must have been addressed for now!
PR ready for the final review.
Thanks!

Copy link
Member

@muffato muffato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you said, the module is now in sync with nf-core, meaning no local .diff file.

All good !

@ksenia-krasheninnikova ksenia-krasheninnikova merged commit 814f6cd into dev Oct 31, 2023
6 checks passed
@muffato muffato deleted the mitohifi branch November 21, 2023 22:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants