-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor channel identifiers and process names #45
Refactor channel identifiers and process names #45
Conversation
Hi! Thanks for the the efforts. I have checked the changes, and it seems that everything will run ok because it just changes the cases and nothing else, however a quick test is required to check if no typo is present. When skimming through it, I found two typos:
The only way I can think of testing it is running an example with a small bacterial genome and with a nanopore subset with fast5 to try the methylation calling process. As the second one using raw reads and fast5 can be a little demanding I can make these tests myself directly using the branch from your fork after the typos have been fixed and before merging it. 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review @fmalmeida - good catches! I've accomodated the change requests.
Hi @abhi18av, thank you for accomodating the changes and all the help. I will perform a rapid test to check if nextflow doesn't find another typo, but everything should run smoothly. As soon as the test finishes, I'll merge it in the Talking about the |
When running the test it complained about a $ nextflow run \
abhi18av/bacannot -r abhinav/refactor-channel-identifiers \
-profile test --threads 10 --output ABHINAV
Launching `abhi18av/bacannot` [astonishing_albattani] - revision: 0f58558832 [abhinav/refactor-channel-identifiers]
=================================================================
Container-based, fmalmeida/bacannot, Genome Annotation Pipeline
=================================================================
Input genomes : https://github.com/fmalmeida/bacannot/raw/master/test_data/samplesheet.yaml
Output dir : ABHINAV
Threads : 10
Blast % ID - Virulence Genes: 90
Blast query coverage - Virulence Genes: 80
Blast % ID - AMR Genes: 90
Blast query coverage - AMR Genes: 80
Blast % ID - ICEs or Phages: 65
Blast query coverage - ICEs or Phages: 65
Blast % ID - Plasmids: 90
Blast query coverage - Plasmids: 60
Pipeline Release: abhinav/refactor-channel-identifiers
Current home : /home/falmeida
Current user : falmeida
Current path : /work1/falmeida/bacannot_tests
Configuration file: /home/falmeida/.nextflow/assets/abhi18av/bacannot/nextflow.config
==============================================================
No such variable: flye_ch
-- Check script '/home/falmeida/.nextflow/assets/abhi18av/bacannot/./workflows/bacannot.nf' at line: 132 or see '.nextflow.log' file for more details The problem stated is in line 132 where you've changed it to PROKKA(parsed_inputs.annotation_ch.mix(flye_ch.out[1], unicycler_ch.out[1])) However, in this step, prokka module expects the outputs from assemblers modules, so it should be something like: PROKKA(parsed_inputs.annotation_ch.mix(FLYE.out[1], UNICYCLER.out[1])) |
Thanks for the catch again @fmalmeida #45 (comment) , I've addressed it in 03cde47 |
Hi @abhi18av, thanks for the change. I've tried it again and it now complained the following: Missing process or function with name 'iceberg'
-- Check script '/home/falmeida/.nextflow/assets/abhi18av/bacannot/main.nf' at line: 135 or see '.nextflow.log' file for more details The problem seems in line 209: // ICEs search
if (params.skip_iceberg_search == false) {
// ICEberg db
iceberg(PROKKA.out[4], PROKKA.out[3])
iceberg_output_ch = ICEBERG.out[1]
iceberg_output_2_ch = ICEBERG.out[2]
} else {
iceberg_output_ch = Channel.empty()
iceberg_output_2_ch = Channel.empty()
} |
Ah, okay. I've now fixed it with 6b77030 (Feliz Natal! 🎄🎅 ) |
Feliz Natal! 🌲 |
It almost worked, but it know complained about this line: /*
Eighth step -- Merge all annotations with the same Prefix value in a single Channel
*/
annotations_files_ch = PROKKA.out[3].join(PROKKA.out[1])
.join(MLST.out[0])
.join(BARRNAP.out[0])
.join(COMPUTE_GC.out[0])
.join(kofamscan_output_ch, remainder: true)
.join(vfdb_output_ch, remainder: true)
.join(victors_output_ch, remainder: true)
.join(amrfinder_output_ch, remainder: true)
.join(resfinder_gff_ch, remainder: true)
.join(rgi_output_ch, remainder: true)
.join(iceberg_output_ch, remainder: true)
.join(phast_output_ch, remainder: true)
.join(phigaro_output_2_ch, remainder: true)
.join(FIND_GIS.out[0], remainder: true)
// Contatenation of annotations in a single GFF file
MERGE_ANNOTATIONS(ANNOTATIONS_FILES.join(DIGIS.out[1], remainder: true)) The annotation files channel is not After that I believe it would work. |
Hey @fmalmeida , fixed that one now and keeping my 🤞 😄 As an aside, to speed up such code reviews Github has a couple features
Apart from this, happy to address any other issues. |
Ah, there's still another one you found. Apologies for this Felipe! 😅 I'd really appreciate if there's a light-weight testing data I could use on my end - hopefully this would avoid these issues in future |
Hi @abhi18av, No worries. I still not checked if it finished properly. I will check it again after 06/Jan. Then, I can also think about a small dataset to share with you. The dataset just can't be too small in order to properly test the majority of processes. For now, there are two 30X bacterial seq reads available with After coming back from vacations I will put more efforts on find a more appropriate testing dataset. Furthermore, Happy new year, and happy celebrations. All the best and thank you for helping with the code 🥳😁 |
The tests here have finished successfully, so I will now merge this PR. I have created a new issue #46 to handle this "problem" of a smaller dataset for quicker tests. 😄 |
* Split out the config profiles (#43) * Split out the config profiles * Resolve the conflict using the upstream config * Refactor channel identifiers and process names (#45) **Main change** Processes, channels and workflow names have had its case names changed in order to meet nextflow's community standards. **Commits** * Refactor names * refactor names in the workflows * Update workflows/bacannot.nf * Update workflows/bacannot.nf * Apply suggestions from code review * Update workflows/bacannot.nf wrt iceberg * Accomodate code review and fix the channel name * fix MERGE_ANNOTATIONS process case in line 331 Co-authored-by: Felipe Marques de Almeida <[email protected]> * added small dataset test profile * fixing name of quicktest profile * fixing urls of testing samplesheets * removing unnecessary files * added new action to test the pipeline from PR and updated docs about quicktest * fixed branch name in github action * fixing input typo in quicktest profile * adding more space to workflow env * fixed action as only 2 threads are available Co-authored-by: Abhinav Sharma <[email protected]>
* Split out the config profiles (#43) * Split out the config profiles * Resolve the conflict using the upstream config * trying to have a create dbs process * found way to download dbs * adding resfinder and plasmidfinder download rules * added phigaro, vfdb and amrfinder download rules * added last database download rules * added prokka HMM download rule * added label to use docker for tools * fixed argminer download * fixed victors download * fixed iceberg download * prokka using given database and bioconda image * docker specific for downloading databases * update packages * added mlst database download rule * add bacannot db info * also downloads PGAP db * added prokka and mlst * fix conditional * fixed pgap conditional * added barrnap * added 'compute_gc' module -- think about modules with two tools * adding identation * fixing label * first tries on conda envs * removing label * adding plasmidfinder * added platon * working until islandpath * added: VFDB * added: Victors * changing name and organization of MISC image * added: PHAST * added: phigaro * added: phispy * added: iceberg * added: kofamscan db download * removing named outputs as it will be solved in another PR * added: kofamscan from downloaded database * added: kegg_decoder * Refactor channel identifiers and process names (#45) **Main change** Processes, channels and workflow names have had its case names changed in order to meet nextflow's community standards. **Commits** * Refactor names * refactor names in the workflows * Update workflows/bacannot.nf * Update workflows/bacannot.nf * Apply suggestions from code review * Update workflows/bacannot.nf wrt iceberg * Accomodate code review and fix the channel name * fix MERGE_ANNOTATIONS process case in line 331 Co-authored-by: Felipe Marques de Almeida <[email protected]> * trying to brind amrfinder, card_rgi and argminer * trying to fix how the scale is calculated for amrfinder * changing scale to perl * typo fix * removing unnecessary comma * Update amrfinder.nf passing calculation of thresold into two decimals scale to groovy * properly working until amrfinder and card_rgi * added small dataset test profile * fixing name of quicktest profile * fixing urls of testing samplesheets * removing unnecessary files * fix missing label * adding resfinder to miscellaneous image * fixing gitignore * change manifest to upcoming version * updating db download workflow behaviour * removing .git dirs * fixing argminder download * properly added resfinder * adding script that parses nanopolish methyl call * removing unnecessary labels * starting to change how images are done * added perl tools Added: * prokka * barrnap * mlst * added misc module added: * gc_compute module * adding labels * added kofam analysis * added main pyenv added tools: * platon * plasmidfinder * fixed perlenv image Added tool: * islandpath * included virulence modules they use the miscellaneous image * pyenv image updated added prophage tools: * phast scan * phigaro *phispy * added iceberg db module it uses the miscellaneous image * added py36env image added tool: * rgi * added resistance tools added: * argminer db module * resfinder * amrfinderplus * added nanopolish * added refseq_masher * added digIS Docker image and modules were updated to run until digIS * added antismash * added sequence server * added merge_annotation module * added draw_gis modules * added gff2gbk module * added create_sql module * added first resource management labels * fixing resouce label for phigaro * little finx in env * fixing how tuples should be passed * arrived at jbrowse step * fixing draw_gis module input tuple * jbrowse added * Create test_pr.yml This is how new versions will be organised after the remodeling branch update, and, therefore, the test_pr must be updated. * fixed phast db incorporation * fixing inputs on report * Update test_pr.yml Will only act in ready for review PRs * adding ENV VAR for current version * adding scripts to automatically build images * fixed iceberg db incorporation * Update digIS.nf DIGis gff usage fixed * fixed vfdb incorporation * changing file to path input resolution * fixed victors db incorporation * changed channel names in main script * fixed argminer and prokka tables * fixed custom db annotations incorporation * fixed bacannot server loading * fixed custom db reports * begin incorporation of nf-core framework * nf-core libs have been added to the pipeline * custom database annotation added to JBrowse * fixed custom database gff generation * Update Dockerfile diminished image size by removing antismash db from inside it * Update docker.config uncommenting assemblers images * creating singularity profile * begin documentation update * added singularity profile * made image compatible with earlier versions * given 777 permissions to workdir * adjusted default values * fixed prokka to work with singularity * fixed rgi for singularity usage * updating PR test action * update targeted branches * limiting process resources * updated image for singularity * removing unused dbs in quicktest * not using big dbs in quicktest * Update resfinder.nf * adding gitpod config * fixed yml * fixing custom db report getattributefileld snippet * begin change to mkdocs * add requirements * Update .readthedocs.yml * Update .readthedocs.yml * Update .readthedocs.yml * Update .readthedocs.yml * update * update requirements * added index * fixed tags * added installation information * now on quickstart * changed some admonitions * added quickstart * added samplesheet page to mkdocs * included dir with images * Update samplesheet.md fixed header levels * outputs page added to mkdocs * Update standard.config removing "paralallel_jobs" definition * Update defaults.config fixed blast default values * updated and tested quickstart * updating gitpod.yml * creates a testing dir with more space * added profile selection information * Update manual.md manual updated for mkdocs and with new --ncbi_proteins parameter * Update nextflow_schema.json updated --custom_db parameter description * Update nextflow.config removed parameters that are in default and should not be in the boilerplate * added config file page * information about custom databases added * Update nextflow_schema.json * defaults need to be loaded before boilerplate * changing label of unicycler and flye * fixed antismash installation * fixed keggdecoder requires py36 * fixed resfinder module fixed db setup * pipeline fixed for docker Co-authored-by: Abhinav Sharma <[email protected]>
* Pipeline remodelling. Issues 24, 31 and 36 (#44) * Split out the config profiles (#43) * Split out the config profiles * Resolve the conflict using the upstream config * trying to have a create dbs process * found way to download dbs * adding resfinder and plasmidfinder download rules * added phigaro, vfdb and amrfinder download rules * added last database download rules * added prokka HMM download rule * added label to use docker for tools * fixed argminer download * fixed victors download * fixed iceberg download * prokka using given database and bioconda image * docker specific for downloading databases * update packages * added mlst database download rule * add bacannot db info * also downloads PGAP db * added prokka and mlst * fix conditional * fixed pgap conditional * added barrnap * added 'compute_gc' module -- think about modules with two tools * adding identation * fixing label * first tries on conda envs * removing label * adding plasmidfinder * added platon * working until islandpath * added: VFDB * added: Victors * changing name and organization of MISC image * added: PHAST * added: phigaro * added: phispy * added: iceberg * added: kofamscan db download * removing named outputs as it will be solved in another PR * added: kofamscan from downloaded database * added: kegg_decoder * Refactor channel identifiers and process names (#45) **Main change** Processes, channels and workflow names have had its case names changed in order to meet nextflow's community standards. **Commits** * Refactor names * refactor names in the workflows * Update workflows/bacannot.nf * Update workflows/bacannot.nf * Apply suggestions from code review * Update workflows/bacannot.nf wrt iceberg * Accomodate code review and fix the channel name * fix MERGE_ANNOTATIONS process case in line 331 Co-authored-by: Felipe Marques de Almeida <[email protected]> * trying to brind amrfinder, card_rgi and argminer * trying to fix how the scale is calculated for amrfinder * changing scale to perl * typo fix * removing unnecessary comma * Update amrfinder.nf passing calculation of thresold into two decimals scale to groovy * properly working until amrfinder and card_rgi * added small dataset test profile * fixing name of quicktest profile * fixing urls of testing samplesheets * removing unnecessary files * fix missing label * adding resfinder to miscellaneous image * fixing gitignore * change manifest to upcoming version * updating db download workflow behaviour * removing .git dirs * fixing argminder download * properly added resfinder * adding script that parses nanopolish methyl call * removing unnecessary labels * starting to change how images are done * added perl tools Added: * prokka * barrnap * mlst * added misc module added: * gc_compute module * adding labels * added kofam analysis * added main pyenv added tools: * platon * plasmidfinder * fixed perlenv image Added tool: * islandpath * included virulence modules they use the miscellaneous image * pyenv image updated added prophage tools: * phast scan * phigaro *phispy * added iceberg db module it uses the miscellaneous image * added py36env image added tool: * rgi * added resistance tools added: * argminer db module * resfinder * amrfinderplus * added nanopolish * added refseq_masher * added digIS Docker image and modules were updated to run until digIS * added antismash * added sequence server * added merge_annotation module * added draw_gis modules * added gff2gbk module * added create_sql module * added first resource management labels * fixing resouce label for phigaro * little finx in env * fixing how tuples should be passed * arrived at jbrowse step * fixing draw_gis module input tuple * jbrowse added * Create test_pr.yml This is how new versions will be organised after the remodeling branch update, and, therefore, the test_pr must be updated. * fixed phast db incorporation * fixing inputs on report * Update test_pr.yml Will only act in ready for review PRs * adding ENV VAR for current version * adding scripts to automatically build images * fixed iceberg db incorporation * Update digIS.nf DIGis gff usage fixed * fixed vfdb incorporation * changing file to path input resolution * fixed victors db incorporation * changed channel names in main script * fixed argminer and prokka tables * fixed custom db annotations incorporation * fixed bacannot server loading * fixed custom db reports * begin incorporation of nf-core framework * nf-core libs have been added to the pipeline * custom database annotation added to JBrowse * fixed custom database gff generation * Update Dockerfile diminished image size by removing antismash db from inside it * Update docker.config uncommenting assemblers images * creating singularity profile * begin documentation update * added singularity profile * made image compatible with earlier versions * given 777 permissions to workdir * adjusted default values * fixed prokka to work with singularity * fixed rgi for singularity usage * updating PR test action * update targeted branches * limiting process resources * updated image for singularity * removing unused dbs in quicktest * not using big dbs in quicktest * Update resfinder.nf * adding gitpod config * fixed yml * fixing custom db report getattributefileld snippet * begin change to mkdocs * add requirements * Update .readthedocs.yml * Update .readthedocs.yml * Update .readthedocs.yml * Update .readthedocs.yml * update * update requirements * added index * fixed tags * added installation information * now on quickstart * changed some admonitions * added quickstart * added samplesheet page to mkdocs * included dir with images * Update samplesheet.md fixed header levels * outputs page added to mkdocs * Update standard.config removing "paralallel_jobs" definition * Update defaults.config fixed blast default values * updated and tested quickstart * updating gitpod.yml * creates a testing dir with more space * added profile selection information * Update manual.md manual updated for mkdocs and with new --ncbi_proteins parameter * Update nextflow_schema.json updated --custom_db parameter description * Update nextflow.config removed parameters that are in default and should not be in the boilerplate * added config file page * information about custom databases added * Update nextflow_schema.json * defaults need to be loaded before boilerplate * changing label of unicycler and flye * fixed antismash installation * fixed keggdecoder requires py36 * fixed resfinder module fixed db setup * pipeline fixed for docker Co-authored-by: Abhinav Sharma <[email protected]> * removing travis CI file * Update CHANGELOG.md * wget tries 10 times before hanging * fixed indentation * database download modules retry 2 times before raising errors * Delete update_database_image.sh script not useful anymore * Update manual.md * Update README.md * Update manual.md * update bacannot_db information * Update manual.md * to avoid errors, load either pgap or tigrfam in prokka Instead of loading both, as PGAP contains TIGRFAM, either load one or another. User selects. Default is loading TIGRFAM which is smaller. Also updated information about it. * Update manual.md * updated prokka db information * Update manual.md * Update README.md * Update manual.md * Update manual.md * Update manual.md Co-authored-by: Abhinav Sharma <[email protected]>
This PR builds upon the discussion done in #38 .
I'm not sure how to test locally, so I've initiated this PR to track all changes.
Unless I have ignored something crucial, as of now the PR implements the refactor of identifiers used in the modules/workflows.
Happy to accomodate any change request :)