Separate "index" and "search" into substeps for nextflow pipeline #10
Here's what the CLI looks like so far. Is it clear that
My thoughts:
Or maybe
Here's what the command line looks like so far:
Gave this some more thought. Leaving out rocksdb for a sec, this is what I see
Then for
and for
Am I missing any input or output files here?
@heuermh I really like this naming! Thank you so much! I have a minor correction and a question.
For
Example visualization:
Thanks for the clarification! Note some tools may write out partitioned Parquet format, as in one or more partition files in a directory. E.g. duckdb does this when you specify
I presume under the hood those two subcommands are exactly the same? The query vs target distinction is more something to be handled by the caller, I think.
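To make the partitioned Parquet point concrete, here is a minimal Python sketch of how a caller might accept either a single Parquet file or a directory of partition files. The helper name `resolve_parquet_inputs` is hypothetical and not part of this project, and the sketch assumes partition files end in `.parquet`.

```python
from pathlib import Path


def resolve_parquet_inputs(path):
    """Return the Parquet files behind `path` as a sorted list.

    Hypothetical helper (not from this project). Assumes partition
    files carry a `.parquet` suffix and may sit in nested directories.
    """
    p = Path(path)
    if p.is_dir():
        # Partitioned output: collect every partition file, including
        # files inside nested partition subdirectories.
        files = sorted(f for f in p.rglob("*.parquet") if f.is_file())
        if not files:
            raise FileNotFoundError(f"no .parquet files found under {p}")
        return files
    if p.is_file():
        # Single-file output: wrap it so callers always get a list.
        return [p]
    raise FileNotFoundError(f"{p} does not exist")
```

With something like this, downstream steps could iterate over the returned list without caring whether the upstream tool wrote one file or many.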
Addresses #8 and #7. Separating out these steps will make writing Nextflow pipelines easier. Having the commands wrapped is nice, but some of the steps currently take a long time (e.g. #9 has been running for ~7 days, boo), so it would be good to have them be separate Nextflow processes in https://github.com/seanome/nf-core-kmerseek/
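As a rough illustration of the split, here is a minimal Python `argparse` sketch with separate `index` and `search` subcommands, so that each could run as its own Nextflow process. Only the subcommand names come from the PR title. The program name and the positional arguments (`fasta`, `query`, `target_index`) are hypothetical placeholders, not this project's actual interface.

```python
import argparse


def build_parser():
    """Build a parser with separate `index` and `search` subcommands.

    Illustrative only: argument names below are assumptions.
    """
    parser = argparse.ArgumentParser(prog="example-cli")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Long-running step: build an index once, then reuse it.
    index = subparsers.add_parser("index", help="build an index from sequences")
    index.add_argument("fasta", help="input sequence file (hypothetical)")

    # Separate step: search a query against a previously built index.
    search = subparsers.add_parser("search", help="search a query against an index")
    search.add_argument("query", help="query input (hypothetical)")
    search.add_argument("target_index", help="previously built index (hypothetical)")

    return parser
```

Keeping the two steps as distinct subcommands means a pipeline can cache or resume the slow step independently of the fast one.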