Improve -dump-hashes output #4369
Signed-off-by: Ben Sherman <[email protected]>
To tell the truth, I find this output much more confusing compared to the current one.
Why? It's the same content, but on a single line per task. And it's not really meant to be viewed in that form... imagine a pipeline run with 10,000 tasks: you wouldn't be reading those logs directly. Instead, you should use the bash script to extract and sort the logs, then use a diff viewer to highlight the changes.
Because the current format allows you to diff two logs and find where they diverge. With the new one, almost everything is on one row.
In other words, there's no point in making the output prettier, because the tasks will be logged in arbitrary order anyway. You have to take some extra steps to get it into a usable format, which primarily means one line per task.
It can be added as an optional format, but it should be human-readable first.
It's still human-readable though.
But you can't really diff the logs, because the tasks are printed in arbitrary order. You have to re-order the tasks manually. I don't see the issue with putting everything on a single line; just use line wrap in your editor.
I don't see how your screenshot illustrates your point... it is mostly noise 😄 Fair execution would improve things somewhat, but for large pipelines you could still have different processes executing in arbitrary order. So the current format is still very difficult to use. But I agree that I would rather not layer hacks upon hacks. That is why in the original issue I suggested that we find a way for Nextflow to automate the whole thing and produce the final diff:
But that solution is more involved, and restructuring the hash dump to one line per task is a simple fix that greatly improves the overall debugging experience, even if some manual effort is still required. So here we are 😄 If it will make you happy, I will change the one-line format to JSON and make it opt-in with
Okay, it's ready.
Signed-off-by: Paolo Di Tommaso <[email protected]>
modules/nextflow/src/main/groovy/nextflow/processor/TaskProcessor.groovy
…sor.groovy [ci skip] Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Ben Sherman <[email protected]> Signed-off-by: Paolo Di Tommaso <[email protected]> Co-authored-by: Paolo Di Tommaso <[email protected]>
Close #4367
Re-structures the -dump-hashes output to be on a single line per task, which makes it easier to produce a diff from the log files. To test it, do an initial run, modify the pipeline, then a resumed run:
Then run the following bash script to filter the logs and produce the diff:
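The script itself did not survive in this copy of the thread. As a rough sketch of the extract-and-sort step described in the conversation, the helper below keeps only the hash records from a log file and sorts them by task name so that two runs line up in a diff. The function name `extract_hashes` and the file names in the usage comment are hypothetical, and the sketch assumes each task's record sits on one line of the form `[task name] cache hash: ...`, preceded by an arbitrary logger prefix.

```bash
# Hypothetical helper: print one hash record per task, sorted by task name.
# Assumption: each record is a single line containing "[<task>] cache hash:",
# possibly preceded by a timestamp/thread/logger prefix, which is stripped.
extract_hashes() {
    grep 'cache hash:' "$1" \
        | sed 's/^.*\(\[[^]]*\] cache hash:\)/\1/' \
        | LC_ALL=C sort
}

# Usage sketch (file names are placeholders, not taken from the PR):
#   extract_hashes initial.log > initial.tasks.log
#   extract_hashes resumed.log > resumed.tasks.log
#   diff initial.tasks.log resumed.tasks.log
```

Sorting matters more than formatting here: because tasks are logged in arbitrary order, two raw logs rarely align, whereas sorted one-line records differ only where a task's hash or its entries actually changed.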