Allow to add custom traces and use them as metadata #4425
Signed-off-by: Jordi Deu-Pons <[email protected]>
I very much like the idea of "custom traces". That can be used to collect input sizes (for instance size in bytes of some files, or number of lines, etc). Question about the interface |
What would be a use-case for this PR? |
I see two use cases:
|
It's true that when you use |
TODO:
|
I'm not convinced we should proceed down this path. It would lead to even more do-it-yourself metadata, which the Nextflow runtime would be totally unaware of. I think we should take a more controlled approach.
Let's take this discussion back to the original issue #4386 |
I've looked more into this, and while the idea is interesting and we should pursue it to some extent, my main concern, among other things, is that it's very fragile. A single extra newline introduced by a custom command is enough to break the trace file. Also, an invalid command can break the command wrapper script and alter the collection of time metrics. If we want to go ahead with the idea of collecting custom metadata by running third-party tools, it should be independent of the trace mechanism.
When you say independent... do you mean to isolate them in a new function at the |
Yes, possibly. Likely it should be executed just before or after the task command. #540 |
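The fragility concern raised above can be illustrated with a small Python sketch. It assumes a line-oriented `key=value` trace file; the parser, the `sanitize` helper, and the sample tool output are all hypothetical and are not part of this PR or of Nextflow's actual implementation.

```python
def parse_trace(text):
    """Parse 'key=value' lines; raise on any line that does not fit that shape."""
    record = {}
    for line in text.splitlines():
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"malformed trace line: {line!r}")
        record[key] = value
    return record


def sanitize(value):
    """Collapse all whitespace, including newlines, into single spaces."""
    return " ".join(value.split())


# Hypothetical multi-line output from a third-party tool
tool_output = "samtools 1.17\nUsing htslib 1.17\n"

# Writing the raw output spills a second line into the trace file
broken = f"realtime=12\ntool_version={tool_output}"
# Sanitizing first keeps one record per line
safe = f"realtime=12\ntool_version={sanitize(tool_output)}\n"
```

Parsing `broken` raises `ValueError` on the stray `Using htslib 1.17` line, while `safe` parses cleanly. Sanitizing is only one possible mitigation; collecting custom metadata outside the trace mechanism, as suggested in the comments, avoids the problem altogether.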
Description
This PR allows the user to add custom traces to each process. It also exposes them as workflow metadata, so the user can access them without any extra channel.
This is a more generic approach to solving the version tracking problem described in #4386.
The idea is to allow the user to collect custom traces before running the command. The custom traces are defined as a map where each key is the name of a custom trace and each value is a string containing the bash script to run to collect that trace. The output is always parsed as a string.
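As a rough illustration of the map described above, here is a minimal Python sketch that runs each script and keeps its output as a string. The function name, the trace names, and the commands are hypothetical, and stripping surrounding whitespace is an assumption; none of this reflects the PR's actual implementation.

```python
import subprocess


def collect_custom_traces(trace_scripts):
    """Run each shell snippet and return {trace name: output as a string}."""
    results = {}
    for name, script in trace_scripts.items():
        completed = subprocess.run(
            script, shell=True, capture_output=True, text=True, check=False
        )
        # Output is always kept as a string; stripping whitespace is an assumption
        results[name] = completed.stdout.strip()
    return results


# Hypothetical trace names mapped to the shell script that collects each one
custom_traces = {
    "greeting": "echo hello",
    "line_count": "printf 'a\\nb\\nc\\n' | wc -l",
}
traces = collect_custom_traces(custom_traces)
```

Each value in `traces` stays a string, so a numeric result such as a line count would need an explicit conversion by whoever consumes it.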
Then all the traces of the completed and resumed tasks are available as workflow metadata at `workflow.traces`. If `workflow.traces` is used as part of a process script, it will contain all the traces of the tasks completed just before that process is submitted.
You can use `workflow.traces` at `workflow.onComplete` if you want to store custom traces in a file or do something else.
Example pipeline
See the whole pipeline here
main.nf
Notes
`workflow.traces` is an array of `TraceRecord` objects for the completed tasks.
`TraceRecord` is currently used to expose traces to the user; it would be better to expose them through a custom, more user-friendly class.