Cid store quick wins #5945
Signed-off-by: jorgee <[email protected]>
Missing parts discussed today: Bugs:
Other:
|
Based on our discussion today, let's make it so that you don't need to specify
Here is a proposed structure for task inputs and outputs:
{
  "inputs": {
    "val": { /* map of val names/values */ },
    "env": { /* map of env names/values */ },
    "file": [ /* list of file/path inputs */ ],
    "stdin": "<hash>/command.in" // or null if not specified
  }
}
{
  "outputs": {
    "env": { /* map of env names/values */ },
    "eval": { /* map of eval names/values */ },
    "file": [ /* list of file/path outputs */ ],
    "stdout": "<hash>/command.out" // or null if not specified
  }
}
This is how I modeled task inputs/outputs in #4553 (an experiment for static types) and it worked well. See for example ProcessInputs and ProcessOutputs. |
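For readers who want to see the proposed shape in concrete form, here is a minimal Python sketch of the grouped structure. The class names `TaskInputs` and `TaskOutputs`, the helper `to_json`, and the sample values are assumptions made for illustration; they are not taken from the PR or from #4553.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional


@dataclass
class TaskInputs:
    # Hypothetical class; fields mirror the keys of the proposed "inputs" object.
    val: Dict[str, Any] = field(default_factory=dict)
    env: Dict[str, str] = field(default_factory=dict)
    file: List[str] = field(default_factory=list)
    stdin: Optional[str] = None  # e.g. "<hash>/command.in", or None if not specified


@dataclass
class TaskOutputs:
    # Hypothetical class; fields mirror the keys of the proposed "outputs" object.
    env: Dict[str, str] = field(default_factory=dict)
    eval: Dict[str, str] = field(default_factory=dict)
    file: List[str] = field(default_factory=list)
    stdout: Optional[str] = None  # e.g. "<hash>/command.out", or None if not specified


def to_json(key: str, record: Any) -> str:
    """Serialize a record under the given top-level key ("inputs" or "outputs")."""
    return json.dumps({key: asdict(record)}, indent=2)


if __name__ == "__main__":
    # Sample values are made up for the example.
    inputs = TaskInputs(val={"sample_id": "s1"}, file=["reads.fastq"])
    print(to_json("inputs", inputs))
```

Unset `stdin` and `stdout` serialize as `null`, which matches the "or null if not specified" note in the proposal.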
Signed-off-by: jorgee <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Let's move the open points into a new PR |
I'm not sure I agree with this; there should be a convenient way to access outputs for both workflows and tasks.
I think we should unify both inputs and outputs as lists of objects. The grouping may be a bit more readable for humans, but the real consumer will be the indexing sub-system. A flat, easily predictable scheme would likely simplify things. |
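As a rough illustration of the flat alternative, the sketch below converts a grouped structure into a single list of records that an indexer could walk uniformly. The record keys `type`, `name`, and `value` and the function name `flatten` are assumptions; the comment above does not specify a schema.

```python
from typing import Any, Dict, List


def flatten(grouped: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a grouped inputs/outputs object into a flat list of records.

    Each record has the hypothetical keys "type", "name" and "value".
    """
    flat: List[Dict[str, Any]] = []
    for kind, content in grouped.items():
        if isinstance(content, dict):
            # Named entries, e.g. val or env maps.
            for name, value in content.items():
                flat.append({"type": kind, "name": name, "value": value})
        elif isinstance(content, list):
            # Unnamed entries, e.g. file lists.
            for value in content:
                flat.append({"type": kind, "name": None, "value": value})
        elif content is not None:
            # Single values such as stdin/stdout; None entries are skipped.
            flat.append({"type": kind, "name": None, "value": content})
    return flat
```

Every record has the same keys, so a consumer does not need to know the groups in advance. The cost is that finding one named value requires a scan.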
@pditommaso I would not rush to try to fit tasks and workflows into the same model. They are similar but not exactly the same. Instead we should think about the best way to model task inputs and outputs on their own terms. The structure I propose, I think is actually easier to consume for both users and the data provenance. Rather than searching through a list, you simply look up a value by name in a map. For a workflow run,
But I don't know if all that is really needed. The simplest thing would be to just access output files as
@jorgee if you understand the structure I proposed well enough, can you make an initial PR for it? It should be easier to discuss there. |
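The lookup argument made above can be sketched in a few lines of Python. The output names and paths are invented for the example, and both helper functions are hypothetical.

```python
from typing import Any, Dict, List, Optional

# Made-up sample data: the same two outputs stored as a list and as a map.
outputs_as_list: List[Dict[str, Any]] = [
    {"name": "bam", "value": "results/a.bam"},
    {"name": "report", "value": "results/report.html"},
]
outputs_as_map: Dict[str, Any] = {
    "bam": "results/a.bam",
    "report": "results/report.html",
}


def find_in_list(items: List[Dict[str, Any]], name: str) -> Optional[Any]:
    """Scan the list until an entry with a matching name is found."""
    for item in items:
        if item["name"] == name:
            return item["value"]
    return None


def find_in_map(mapping: Dict[str, Any], name: str) -> Optional[Any]:
    """Look the value up directly by name."""
    return mapping.get(name)
```

Both helpers return the same value for a given name. The map version needs no scan, but it assumes every output has a unique name.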
Not sure it's worth it, at least in this milestone. It would be better to focus on simplicity and consistency, hence my suggestion to always use a collection for both inputs and outputs. |
I partially agree with both comments. I like maps, but I agree with @pditommaso that we should refer to outputs in the same way for workflows and tasks. So, Regarding the structure of the output, I was trying to model them as a List of Writing them to a list or a map depends on what we want to support. The reason why I changed workflow outputs to a map is that it is very easy to refer to a single output ( |
Should be a good topic to discuss on Friday. In my view, the correct model is also the simplest one in this case. |
Instant object instead of strings in model objects
taskRun and workflowRun instead of publishBy, runBy, etc.
WorkflowResults and TaskResults renamed to WorkflowOutputs and TaskOutputs