WIP - BEP 028 prov #750

Open
wants to merge 3 commits into
base: master

yibeichan
Collaborator

This PR was made during the 2024 BIDS meeting in Seattle after discussions with @effigies. Related discussion can be found in bids-standard/BEP028_BIDSprov#129.
For the specification, see the BEP028_prov doc.

Types of changes

context.json file

  • Replaced the openprov context with the BEP028 prov context.
  • Added the custom fields wasStartedBy, wasEndedBy, and hadActivity.
    • These three were used in audit.py but not defined in the prov context. They are important for connecting audit_start, monitor, and finalize_audit.
    • If we view audit_start, monitor, and finalize_audit as subactivities of a given task's activity, we need a way to link activities directly. The current BEP028 does not define such a connection; it only allows two activities to be connected through entities.
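As a rough illustration of the second bullet, the three fields could be declared as JSON-LD context terms along these lines. This is only a sketch: it assumes each field maps onto the PROV-O property of the same name, and `extend_context` is a hypothetical helper, not something in pydra. The actual entries in this PR's context file may differ.

```python
import json

PROV = "http://www.w3.org/ns/prov#"

# Hypothetical context entries. The field names come from the PR description;
# mapping them to the PROV-O IRIs of the same name is an assumption.
extra_terms = {
    "wasStartedBy": {"@id": PROV + "wasStartedBy", "@type": "@id"},
    "wasEndedBy": {"@id": PROV + "wasEndedBy", "@type": "@id"},
    "hadActivity": {"@id": PROV + "hadActivity", "@type": "@id"},
}


def extend_context(context: dict, terms: dict) -> dict:
    """Return a new JSON-LD context document with extra terms merged in."""
    merged = dict(context.get("@context", {}))
    merged.update(terms)
    return {"@context": merged}


# A stand-in base context; the real one would be loaded from the context file.
base = {"@context": {"prov": PROV}}
extended = extend_context(base, extra_terms)
print(json.dumps(extended, indent=2))
```

With `"@type": "@id"`, the values of these fields are read as references to other nodes, which is what linking one activity to another needs.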

audit.py

  • Fixed typos, e.g. changed startedAtTime to StartedAtTime.
  • Dropped entity_generated and replaced it with a more specific field. Here that field is runtime, but it can be something else depending on the situation. entity_generated is easy to confuse with entity in the prov doc, which is supposed to refer to the input or output files of an activity.
  • Changed "AssociatedWith": version_cmd to a dictionary, which is supposed to be an Agent. I couldn't get the Label, which is the name of the software.
    • I added a comment about command: currently we only get command when the task is a shell command task. To be consistent with the prov doc, we should probably enable command for function tasks too. See the anonymous (that's Chris) comment on the prov doc.

Summary

The changes above are based on what we have in the prov doc. We probably need to dive deeper into audit.py and messenger.py (worth discussing in the next pydra meeting, @djarecka).
@effigies suggested that we collect the messages for a workflow to generate prov records.
For example, we can collect all messages at the finalize_audit level into FileMessenger using collect_messages.
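To make the idea of collecting messages concrete, here is a standard-library sketch that merges per-step message files into one record. It is not pydra's `collect_messages` implementation: `collect_into_graph`, the file naming, and the record fields are all hypothetical, and it assumes every message shares the same context.

```python
import json
import tempfile
from pathlib import Path


def collect_into_graph(message_dir: Path) -> dict:
    """Merge every *.jsonld message file in message_dir into one @graph document."""
    graph = []
    for path in sorted(message_dir.glob("*.jsonld")):
        message = json.loads(path.read_text())
        # Assumption: all messages share one context, so the per-message copy is dropped.
        message.pop("@context", None)
        graph.append(message)
    return {"@graph": graph}


with tempfile.TemporaryDirectory() as tmp:
    message_dir = Path(tmp)
    # Write one made-up message per audit step, prefixed so that sorting keeps the order.
    for i, step in enumerate(["audit_start", "monitor", "finalize_audit"]):
        record = {"@context": "context.jsonld", "@id": f"urn:example:{step}", "Label": step}
        (message_dir / f"{i:02d}_{step}.jsonld").write_text(json.dumps(record))
    collected = collect_into_graph(message_dir)

print([node["Label"] for node in collected["@graph"]])
```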

Checklist

  • I have added tests to cover my changes (if necessary)
  • I have updated documentation (if necessary)

@satra
Contributor

satra commented Apr 12, 2024

@yibeichan - i don't think we should drop the prov context - we are still based on the prov model. that's more general and the bids context should still conform to prov.

@yibeichan
Collaborator Author

@satra do you mean the original one, the openprov context?
So we don't have to directly use or cite the BEP028 prov context in pydra?

@satra
Contributor

satra commented Apr 12, 2024

essentially bids prov was a simplification to keep the keys readable to people and perhaps should stay. however, the openprov context should be included or referenced.

here is a linkml based context generated for prov: https://github.com/linkml/linkml-prov/blob/main/prov/jsonld/prov.model.context.jsonld

technically speaking we should be able to generate whatever we want as prov and then transform it to bep28 if we wanted using the bids context. pydra doesn't have to generate bep28, but could result in bep28. the following should work.

pydra jsonld + context -> expand -> compact/frame using bep28 context.
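The pipeline above can be illustrated with a toy version of expand and compact. This is a deliberately simplified stand-in that only renames top-level keys through flat term-to-IRI maps; it is not the JSON-LD algorithm, and a real implementation would use a JSON-LD library such as PyLD. Both context dictionaries and the record are made up for illustration, and the IRI chosen for each BEP028 key is an assumption.

```python
PROV = "http://www.w3.org/ns/prov#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical flat contexts: term -> IRI. Real contexts are richer than this.
pydra_context = {
    "startedAtTime": PROV + "startedAtTime",
    "wasAssociatedWith": PROV + "wasAssociatedWith",
    "label": RDFS + "label",
}
bep28_context = {
    "StartedAtTime": PROV + "startedAtTime",
    "AssociatedWith": PROV + "wasAssociatedWith",
    "Label": RDFS + "label",
}


def expand(doc: dict, context: dict) -> dict:
    """Replace short keys with full IRIs; keys missing from the context are kept as-is."""
    return {context.get(key, key): value for key, value in doc.items()}


def compact(doc: dict, context: dict) -> dict:
    """Replace full IRIs with the short keys defined by the target context."""
    inverse = {iri: term for term, iri in context.items()}
    return {inverse.get(key, key): value for key, value in doc.items()}


record = {"startedAtTime": "2024-04-12T10:00:00", "label": "example task"}
expanded = expand(record, pydra_context)
bep28_record = compact(expanded, bep28_context)
print(bep28_record)
```

Because both contexts point at the same IRIs, the expanded form acts as the shared intermediate: pydra can emit its own keys and still end up with BEP028-style keys after compaction.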

@djarecka
Collaborator

@satra - what is the relation between the context from the openprov repo and the one from linkml? Do you suggest switching to the one from linkml?
It seems to me that openprov lacks many useful objects, like wasGeneratedBy, that are part of bids-prov.

When you say transformation to bids-prov, you mean we would have to ignore some of the properties? I understand that they both point to the prov model; only the coverage is different.

@satra
Contributor

satra commented Apr 12, 2024

the linkml one looks more comprehensive and since chris did it, i have some trust in it as well. so i would lean towards using it and that also allows us to use the linkml model for other things.
