Initial draft of the SSSOM/RDF spec. #469
base: master
This is the complete proposal for the specification of the SSSOM/RDF serialisation format, according to the current state of the discussions about it.
This is a great start. Let's go back and forth a bit over this; I made my first round of comments, the biggest bomb being to specify a bespoke serialisation of curie_map.
Use BCP14 keywords more consistently. Add a "special consideration" section to explain the possibility of injecting "direct" SPO triples.
Some cosmetic things; I will ask two reviewers to chime in so we can merge this asap!
This is huge work, thanks @gouttegd!! Much appreciated!
I'm going to run prettier on the markdown before reviewing it, after Damien has a chance to address your suggestions.
Clarify that a "string literal" is a `xsd:string` literal -- this has the side-effect of clarifying that it cannot be a langString. Also fix incorrect use of pav:authoredBy to represent the creator_id slot.
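The distinction this commit draws can be illustrated with a small Python sketch. In RDF 1.1, a literal carrying a language tag has the datatype `rdf:langString`, while a plain literal without a tag has the datatype `xsd:string`. The `Literal` dataclass and the `is_string_literal` helper below are hypothetical names made up for this illustration; they are not part of SSSOM-Py, SSSOM-Java, or any RDF library.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


@dataclass(frozen=True)
class Literal:
    """Hypothetical minimal model of an RDF literal (illustration only)."""
    lexical: str
    datatype: Optional[str] = None   # datatype IRI, if explicitly given
    language: Optional[str] = None   # language tag, e.g. "en"

    def effective_datatype(self) -> str:
        # A language-tagged literal is always an rdf:langString.
        if self.language is not None:
            return RDF_LANGSTRING
        # A literal with neither tag nor datatype defaults to xsd:string.
        return self.datatype or XSD_STRING


def is_string_literal(lit: Literal) -> bool:
    """True only for xsd:string literals, which excludes langStrings."""
    return lit.effective_datatype() == XSD_STRING


print(is_string_literal(Literal("exact match")))
print(is_string_literal(Literal("exact match", language="en")))
```

Under this reading, a value such as `"exact match"@en` would not qualify as a "string literal" in the sense the commit message describes.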
@cthoyt As you wish, but please note that you can also view a “rendered” version directly on the branch: https://github.com/mapping-commons/sssom/blob/rdf-spec/src/docs/spec-formats-rdf.md
This command: `npx prettier --prose-wrap always --check --write src/docs/spec-formats-rdf.md`
Thanks all for being patient and @gouttegd for writing this important groundwork. I had a full read and left some minor comments.
One larger question remains: what is the role of this specification? Is the goal to make sure that we can implement RDF serialization outside of a LinkML context (which I consider very important; it's not reliable to defer to that implicitly)?
In many cases, I think we should be much more explicit. Let's consider, in a follow-up, developing an example suite of SSSOM/TSV and SSSOM/RDF input/output pairs against which a serializer can be validated.
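As a rough idea of what validating a serializer against such a suite could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `run_suite` and `normalise` names, the in-memory `cases` dictionary, and the choice to compare outputs as sorted N-Triples lines. That comparison is only adequate for output without blank nodes; a real suite would more likely need a proper graph isomorphism check from an RDF library.

```python
from typing import Callable, Dict, List, Tuple


def normalise(ntriples: str) -> List[str]:
    """Sort non-empty, stripped lines so that triple order does not matter."""
    return sorted(line.strip() for line in ntriples.splitlines() if line.strip())


def run_suite(
    serialise: Callable[[str], str],
    cases: Dict[str, Tuple[str, str]],
) -> List[str]:
    """Run a serialiser over (TSV input, expected RDF) pairs.

    Returns the names of the cases whose output differs from the expectation.
    """
    failures = []
    for name, (tsv_input, expected_rdf) in cases.items():
        if normalise(serialise(tsv_input)) != normalise(expected_rdf):
            failures.append(name)
    return failures
```

A serializer under test would be passed in as the `serialise` callable, and an empty result would mean that every pair in the suite matched.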
Err, to allow developers to write SSSOM/RDF serialisers and deserialisers? The same way the SSSOM/TSV specification is there to allow developers to write SSSOM/TSV serialisers and deserialisers. What else do you think the specification of a format is for?
Yes. Because LinkML is made by and for Python developers. The LinkML runtime (with its built-in RDF serialisers and deserialisers) is only available in Python – there is no support whatsoever for any other language. Programmers in other languages have to implement RDF serialisation “outside of a LinkML context”, because there is no such thing as “a LinkML context” for them. And as if that were not enough, it so happens that LinkML barely bothers to fully describe the way its RDF serialiser and deserialiser work (how objects described in a LinkML schema are turned into an RDF graph, or read from an RDF graph), which means that in practice, without the formal specification that we are trying to make here, the only way for someone wishing to read/write SSSOM/RDF files while having the silly idea of not working in Python is to reverse-engineer SSSOM-Py – that’s what I had to do when I added RDF support in SSSOM-Java. I highlighted at the very beginning of the discussion about the RDF serialisation that this was not acceptable for something claiming to be a “standard”.
Besides, the LinkML-generated serialization does not always do what we’d want. For example, it will serialize a mapping record with an explicit `record_id` as

    [] a owl:Axiom ;
      owl:annotatedSource UBERON:0000001 ;
      owl:annotatedProperty semapv:crossSpeciesExactMatch ;
      owl:annotatedTarget FBbt:00000001 ;
      sssom:mapping_justification semapv:ManualMappingCuration ;
      sssom:record_id "https://example.org/mymapping1" .

whereas it was quickly agreed in the discussion about the RDF serialisation that, whenever a `record_id` is present, the record should be serialised as

    <https://example.org/mymapping1> a owl:Axiom ;
      owl:annotatedSource UBERON:0000001 ;
      owl:annotatedProperty semapv:crossSpeciesExactMatch ;
      owl:annotatedTarget FBbt:00000001 ;
      sssom:mapping_justification semapv:ManualMappingCuration .
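The difference between the two serialisations above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the behaviour of SSSOM-Py or SSSOM-Java: the `mapping_to_turtle` function and the dictionary keys are hypothetical, prefix declarations are omitted, and the Turtle is produced by plain string formatting rather than by an RDF library.

```python
def mapping_to_turtle(mapping: dict) -> str:
    """Render one mapping record as a Turtle snippet (prefixes omitted).

    Hypothetical helper: if the record carries a `record_id`, that IRI names
    the node; otherwise the record is written as a blank node. In neither
    case is a separate sssom:record_id triple emitted.
    """
    record_id = mapping.get("record_id")
    subject = f"<{record_id}>" if record_id else "[]"
    lines = [
        f"{subject} a owl:Axiom ;",
        f"  owl:annotatedSource {mapping['subject_id']} ;",
        f"  owl:annotatedProperty {mapping['predicate_id']} ;",
        f"  owl:annotatedTarget {mapping['object_id']} ;",
        f"  sssom:mapping_justification {mapping['mapping_justification']} .",
    ]
    return "\n".join(lines)


example = {
    "record_id": "https://example.org/mymapping1",
    "subject_id": "UBERON:0000001",
    "predicate_id": "semapv:crossSpeciesExactMatch",
    "object_id": "FBbt:00000001",
    "mapping_justification": "semapv:ManualMappingCuration",
}
print(mapping_to_turtle(example))
```

Dropping the `record_id` key from `example` makes the same function fall back to an anonymous `[]` node.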
Resolves [#421, #457]
- `docs/` have been added/updated if necessary
- `make test` has been run locally
- [ ] tests have been added/updated (if applicable)