PCDM 2.0 #53
Comments
👍 This looks great, it just needs some fixture files we can discuss. Just want to discuss the section under Originally when we did this kind of structure at Princeton we had proxies which were aggregations. There was significant distaste for that, so now there's PCDM Objects which have ordered PCDM objects via proxies. The problem I see is that as far as I can see Ranges within a TopRange are worthless without order. Are we shooting ourselves in the foot by making it optional? Should we use a different ordering paradigm for logical structure? Also, this results in a duplication of all the hasMember statements that exist in the parent pcdm:object, which seems like a waste. Thoughts? |
Tagging @no-reply and @azaroth42 (oops, habit... ;( ) |
@escowles hi, looking at this now and digesting (thanks!). Just a minor question: why a pcdm:TopRange (class) instead of a property that points to any existing pcdm:Range? Also, if we go for Ranges, should we be explicit about sequences (is pcdm:AlternateOrder this?), I mean to match other IIIF concepts as well, like the canvas concept, which is by the way very cool. I think I always end up asking the same thing, sorry for that, but I would love it if we could make more use of properties/predicates and not depend only on classes. |
Ah, @escowles, wasn't there a proposal in hybox to have an explicit hasFileSet predicate? |
@DiegoPino, there was a discussion about whether we needed TopRange or could just use Range with a different predicate. It would be more like Object if we just had Range, so I think it's worth revisiting. |
@tpendragon Oh, that's right — I +1'd that even. |
@tpendragon re: Ranges — I agree that we don't need to support multiple orders, and so it seems like we could do something simpler for Ranges. Since the Ranges aren't meaningful outside the context of the Object they are structuring, the ordering info could be directly on them. In that way, Ranges start to look like a different kind of Proxy... |
With the model as it is in the wiki, I think this is a 2-page book with a logical order that has one chapter.
Note: This does NOT have any of the pcdm:Use predicates, which I think @barmintor would be interested in talking about, either as part of this or in another ticket, using this fixture. |
@tpendragon @escowles I will give this a try. I made an attempt to match some of these ideas here |
Any more reactions to this in general? |
I haven't had a chance to look at this yet, but will do so this evening. Please expect my comments tomorrow. |
Just wondering about the statement "Note on transitivity: hasMember is not defined as transitive, but applications may treat it as transitive as local needs dictate." It seems that if you want to change how a property operates, then you subclass the property. Or am I misunderstanding how ontologies work? |
Also, why do we need a pcdm:hasFileSet property when you could just look for a pcdm:FileSet? |
And lastly, if I have a single page of a book with the associated images, what benefit does the FileSet give me? Could the pcdm:FileSet be an optional construct? So (forgive me)...
or
I understand the multiple original use case, but those seem less common than the norm. |
@whikloj: Thanks for the comments! Here are my thoughts:
|
👍 on consistency. |
@scossu, ok with consistency. |
@DiegoPino: I agree, I think FileSet should be required — that's the main point of PCDM 2.0, to take some of the optional things from Works that make it easier to implement PCDM and make them a required part of the core data model. |
@escowles, I can accept FileSet for consistency's sake. As for transitivity, perhaps we should be looking to a more complex and descriptive ontology and bring OWL into the mix? Which brings up the question about adding some of the possible structure of PCDM to the ontology, via some obvious and (I am guessing) generally assumed domains and ranges for the properties? I apologize if this has been discussed previously. |
@whikloj When we reach some consensus on the classes and properties, we can move on to updating the ontology, with appropriate domains and ranges. E.g., I'd expect I am very skeptical of OWL: it is much more expressive, but also much more complex, doesn't help with validating that data conforms to the ontology, etc. IMHO, I think we should stick with simple RDFS, and rely on application profiles, good documentation, implementations, etc. to show how the classes and properties should be used together. |
I think the big win here is how easy it would be to get to FileSets, and being able to show that FileSets are NOT ordered. |
I agree with @escowles about OWL. However great OWL is, a general-purpose ontology such as PCDM should have the lowest barrier possible and express everything with RDFS at the most. |
@escowles, just my 25 cents (a quarter dollar!) on this: OWL is not one OWL, and OWL 2 is so awesome. There are profiles simple enough to make our life easier. The ability to express also works for restricting. Domains and ranges are almost annotations, useful for human beings (and intersections, functional properties, etc.). RDFS is easy to write, but not to reason on (if at all). I can't validate (using a computer) anything right now with PCDM 1.0. @barmintor asked @ruebot some time ago for an OWL version I have for PCDM; sadly I could not respond because I'm still working on this plus so much more, but with a few simple rules I can, based on this WIP, validate a PCDM-based graph, or even suggest to a user in real time what can be put into a PCDM-compliant graph, or better, predict using SPARQL which predicate chain I need to fetch a particular resource. Sadly RDFS does not allow such fine tooling. Edited: Sorry if out of context. I'm 100% for making FileSet required. |
While I appreciate the suggestion that good application profiles, documentation and implementations are the examples to lead from, from my perspective application profiles and implementations are quite often one and the same. That being said, I have tried to understand from pcdm-works, hydra-works and curation_concerns what was being implemented and what is possibly on the horizon. Perhaps due to my lack of Ruby knowledge, that seems to be a less than effective method. As a common data model, I feel like maybe the data model should be documented in its own right, separate from any implementations. Then any new contributors or adopters will not have to browse the source code of one of the implementations to see how these objects work together. This is why a more strict and expressive ontology seems like a good starting point, so we can understand what is possible and what is forbidden. Otherwise we are (as in the case of transitivity) leaving that up to the discretion of the implementor and the "common" in PCDM is lost. |
I'd like to clarify the transitivity situation: When we considered the issue previously, we resolved that
Consider:
:obj1 pcdm:hasMember :obj2 .
:obj2 pcdm:hasMember :obj3 .
PCDM does not support inferring
We punted on defining either of these, since no one had identified a use case where explicit transitivity/intransitivity needed to be preserved between implementations. The note quoted is somewhat clumsy in my opinion, but it's correct insofar as inferring the transitive relation internally won't lead to inconsistent interpretations. Applications may make this inference without issue, provided:
Obviously, we will need to address this as soon as we have a realistic use case that requires either form of interoperability. In case it's not clear from the above, I'm 👎 on bringing OWL into the mix to deal with transitivity. Again, the apparently loose definition we have now is not loose at all, but both precise and exactly the thing we want.
If and when we do come to defining transitive/intransitive membership predicates, sub-properties might not be the right thing. Consider, e.g. SKOS's approach, which makes the transitive relation a super-property of the non-transitive one (see the note at the bottom of the section for an explanation). This is something to be considered carefully; and, again, with use cases in hand. |
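For anyone following along, here is a minimal sketch of what "inferring the transitive relation internally" could look like. It is only an illustration under stated assumptions: triples are plain Python tuples rather than a real RDF library, and `infer_transitive_members` is a hypothetical helper name, not anything defined by PCDM. The inferred pairs are kept in a separate structure and are never written back as `pcdm:hasMember` statements, which is my reading of "internally".

```python
from collections import defaultdict

HAS_MEMBER = "pcdm:hasMember"


def infer_transitive_members(triples):
    """Return {subject: set of all direct and indirect members}.

    Hypothetical helper: the result is local application state and
    does not modify or extend the asserted triples.
    """
    direct = defaultdict(set)
    for s, p, o in triples:
        if p == HAS_MEMBER:
            direct[s].add(o)

    closure = {}
    for start in list(direct):
        seen = set()
        stack = list(direct[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # already visited; also guards against cycles
            seen.add(node)
            stack.extend(direct.get(node, ()))
        closure[start] = seen
    return closure


# The two statements from the comment above.
asserted = [
    (":obj1", HAS_MEMBER, ":obj2"),
    (":obj2", HAS_MEMBER, ":obj3"),
]
inferred = infer_transitive_members(asserted)
```

Here `inferred[":obj1"]` contains both `:obj2` and `:obj3`, while `asserted` still holds only the two original statements, so another implementation reading the same data sees nothing beyond what was asserted.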
Thanks for the clarification @no-reply on transitivity.
This again means that (unfortunately) the common data model is not really common except if your implementation and another implementation choose to follow the same pattern. |
Are you saying you have use cases for transitive membership? If so, I think we should open up an issue. The statement on transitivity here is the same as the one we make in the current iteration of PCDM. |
I think this is incorrect. It's common in that the supported semantics are clear: if your use case requires communicating transitive relations, that is not supported by I don't think interoperability is served, in this case, by over-specifying. Again, we have so far been unable to identify a use case that depends on a transitive membership property; much less one that requires the base membership property be transitive. In contrast, the common |
👍 to moving the transitivity discussion if there's any proposal to reconsider the past decision. On the other hand, I think the question about the clarity of the transitivity note is relevant and belongs as a part of the discussion of this topic.
👍
It seems from the discussion above that everybody is, at best, lukewarm on |
Yeah, I have no problem shifting this. I think my issue is that, as a non-hydra implementor, I would find it easier to understand with a more specific data model. We don't have an existing data model to work from, and as a smaller development community we are moving slower. Thanks to all for the discussion; it is always helpful.
👍 |
I don't think I agree with this anymore. It's less...noisy, without the predicate. But if it's significantly easier to use, isn't it our responsibility to make implementation feasible and performant? |
A few more comments and questions about the proposed text:
Logical Structure:
I think this could use some refinement. The initial text tries to motivate this in a general way by pointing to a distinction between "physical organization" and "logical structure". It's not clear to me what counts as "physical" in this case. In the individual classes, the IIIF use case is appealed to. I understand that the
Property Definitions
In the text, each property is redefined in the context of the domain class ( The properties should be defined normatively one time, and any commentary about their usage in the context of a given domain class should be confined to non-normative usage notes and examples. For me, this especially includes discussion of "component parts" and similar.
File Set
I'm tempted to strike "complete" from the above text. Is this constraint intended to be as strict as I'm reading it? I.e.: Under the proposed text, an institution that holds a fragment of some larger This seems to significantly complicate any case where an institution is trying to provide materials on a "best copy available" basis. Does this come with a significant interoperability benefit by way of trade-off? |
👍 This was originally made to avoid having to create what's effectively an entirely new Work with the same members, to represent a different structure. That being said, we totally ended up with that. This could now be the exact same thing if you had no
I think I'm 👍, but interop becomes hard here. If I can't trust a fileset to be a representation of the object it's on, how do I know if I should use that for display?
I don't think I understand. Can you expand on this? |
I'm fine with striking "complete" — the idea being that a FileSet should represent the Object it's attached to, but it's fine if it is a partial copy or otherwise incomplete. I think the key thing is to say that if you have an Object that represents a book and 100 images that represent pages, there should be Objects to represent those pages, not just 100 FileSets attached to the book Object. On the other hand, if you had a PDF of 75 pages of a 100 page book, it would be fine to attach that to the book Object. |
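To make the two shapes described above concrete, here is a rough sketch. Everything in it is illustrative: the URIs are made up, triples are plain tuples, the helper names are hypothetical, and `pcdm:hasFileSet` is the predicate proposed in this thread rather than part of PCDM 1.0.

```python
HAS_MEMBER = "pcdm:hasMember"
HAS_FILE_SET = "pcdm:hasFileSet"  # proposed in this thread, not in PCDM 1.0


def book_with_page_objects(book, page_count):
    """One Object per page, each page carrying its own FileSet."""
    triples = []
    for n in range(1, page_count + 1):
        page = f"{book}/page{n}"
        triples.append((book, HAS_MEMBER, page))
        triples.append((page, HAS_FILE_SET, f"{page}/fileset"))
    return triples


def book_with_partial_pdf(book):
    """A partial copy (e.g. a 75-page PDF) attached to the book itself."""
    return [(book, HAS_FILE_SET, f"{book}/partial-pdf-fileset")]


def file_sets_of(triples, subject):
    """FileSets attached directly to the given subject."""
    return [o for s, p, o in triples if s == subject and p == HAS_FILE_SET]


paged = book_with_page_objects(":book", 100)
partial = book_with_partial_pdf(":book")
```

In the first shape the book has 100 members and no FileSets of its own; in the second the book has a single FileSet that represents it, even though that representation is incomplete.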
Right. I think this belongs in the realm of "best practices". I'm thinking of a research data use case (due to @grosscol, as discussed in Project Hydra Slack) where an experimental observation is made up of multiple distinct files that must sit together to make sense. There's no particular reason, in this case, why we should model each part as an Or, bringing it back to the book use case: take a book whose scans are broken up into chunks in arbitrary places (e.g. the first 75 pages, and the remaining 25). I think it should be acceptable for each of those PDFs to occupy a |
@no-reply: I agree that the current scenario is pretty much supporting IIIF. But I'd like to explore other cases and try to make sure it handles more than just that. E.g., UCSD has some research data objects that include derivatives organized in groups by X/Y/Z dimensions (see http://library.ucsd.edu/dc/object/bb2322141x). I've seen other data that has similar kinds of permutations broken out. I think you could use a TopRange/Range to organize those derivatives. Good point about the property definitions — we should probably define the properties at the top, and then have each class have notes about the expected usage of the property in its context. @tpendragon: I hope the implementation of TopRange/Range can be simpler than Object because of the constraints placed on TopRange and Range. Maybe this means they aren't subclasses of Object? |
👍 My sense is that there is something more here, but it's not concrete enough yet. |
I think so too, but it would be a totally different construct. Is PCDM open to that? Something like
Edit: but not hasMember. Ranges and all that, ya know. |
Just as a side note, which can be talked about in more detail once the transitivity issue is created, I wonder if Ranges could be made transitive too. Their original use case was a table of contents - if page 3 is in Chapter 3a, it's definitely in Chapter 3. Ranges are also always in the context of the thing they branch off of. If we say range membership is transitive, then we can say something along the lines of "if you have a structure with transitive membership, describe that structure using a TopRange." |
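A small sketch of the table-of-contents reading of that idea, assuming (for illustration only) that each page or range sits in exactly one enclosing range, so the structure can be stored as a child-to-parent map. The names `ranges_containing` and `parent_of` are hypothetical.

```python
def ranges_containing(item, parent_of):
    """Walk up from item, collecting every enclosing range in order."""
    enclosing = []
    seen = set()
    current = parent_of.get(item)
    while current is not None and current not in seen:
        enclosing.append(current)
        seen.add(current)  # stop if the map accidentally contains a loop
        current = parent_of.get(current)
    return enclosing


# Hypothetical structure: page 3 is in Chapter 3a, which is in Chapter 3.
parent_of = {
    ":page3": ":chapter3a",
    ":chapter3a": ":chapter3",
    ":chapter3": ":topRange",
}
containing = ranges_containing(":page3", parent_of)
```

Under transitive range membership, every entry in `containing` would count as containing page 3, not just the innermost chapter.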
Also, in the LDP projection, there would be a FileSet container (that can
Rob Sanderson
I agree that this reduces the end-user complexity of queries by introducing a property path. Whether this query is actually faster depends on a bunch of factors (typical cardinality of properties/type, query optimization approaches, implementation details, etc...). But the point is conceded that adding the property makes query implementation easier.
I'm not sure I follow this. Doesn't this elimination of |
No, because ?this (hasRelatedObject) ?x ; ?x hasFile ?y would not find On Thu, May 19, 2016 at 10:33 PM, Thomas Johnson [email protected]
Rob Sanderson |
Except the domain of hasFile should -probably- be FileSet. Another benefit of hasFileSet would be you could change the range of hasFileSet to FileSet to make the data model a little more clear. I was going to recommend changing the range of hasMember to Work too, but there's no Work anymore, and a FileSet is an object, sooo... |
I also think there is a semantic difference between hasMember and hasFileSet: hasMember represents a whole-part relationship where hasFileSet represents a thing-representation relationship. |
I think your functional elimination of ?this ?p ?x .
?x pcdm:hasFile ?f . If that isn't equivalent, I'm missing something.
👍 to this. Perhaps |
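The two query shapes being compared can be tried out on a toy graph. This is a sketch only: it uses tuples instead of a SPARQL engine, `files_via` is a hypothetical helper, and the graph (including a related object that carries a file directly) is invented for illustration. Whether the patterns are equivalent in practice depends on which resources are allowed to have `pcdm:hasFile`, which is exactly the domain question raised above.

```python
HAS_FILE = "pcdm:hasFile"
HAS_FILE_SET = "pcdm:hasFileSet"  # proposed in this thread
HAS_RELATED = "pcdm:hasRelatedObject"


def files_via(triples, this, predicate=None):
    """Files reachable as: this ?p ?x . ?x pcdm:hasFile ?f .

    If predicate is given, ?p is fixed to it; otherwise ?p is a wildcard.
    """
    intermediates = {
        o for s, p, o in triples
        if s == this and (predicate is None or p == predicate)
    }
    return {
        o for s, p, o in triples
        if s in intermediates and p == HAS_FILE
    }


# Hypothetical data: one FileSet, plus a related object with its own file.
graph = [
    (":book", HAS_FILE_SET, ":fs1"),
    (":fs1", HAS_FILE, ":file1"),
    (":book", HAS_RELATED, ":cover"),
    (":cover", HAS_FILE, ":file2"),
]

explicit = files_via(graph, ":book", HAS_FILE_SET)
wildcard = files_via(graph, ":book")
```

With this data the explicit pattern returns only `:file1`, while the wildcard pattern also picks up `:file2`. If the domain of `pcdm:hasFile` were restricted to FileSet and FileSets were only ever reached through `pcdm:hasFileSet`, the two would return the same thing.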
The discussion here stalled out, so I went ahead and made a few changes based on the discussion here:
https://github.com/duraspace/pcdm/wiki/PCDM-2.0 |
Thanks for moving this along, @escowles! I missed where we talked about making FileSets not a subclass of |
@mjgiarlo FileSets are still a subclass of ore:Aggregation, just not pcdm:Object any longer. This wasn't called out explicitly, but I thought it was part of making pcdm:hasFileSet not a subproperty of pcdm:hasMember. Does that make sense? |
👍 otherwise FileSets can have members and files and relatedObjects, as a subclass of Object. None of which we want. |
So I'm working on BookConcerns right now, and am starting to think about moving over Plum's structure editor, but to do that I'd like some level of agreement on what Logical Orders look like. I'm happy to be overridden here, but my complaints are described above: The implementation of "a logical order is a related object that has all the same members, but also sub-resources for chapters, done with proxies" is expensive and hard to maintain. Things like deleting a file set quickly become nasty to sync up. Can we do better, or is matching the ordering and structure for logical structure important enough to pay the price? Edit: Another thing - can we somehow build into the data model the fact that items shouldn't be re-orderable by the logical structure? Or maybe they should be re-orderable, but that feels odd. |
@tpendragon: is this a modeling concern, or an LDP implementation concern? And does having (Top)Range be a subclass of pcdm:Object imply that we'd implement it the same way in LDP? IMO, it would be fine to implement (Top)Range as a single LDP resource with hash URI children, instead of IndirectContainers with Proxies, etc. And I'd be happy to reconsider whether these should be subclasses of ore:Aggregation instead of pcdm:Object if that is a barrier to this implementation strategy. |
Based on mailing list discussion, I'm closing this omnibus issue and creating separate issues for the main topics of discussion:
|
For tracking proposed changes to PCDM in the wiki page:
https://github.com/duraspace/pcdm/wiki/PCDM-2.0