Versioning and resource status #1238

Closed
riccardoAlbertoni opened this issue May 11, 2020 · 28 comments · Fixed by #1257

Comments

@riccardoAlbertoni
Contributor

riccardoAlbertoni commented May 11, 2020

Originally posted by @andrea-perego in #93 (comment)
I don't remember if this was already mentioned, but another aspect of versioning may concern the resource "status" - e.g., draft, stable, deprecated, withdrawn.

The EU Publications Office maintains some reference code lists - the two that may be most relevant here are:

E.g., the dataset status code list above includes the following statuses (alphabetically ordered):

| Code | Label | Definition |
|---|---|---|
| COMPLETED | completed | This dataset is considered to be complete, it holds all information that is intended. |
| DEPRECATED | deprecated | It is recommended that the contents of this dataset be no longer used. |
| DEVELOP | under development | This dataset is currently being assembled. It may be in an incomplete or faulty state. |
| DISCONT | discontinued | This dataset is no longer produced or updated. |
| WITHDRAWN | withdrawn | This dataset is no longer meant to be published. |

The concept status code list includes additional statuses.

This information is clearly useful for administrative purposes, but it is relevant for users as well.

E.g., in the JRC Data Catalogue, these statuses determine where a dataset can be published (e.g., a dataset in draft status is not supposed to be published in production). On the other hand, deprecated, discontinued, or withdrawn records are not removed from the catalogue (because of the persistence policy we have in place), but they are "marked" as such, so that users are aware they shouldn't be used or are no longer available.
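
The policy described above could be sketched roughly as follows. This is only an illustration, not the JRC implementation: the function name, the environment names, and the use of the `DEVELOP` code to stand in for "draft" are all assumptions; the status codes themselves come from the code list quoted earlier.

```python
# Statuses whose records stay in the catalogue but are flagged to users.
MARKED_STATUSES = {"DEPRECATED", "DISCONT", "WITHDRAWN"}


def catalogue_view(record_status, environment):
    """Return how a record would be shown in a given environment.

    Assumption: DEVELOP plays the role of "draft", and "production" is
    the only environment from which drafts are excluded.
    """
    # Drafts are hidden in production only; every other record stays visible
    # (persistence policy: nothing is removed from the catalogue).
    visible = not (record_status == "DEVELOP" and environment == "production")
    # Deprecated, discontinued, or withdrawn records carry a marker.
    marker = record_status if record_status in MARKED_STATUSES else None
    return {"visible": visible, "marker": marker}
```

For instance, `catalogue_view("WITHDRAWN", "production")` keeps the record visible but marks it, while `catalogue_view("DEVELOP", "production")` hides it.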

@dr-shorthair
Contributor

ISO 19135 has a codelist with the following Status flags:

  • invalid
  • submitted
  • valid
  • superseded
  • retired

The Registry ontology was partly inspired by 19135, but elaborates the status flags further:

    reg:statusNotAccepted         - corresponds to ISO 19135:2005 'notValid'
        reg:statusSubmitted       - corresponds to ISO 19135:(draft) 'submitted'
        reg:statusReserved        - flags a reserved entry, same semantics as statusSubmitted
        reg:statusInvalid         - corresponds to ISO 19135:(draft) 'invalid'
    reg:statusAccepted
        reg:statusValid           - corresponds to ISO 19135:2005 'valid'
            reg:statusExperimental
            reg:statusStable
        reg:statusDeprecated
            reg:statusSuperseded  - corresponds to ISO 19135:2005 'superseded'
            reg:statusRetired     - corresponds to ISO 19135:2005 'retired'

See https://github.com/UKGovLD/ukl-registry-poc/wiki/Principles-and-concepts#status-and-life-cycle and http://purl.org/linked-data/registry
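
To make the nesting above concrete, here is a minimal sketch that encodes the listed hierarchy as parent links and checks whether one status falls under another. The dictionary layout and the function names are assumptions made for illustration; only the status names and their nesting are taken from the list above.

```python
# Parent of each status, transcribed from the indented hierarchy above.
# Top-level statuses (statusNotAccepted, statusAccepted) have no entry.
PARENT = {
    "statusSubmitted": "statusNotAccepted",
    "statusReserved": "statusNotAccepted",
    "statusInvalid": "statusNotAccepted",
    "statusValid": "statusAccepted",
    "statusExperimental": "statusValid",
    "statusStable": "statusValid",
    "statusDeprecated": "statusAccepted",
    "statusSuperseded": "statusDeprecated",
    "statusRetired": "statusDeprecated",
}


def ancestors(status):
    """Return the broader statuses of `status`, nearest first."""
    chain = []
    while status in PARENT:
        status = PARENT[status]
        chain.append(status)
    return chain


def is_a(status, broader):
    """True if `status` equals `broader` or is nested beneath it."""
    return status == broader or broader in ancestors(status)
```

With this encoding, `is_a("statusRetired", "statusAccepted")` holds, which reflects that a retired entry was accepted at some point, whereas `is_a("statusSubmitted", "statusAccepted")` does not.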

@andrea-perego
Contributor

Thanks for pointing to the UKGovLD registry project, @dr-shorthair . I've been following this work since it started, and I think it provides a good reference for both versioning and resource lifecycle.

BTW, I think the latest version of this work is here:

https://github.com/UKGovLD/registry-core/wiki/Principles-and-concepts

Besides the lifecycle / status part, it may be worth looking also at the versioning part:

https://github.com/UKGovLD/registry-core/wiki/Principles-and-concepts#history-and-versioning

In particular, the description of the "version" vocabulary:

https://github.com/UKGovLD/registry-core/wiki/Principles-and-concepts#versioned-types

@riccardoAlbertoni
Contributor Author

riccardoAlbertoni commented May 25, 2020

Another possible candidate property to consider is adms:status.
The property adms:status links the status of the Asset or Asset Distribution in the context of a particular workflow process.

No domain is specified for this property, so the property can be applied to entities without inferring that they are adms:Asset(s). The range is skos:Concept.

AFAIU, ADMS does not define a built-in codelist for status, so the conceptual schema can be chosen by adopters.
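
As a rough illustration of what this means in practice, the sketch below serialises one adms:status statement as an N-Triples line. The helper name and the example.org IRIs are hypothetical; the point is only that the object is a concept taken from whichever code list the adopter chooses.

```python
# Namespace of the W3C ADMS vocabulary.
ADMS_NS = "http://www.w3.org/ns/adms#"


def status_triple(resource_iri, status_concept_iri):
    """Build an N-Triples line linking a resource to a status concept.

    The status concept is expected to be a skos:Concept from a code list
    chosen by the adopter; nothing here checks that.
    """
    return f"<{resource_iri}> <{ADMS_NS}status> <{status_concept_iri}> ."


# Hypothetical dataset and hypothetical adopter-defined status concept.
line = status_triple(
    "http://example.org/dataset/1",
    "http://example.org/status/draft",
)
```

Because no domain is declared, the same helper would apply equally to a dataset, a distribution, or a catalogue record.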

@andrea-perego
Contributor

andrea-perego commented May 25, 2020

@riccardoAlbertoni said:

> Another possible candidate property to consider is adms:status.
> The property adms:status links the status of the Asset or Asset Distribution in the context of a particular workflow process.
>
> No domain is specified for this property, so the property can be applied to entities without inferring that they are adms:Asset(s). The range is skos:Concept.
>
> AFAIU, ADMS does not define a built-in codelist for status, so the conceptual schema can be chosen by adopters.

I agree that adms:status can be used for this purpose. Actually, this is the property DCAT-AP has been using so far on dcat:CatalogRecord and dcat:Distribution - see:

The original version of ADMS also defined a set of code lists - including one for "statuses", which is used by DCAT-AP.

RDF definition: http://purl.org/adms/status/

The possible values are:

  • completed
  • deprecated
  • under development
  • withdrawn

This code list has not been included in the W3C ADMS specification, but it is nonetheless used in one of the examples in Section 4 (https://www.w3.org/TR/vocab-adms/#example-1). Quoting:

    :Fruit_02 a adms:Asset ;
      dcterms:created "1999-05-24" ;
      dcterms:description "Fruits that are found to be generally liked by most people." ;
      dcterms:publisher <http://example.com/data#org> ;
      dcterms:title "Fruit I like"@en ;
      adms:status <http://purl.org/adms/status/Completed> ;
      dcterms:type <http://purl.org/adms/assettype/CodeList> ;
      adms:previous :Fruit_01 ;
      adms:last :Fruit ;
      dcat:distribution :Fruit_02.csv ;
      dcat:distribution :Fruit_02.xml .

@dr-shorthair
Contributor

I have certainly found the absence of a 'status' flag to be a regular gap, which I have needed to fill on more than one occasion. So I have xxx:status predicates littering my own repositories. It looks like this is a very general requirement. I'm not sure if that means that we should adopt the ADMS offering in this space, or 'promote' it to DCAT alongside other general-purpose predicates missing from DCMI ;-) .

@makxdekkers
Contributor

@dr-shorthair Are you suggesting that one approach could be to create a dcat:status modelled on adms:status, rather than using the ADMS property directly? What would be our criteria for either reusing existing properties or creating new ones in parallel? I remember we had the discussion in GLD whether to use DCMI terms or create the same properties in the DCAT namespace. The decision then was that the DCMI terms were sufficiently 'trustworthy' to be used. ADMS is in the W3C namespace, so it can be trusted, but maybe an issue is that ADMS is not actively maintained? Another thing to take into account is that, as @andrea-perego writes, adms:status is already being used in real-life applications, so we might complicate things for current implementations by replacing it.

@tombaker

@dr-shorthair @makxdekkers How many general-purpose predicates are needed for DCAT and missing from DCMI Metadata Terms? Do you happen to have a list?

@makxdekkers
Contributor

@tombaker Maybe I wasn't clear: GLD didn't identify things that were missing from Dublin Core. The discussion was whether to mint dcat:title alongside dcterms:title, instead of using the DCMI term.

@tombaker

@makxdekkers I understand @dr-shorthair to be saying that it would be nice if there were a dcterms:status, along with various other general-purpose predicates. If so, I would be interested to know more about these "missing" predicates.

@dr-shorthair
Contributor

@makxdekkers @tombaker you have correctly identified most of the considerations.

Clearly there was a decision made early in the development of DCAT to re-use elements from existing well-established vocabularies in preference to coining concepts in a new namespace. I'm comfortable with that. But it then raises the question of which external vocabularies to trust. DC is so well established that it is a no-brainer (notwithstanding the equivocation regarding the RDF/RDFS formalization). And since we are working in the W3C context, the preference should be to use W3C vocabularies. But there is also the matter of how many dependencies we accept because of this. In principle it should not matter - they are all just URIs - but we know that in practice there is some cost to too many namespaces, because of confusion in the developer community. So I'm reluctant to add a whole new namespace for just one resource. ADMS is an edge case in this context in my opinion. It is W3C, but not widely adopted or actively maintained.

But looking from a DC point of view, DC provides a well-known set of widely used predicates, and I rely on it for many applications. I'm developing a profile of DCAT for the Queensland Government right now, for example, a lot of which is rules around the use of DC properties. But 'status' comes up all the time, and I'm always somewhat surprised to find it missing from the 'standard' RDF vocabs. It is such a common requirement I think it warrants being added to one of the standard vocabs - I'm not sure that ADMS quite makes the cut.

@tombaker

tombaker commented May 27, 2020 via email

@pwin
Contributor

pwin commented May 27, 2020 via email

@pwin
Contributor

pwin commented May 27, 2020 via email

@makxdekkers
Contributor

Just to note that some of the 'workflow versioning' is already included in ADMS with adms:prev, adms:next and adms:last.

@dr-shorthair I am intrigued by your criterion on the number of dependencies. How would you define that? Two, three, four, less than ten?

As I see it, you either accept none (the perspective of schema.org and some other initiatives that I've seen) or you accept any number, as long as the terms in that 'external' namespace suit your needs and a criterion of 'trustworthy' -- to be defined, e.g. stability, persistence, FAIR etc. -- can be satisfied.

@dr-shorthair
Contributor

In principle there should not be a numerical limit, and your criteria mostly catch what matters. I would add, however, the challenge that the larger the set of vocabularies, the greater the risk of axiomatic inconsistency, and difficulty in testing and diagnosing this.

In practice, we must also consider the tolerance of the target community of developers. (End users don't need to know.)

@makxdekkers
Contributor

@dr-shorthair I would say that problems of inconsistencies and testing difficulties are related to the criteria of trustworthiness. In the case of ADMS, I see no problems with things like stability and persistence (stuff in W3C namespaces is guaranteed to survive as long as W3C is around and probably even longer). But, even though I was part of the team that developed ADMS, I am not trying to 'sell' it to the group, just trying to find out why we wouldn't use properties that are readily available and seem to be doing exactly what we need.

I am not sure how to measure 'tolerance of developers'. Are you saying there is a risk they'll turn away if DCAT incorporates properties from other namespaces, or just one more namespace? Doesn't that invalidate the whole idea of reusing or mixing and matching namespaces?

@dr-shorthair
Contributor

> problems of inconsistencies and testing difficulties are related to the criteria of trustworthiness

Perhaps I didn't explain what I meant. Two ontologies could each be trustworthy, but not be compatible when used together. This is a particular risk when global constraints are used (domain, range) - you can quickly end up saying things like two classes are the same even though they clearly aren't.
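
A small sketch may help show how this happens. It applies only the RDFS domain rule (if a property has a domain, every subject using that property is inferred to be a member of that class) to a handful of triples. The vocabularies `a:` and `b:`, their terms, and the resource `ex:ds` are all hypothetical, and prefixed names are plain strings rather than real IRIs.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types(triples):
    """Apply the rdfs:domain rule once and return the inferred type triples."""
    # Collect the declared domain(s) of each property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    # Every subject that uses a property is typed with that property's domain.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


triples = [
    # Axioms from two independently designed (hypothetical) vocabularies.
    ("a:status", RDFS_DOMAIN, "a:Asset"),
    ("b:version", RDFS_DOMAIN, "b:Software"),
    # One resource description that mixes both vocabularies.
    ("ex:ds", "a:status", "ex:draft"),
    ("ex:ds", "b:version", "1.0"),
]

inferred = infer_types(triples)
```

Here `ex:ds` ends up typed as both `a:Asset` and `b:Software`, although neither vocabulary is wrong on its own. If the two classes were also declared disjoint somewhere, the combined data would be inconsistent.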

@makxdekkers
Contributor

@dr-shorthair Agreed as a general issue of ontology mixing. But do you see this problem occur with the reuse of ADMS properties?

@dr-shorthair
Contributor

Looking at https://www.w3.org/TR/vocab-dcat-2/#UML_DCAT_All_Attr I note that we refer to the following external namespaces:

  • dcterms - lots
  • foaf - 2 mentions
  • skos - 2 or 3
  • odrl - only 1, but with a whole ODRL structure as the target
  • prov - 2 properties (and more further down), but again with a whole PROV structure as the targets.

I recognise that ADMS is about 'managed assets' so 'status' is in scope. And we already mentioned it in a non-normative clause of DCAT. I'm just a bit queasy about introducing a whole new namespace for just one 'scalar' property. Maybe I should get over that.

@riccardoAlbertoni
Contributor Author

sorry, last answer got cut short.... the discussion on versioning then brought us onto workflow. That's a bit like 'status' but it is slightly different. I wonder if we need to separate them in some way? E.g. I can have a draft that is in preparation, or a draft that is completed. So perhaps there needs to be some property, say 'workflowStage' that takes as its object a concept or category

I think we need something similar to adms:status, which ranges over skos:Concept, or, in the worst case, if ADMS can't be admitted as a normative namespace, an equivalent newly minted DCAT term.

Also, I think we want to keep this as simple as belonging to a prefigured or user-defined set of statuses. More complex solutions are already covered by other vocabularies, e.g., http://www.opmw.org/model/OPMW provides an explicit definition of workflow executions and templates by extending PROV-O and P-Plan.

@smrgeoinfo
Contributor

Without being familiar with ADMS, and following @dr-shorthair 's concern about incoherent inferences due to multiple ontology imports, it seems to me the operational criteria for whether to import or not should involve what kind of assertions/restrictions/implications are brought by the imported vocabulary. If the import target is simply a bunch of owl:Class assertions with rdfs:label and other annotation properties, maybe some rdfs:subClassOf assertions (internal to the vocabulary), then the import is not problematic. On the other hand, if the import has a bunch of domain and range restrictions, transitive or inverse properties, owl:Restriction assertions, and other statements that have inferencing implications, then there is good reason to decouple the ontologies: implement local classes and properties, and offer a mapping ontology that can be imported in a profile to bring in the implications from the more 'semantically rich' ontology. I think this is the pattern used in SOSA and it makes a lot of sense to me.

So, what kind of implications does adms:status bring?
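
One way to approach that question is to list the statements in a vocabulary that carry inferencing implications. The sketch below does this over plain string triples. The set of predicates and types treated as "inference-bearing" is an assumption based on the criteria above, not an exhaustive list, and the adms:status description (a range of skos:Concept, no domain) follows what was stated earlier in this thread; the label triple is a hypothetical annotation.

```python
# Predicates assumed to carry inferencing implications.
INFERENCE_PREDICATES = {
    "rdfs:domain",
    "rdfs:range",
    "owl:inverseOf",
    "owl:equivalentClass",
    "owl:equivalentProperty",
}
# Types whose mere assertion has inferencing implications.
INFERENCE_TYPES = {"owl:TransitiveProperty", "owl:Restriction"}


def inference_bearing(triples):
    """Return, in order, the triples that would affect inferencing on import."""
    return [
        (s, p, o)
        for s, p, o in triples
        if p in INFERENCE_PREDICATES
        or (p == "rdf:type" and o in INFERENCE_TYPES)
    ]


adms_status = [
    ("adms:status", "rdf:type", "rdf:Property"),
    ("adms:status", "rdfs:label", "status"),  # hypothetical annotation
    ("adms:status", "rdfs:range", "skos:Concept"),
]

flagged = inference_bearing(adms_status)
```

Under these assumptions only the range statement is flagged, so the only consequence of using the property is that its object is inferred to be a skos:Concept.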

@riccardoAlbertoni
Contributor Author

> Without being familiar with ADMS, and following @dr-shorthair 's concern about incoherent inferences due to multiple ontology imports, it seems to me the operational criteria for whether to import or not should involve what kind of assertions/restrictions/implications are brought by the imported vocabulary. If the import target is simply a bunch of owl:Class assertions with rdfs:label and other annotation properties, maybe some rdfs:subClassOf assertions (internal to the vocabulary), then the import is not problematic. On the other hand, if the import has a bunch of domain and range restrictions, transitive or inverse properties, owl:Restriction assertions, and other statements that have inferencing implications, then there is good reason to decouple the ontologies: implement local classes and properties, and offer a mapping ontology that can be imported in a profile to bring in the implications from the more 'semantically rich' ontology. I think this is the pattern used in SOSA and it makes a lot of sense to me.
>
> So, what kind of implications does adms:status bring?

I share your concern in general, but here I think we are in the first case. ADMS is very lightweight: AFAIK, no domain or other restrictions are provided for adms:status. There is a range restriction to skos:Concept, which shouldn't be problematic. Besides, we have already used a bit of ADMS in the non-normative part of DCAT 2 (see adms:identifier). I do not see issues if we continue to use ADMS in the non-normative part for providing guidelines, similarly to what we did for adms:identifier.

I am not sure we can use W3C vocabularies that are not RECs in the normative part of DCAT. ADMS is a W3C Note, so we might need extra care if we want to have ADMS in the normative part.
One possible way to overcome this concern is to consider minting new equivalent or sub/super terms in the DCAT namespace. However, we need very solid motivations for minting ADMS-"equivalent" terms in DCAT. The decision might also depend on the number of terms that we want to borrow from ADMS (at the moment there is at least one other ADMS term under discussion: see adms:versionNotes for #89).

@riccardoAlbertoni
Contributor Author

Just an update on the discussion: part of this issue was discussed in tonight's DCAT teleconference.

The restricted group of tonight's attendees agreed to provisionally include adms:status as a working solution (see discussion and resolution).

@rob-metalinkage
Contributor

If ADMS is useful but its non-normative status is an issue, then this is already addressed by the fact that ADMS is a profile of DCAT. It can be dealt with as a suggested profile recommendation. What is really achieved by duplication other than potential confusion?

Alignments are good - except there is no explicit behaviour requiring them to be used in a particular way. If you have an alignment, but it's in a separate resource, then what would that mean for any statement about conformance to ADMS? How could a user discover that ADMS conformance is available via some inferencing entailment regime using the alignment?

Making it normative is best achieved by changing the status of ADMS. I think the alternative is to define the ways alignments are used in conformance declarations and determination.

@dr-shorthair
Contributor

I'm OK with using adms:status. ADMS is already very much part of the DCAT family anyway.

@init-dcat-ap-de

+1 for using the existing adms:status in the short term, but also
+1 for establishing it in the long run at the W3C DCAT level.
The concern about the lack of governance of ISA ADMS is valid, but it could be fixed by the EU ISA DCAT-AP SEMIC team.
Side note:
While the https://www.w3.org/ns/adms#status namespace is safe, the persistent EU URI of ADMS is currently the only non-persistent one: https://data.europa.eu/URI.html

Sebastian from https://www.dcat-ap.de/def/

@andrea-perego
Contributor

Aspects related to resource life-cycle are now addressed by the ED, as part of the section on versioning:

https://w3c.github.io/dxwg/dcat/#life-cycle

@andrea-perego
Contributor

Issue closed as decided in the 11 Nov 2020 DCAT call and following the extension of the versioning section in the ED - in particular, see #1238 (comment).

New issues should be created for further discussion on this topic.

@andrea-perego andrea-perego changed the title versioning and resource status Versioning and resource status Mar 26, 2022

9 participants