1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -3,6 +3,7 @@
## Next

- Add the concept of "propagatable slots".
- Add the `curie_map` to the model (instead of it being specific to the SSSOM/TSV format).

## SSSOM version 0.15.1

7 changes: 2 additions & 5 deletions src/docs/spec-formats-tsv.md
@@ -106,11 +106,9 @@ SSSOM/TSV files MUST be encoded in UTF-8 ([RFC 3629](https://datatracker.ietf.or

All identifiers in a SSSOM/TSV file, that is, all the values of slots typed as [EntityReference](EntityReference.md), MUST be serialised in [CURIE syntax](https://www.w3.org/TR/curie/). SSSOM/TSV parsers SHOULD reject files containing identifiers serialised as IRIs.

To allow unambiguous resolution of all CURIEs present in a SSSOM/TSV file, the metadata block MUST contain an additional `curie_map` field, which is a map of prefix names to IRI prefixes. The `curie_map` field SHOULD appear at the beginning of the metadata block.
As stated in the description of the model ([Identifiers section](spec-model.md#identifiers)), all prefix names used in CURIEs MUST be declared in the `curie_map` slot of the mapping set object, unless the prefix is a “built-in” prefix (in which case it MAY be omitted). SSSOM/TSV parsers MUST reject a file with undeclared, non-built-in prefix names.

Any prefix name used in a SSSOM/TSV file MUST be declared with a corresponding entry in the CURIE map. SSSOM/TSV parsers MUST reject a file with undeclared prefix names.

Prefix names listed in the table found in the [IRI prefixes](spec-intro.md#iri-prefixes) section are considered “built-in”. As such, they MAY be omitted from the CURIE map. If they are not omitted, they MUST point to the same IRI prefixes as in the aforementioned table.
A SSSOM/TSV writer SHOULD refuse to serialise a mapping set that contains IRIs that cannot be contracted into CURIEs because there is no suitable prefix declaration in its CURIE map. The use of a custom, ad-hoc logic to infer a possible prefix name where none has been provided (e.g., “if the IRI ends with a `ZZZ_NNNNNNN` pattern, turn it into a `ZZZ:NNNNNNN` CURIE”) is strongly discouraged.
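
The refusal behaviour described above could look roughly like the following Python sketch. The function name `contract_iri`, the use of `ValueError`, and the longest-match rule for overlapping IRI prefixes are assumptions made for illustration; none of them is prescribed by the specification text shown in this diff.

```python
def contract_iri(iri, curie_map):
    """Contract an IRI into a CURIE using only declared prefixes.

    `curie_map` maps prefix names to IRI prefixes. When several declared
    IRI prefixes match, the longest one wins (an assumption of this sketch).
    """
    matches = [
        (name, iri_prefix)
        for name, iri_prefix in curie_map.items()
        if iri.startswith(iri_prefix)
    ]
    if not matches:
        # No guessing from the shape of the IRI: refuse instead.
        raise ValueError(f"no prefix declaration covers {iri!r}")
    name, iri_prefix = max(matches, key=lambda match: len(match[1]))
    return f"{name}:{iri[len(iri_prefix):]}"


curie_map = {"HP": "http://purl.obolibrary.org/obo/HP_"}
print(contract_iri("http://purl.obolibrary.org/obo/HP_0000001", curie_map))
# HP:0000001
```

Raising an error, rather than inferring a prefix name from the shape of the IRI, keeps the writer in line with the recommendation against ad-hoc inference.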


## Propagatable slots
@@ -203,7 +201,6 @@ When writing the metadata block, a canonical SSSOM/TSV writer:
* MUST serialise multi-valued slots as YAML “block sequences” ([YAML Specification §8.2.1](https://yaml.org/spec/1.2.2/#821-block-sequences)) – even when the list of values contains only one item;
* MUST serialise scalar values in YAML “plain style” ([YAML Specification §7.3.3](https://yaml.org/spec/1.2.2/#733-plain-style)) whenever possible, otherwise in “double-quoted style” ([YAML Specification §7.3.1](https://yaml.org/spec/1.2.2/#731-double-quoted-style));
* MUST serialise the slots in the order they appear in the [“Slots” table](MappingSet.md#slots), in the documentation for the `MappingSet` class;
* MUST write the `curie_map` at the beginning of the block, before any other slots;
* MUST NOT include in the CURIE map the prefix names that are considered “built-in”;
* MUST NOT include in the CURIE map any prefix name that is not used anywhere in the set;
* MUST sort the prefix names in the CURIE map in lexicographical order.
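
As a rough illustration of the last three rules in the list above, a canonical writer might derive the CURIE map to serialise as in this Python sketch. The function name `canonical_curie_map` and its parameters are hypothetical, and the set of built-in prefix names is passed in by the caller rather than hard-coded, since the authoritative table lives in the "IRI prefixes" section.

```python
def canonical_curie_map(curie_map, used_curies, built_in_names):
    """Return the CURIE map a canonical writer would serialise.

    curie_map:      dict of prefix name -> IRI prefix declared in the set
    used_curies:    iterable of every CURIE that appears in the set
    built_in_names: set of prefix names considered "built-in"
    """
    used_prefixes = {curie.partition(":")[0] for curie in used_curies}
    # Python dicts preserve insertion order, so inserting the keys in
    # sorted order yields a sorted map when it is later dumped to YAML.
    return {
        name: curie_map[name]
        for name in sorted(curie_map)
        if name in used_prefixes and name not in built_in_names
    }
```

Note that `sorted` compares strings by Unicode code point, so uppercase prefix names sort before lowercase ones; whether that matches the intended "lexicographical order" is an assumption of this sketch.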
9 changes: 9 additions & 0 deletions src/docs/spec-model.md
@@ -23,6 +23,15 @@ The `MappingSet` class represents, well, a set of individual mappings, which are
Of note, within a set, a mapping may not necessarily be uniquely identified by the combination of its four mandatory slots (`subject_id`, `predicate_id`, `object_id`, and `mapping_justification`). A set may very well contain several mappings with the same subject, predicate, object, and justification, but that differ on some of the other, complementary slots.


## Identifiers

Throughout the model, identifiers to external resources are represented using the custom type [`EntityReference`](EntityReference.md) (based on the LinkML type [`uriorcurie`](https://w3id.org/linkml/Uriorcurie)), which accepts both full-length IRIs and [CURIEs](https://www.w3.org/TR/curie/) as possible identifier formats. (Note however that serialisation formats may mandate the use of one identifier format over the other; for example, the [SSSOM/TSV](spec-formats-tsv.md) format requires the systematic use of CURIEs, whereas the [OWL/RDF](spec-formats-owl.md) format conversely requires the systematic use of IRIs).

Whenever the CURIE syntax is used in a mapping set (whether this is by choice of the SSSOM producer, or because it is mandated by the serialisation format), all CURIEs MUST be unambiguously resolvable into corresponding full-length IRIs without requiring any external resources. This means that any prefix name used MUST be properly declared in the set’s `curie_map` slot, which is a dictionary associating a prefix name to an IRI prefix.

By exception, prefix names listed in the table found in the [IRI prefixes](spec-intro.md#iri-prefixes) section are considered “built-in”. As such, they MAY be omitted from the `curie_map`. If they are not omitted, they MUST point to the same IRI prefixes as in the aforementioned table.
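
To make the resolution rules above concrete, here is a minimal Python sketch of CURIE expansion. The `BUILT_IN_PREFIXES` dictionary is only an illustrative stand-in for the table in the "IRI prefixes" section (the two entries shown are the standard OWL and SKOS namespaces), and the function names and the choice of `ValueError` are assumptions of this sketch, not part of the model.

```python
# Illustrative stand-in for the built-in table in the "IRI prefixes"
# section; the authoritative list is the one in the specification.
BUILT_IN_PREFIXES = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def effective_prefix_map(curie_map):
    """Merge a set's curie_map with the built-in prefixes.

    A built-in prefix may be omitted from curie_map, but if it is declared
    it must point to the same IRI prefix as the built-in table.
    """
    for name, iri_prefix in curie_map.items():
        expected = BUILT_IN_PREFIXES.get(name)
        if expected is not None and expected != iri_prefix:
            raise ValueError(f"built-in prefix {name!r} redefined as {iri_prefix!r}")
    return {**BUILT_IN_PREFIXES, **curie_map}


def expand_curie(curie, curie_map):
    """Expand a CURIE into a full-length IRI without any external lookup."""
    prefix, separator, local_part = curie.partition(":")
    if not separator:
        raise ValueError(f"not a CURIE: {curie!r}")
    prefixes = effective_prefix_map(curie_map)
    if prefix not in prefixes:
        raise ValueError(f"undeclared prefix name: {prefix!r}")
    return prefixes[prefix] + local_part


curie_map = {"HP": "http://purl.obolibrary.org/obo/HP_"}
print(expand_curie("HP:0000001", curie_map))
# http://purl.obolibrary.org/obo/HP_0000001
```

Because the built-in prefixes are merged in before lookup, a CURIE such as `skos:exactMatch` resolves even when the set's `curie_map` does not declare `skos`.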


## Propagation of mapping set slots

As mentioned briefly above, there are two different types of slots in the `MappingSet` class:
17 changes: 17 additions & 0 deletions src/sssom_schema/schema/sssom_schema.yaml
@@ -72,6 +72,18 @@ types:
      - https://mapping-commons.github.io/sssom/spec/#tsv

slots:
  prefix_name:
    key: true
    range: ncname
  prefix_url:
    range: uri
  curie_map:
    description: A dictionary that contains prefixes as keys and their URI expansions as values.
    range: prefix
    multivalued: true
    inlined: true
    see_also:
      - https://github.com/mapping-commons/sssom/issues/225
  mirror_from:
    description: A URL location from which to obtain a resource, such as a mapping set.
    range: uri
@@ -627,6 +639,7 @@ classes:
      license:
        required: true
    slots:
      - curie_map
      - mappings
      - mapping_set_id
      - mapping_set_version
@@ -770,6 +783,10 @@ classes:
      - mapping_set_group
      - last_updated
      - local_name
  prefix:
    slots:
      - prefix_name
      - prefix_url
  Propagatable:
    class_uri: sssom:Propagatable
    description: Metamodel extension class to describe slots whose value can be