Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Maps with no keys #569

Merged
merged 8 commits into from
Jan 22, 2018
Merged

Maps with no keys #569

merged 8 commits into from
Jan 22, 2018

Conversation

gkellogg
Copy link
Member

  • Allows indexing on @none, which does not result in adding the index to the indexed object when expanding.
  • Allows compaction to use @language, @index, @id, and @type even if there is no corresponding member.

… containers.

Changes behavior for id maps to use `@none` instead of a blank node identifier.

Fixes #480
@gkellogg gkellogg added this to the JSON-LD 1.1 milestone Jan 17, 2018
@gkellogg gkellogg self-assigned this Jan 17, 2018
…action to use concatenated/sorted values from `@container`.
<li>Append the values <code>@index</code> and <code>@index@set</code>
to <em>containers</em>.</li>
</ol>
</li>
append <code>@graph</code>, <code>@id</code>, <code>@index</code> and <code>@set</code> to <em>containers</em>.</li>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this line supposed to be removed due to additions above?

is <code>json-ld-1.1</code>, append <code>@id</code>, <code>@type</code>,
and <code>@index</code> to <em>containers</em>.</li>
<span class="changed">and append <code>@id</code>, <code>@id@set</code>,
<code>@type</code>, and <code>@type@set</code></span>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ordered as @set@type? (Also, does that appear anywhere else in doc?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, lexicographically.

@gkellogg
Copy link
Member Author

gkellogg commented Jan 17, 2018

Note that that IRI Compaction introduces a number of new things to add to @containers, due to the various permutations, and change to allow things within an index key to be indexed using @none.


<p class="changed">The special index <code>@none</code> is used for indexing
data which does not have an associated index, which is useful to maintain
a normalized represeiontation.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling.


<p class="changed">The special index <code>@none</code> is used for indexing
data which does not have a language, which is useful to maintain
a normalized represeiontation.</p>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling.


<p class="changed">The special index <code>@none</code> is used for indexing
<a>node objects</a> which do not have an <code>@id</code>, which is useful to maintain
a normalized represeiontation. The <code>@none</code> index may also be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling.


<p>As with <a>id maps</a>, the special index <code>@none</code> is used for indexing
<a>named graphs</a> which do not have an <code>@id</code>, which is useful to maintain
a normalized represeiontation. The <code>@none</code> index may also be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling.

@@ -3517,6 +3674,48 @@ <h2>Node Type Indexing</h2>
-->
</pre>

<p class="changed">The special index <code>@none</code> is used for indexing
<a>node objects</a> which do not have an <code>@type</code>, which is useful to maintain
a normalized represeiontation. The <code>@none</code> index may also be
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling.

<p>As with <a>id maps</a>, the special index <code>@none</code> is used for indexing
<a>named graphs</a> which do not have an <code>@id</code>, which is useful to maintain
a normalized represeiontation. The <code>@none</code> index may also be
a term which expands to <code>@none</code>, such as the term <em>none</em>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy & paste issue: the "none" term isn't used in this example.

@davidlehn
Copy link
Member

You almost touched every file with this commit.
What happens when you have multiple things that would compact to @none? Do they all go under @none key? Or is it an error? Should be tests one way or another and maybe spec text about it (I may have missed it).

@gkellogg
Copy link
Member Author

gkellogg commented Jan 18, 2018

What happens when you have multiple things that would compact to @none.

In general, this is just fine. Two node objects without an @id would be indexed using @none, and this would round-trip. Same for index, language, and type maps.

There may be an interesting case with unamed/indexed graphs, which we should probably explore. Two anonymous graphs would seem to merge into a single graph, and would not round-trip, where you would normally consider that each anonymous graph was, in fact, named with a missing BNode. Of course, this can be controlled in the source by giving the graph an explicit blank node identifier.

I think we should explore with some test cases, and address with a note, but I don't think we need different behavior.

Copy link
Contributor

@azaroth42 azaroth42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, just a couple of questions.

@@ -4250,7 +4448,7 @@ <h2>Language Maps</h2>
or an array containing both <code>@language</code> and <code>@set</code>
</span>. The keys of a
<a>language map</a> MUST be <a>strings</a> representing
[[BCP47]] language codes and the values MUST be any of the following types:</p>
[[BCP47]] language codes, or the <a>keyword</a> <code>@none</code>, and the values MUST be any of the following types:</p>

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

..., the keyword @none or an alias that expands to it, and ... ?

<span class="ednote">From <a href="#issue-480">Issue 480</a>
the <a>language tag</a> <code>@none</code> represents no language.</span></li>
<span class="changed">If <em>language</em> is <code>@none</code>,
or expands to <code>@none</code>, do not set the <code>@language</code> member.</span>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need explicit mention that the underlying RDF datatype is different? (xsd:string versus rdf:langString)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, @none only appears in the map, it is removed on expansion, so there is no underlying RDF datatype. This is why the id maps were changed to not use blank node identifiers, as it aids round-tripping, and they never make it into either expansion or RDF generation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I agree that @none is json-ld flavored sugar... but the RDF Concepts spec asserts different datatypes based on the presence or lack of a language tag. At least that's my understanding of https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal

If there's a language tag, then the data type is necessarily rdf:langString. And if there isn't, then it's not. If there isn't another data type, it's an xsd:string.

Which we can be completely silent about, but I'd like us to have thought about that silence :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Language maps have a proscribed form, where the key is a string, and the value is either a string or an array of strings, so the expansion of a map value of @none, would be like {"@value": "non-languaged-tagged value"}, thus no datatype or language. The Object to RDF Conversion algorithm renders this as a literal without datatype or language, which, as you note, is semantically equivalent to having the datatype xsd:string.

Do you think this needs some exposition in either document?

@azaroth42
Copy link
Contributor

I agree that having an array of objects indexed to @none is fine. An unnamed named graph that gets put into an index seems like a pathological edge case, that anyone who knows what they're doing would instinctively avoid anyway.

…, as well as the bare keyword `@none`.

Fix some minor issues in the API defining values of `@nest`.
…es on merging multiple unindexed or unnamed graphs.
@gkellogg
Copy link
Member Author

@azaroth42 and @davidlehn your concerns have been addressed.

@niklasl
Copy link
Member

niklasl commented Jan 22, 2018

Is this feature compatible with an approach possible in 1.0, where you'd use two distinct terms to separate language-keyed values and non-language-tagged ones? Like:

{
  "@context": {
   "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
   "xsd": "http://www.w3.org/2001/XMLSchema#",
   "labelByLang": {"@id": "rdfs:label", "@container": "@language"},
   "labelString": {"@id": "rdfs:label", "@language": null}
  },
  "labelByLang": {"en": "thing", "sv": "sak"},
  "labelString": "stuff"
}```

@gkellogg
Copy link
Member Author

We may need more tests for this, and the IRI Compaction language may need to be updated. In most cases, preference in term selection is given to the most specific match; there are tests for this, but we may not have sufficient overlap. The language in 2.9.3 may need to be moved until after @none is added in 2.10 to ensure this.

I'll look through to make sure we have adequate test coverage for these various cases.

On a personal note, I have long thought the term selection logic to be over complicated in trying to maintain such variation in matching, which is not the typical case. In 99% of cases (I would guess) there is a single term matching an IRI, and that would be the one to use. Allowing for matching from among a set of possible IRIs complicates the algorithm, and is one aspect of performance issues for large datasets.

…@language` and `@language@set` when value has neither `@language` nor `@type`. Also move `@index` and `@index@set` for values without `@index` to below `@none`.
@gkellogg
Copy link
Member Author

@gkellogg gkellogg merged commit 749a243 into master Jan 22, 2018
@niklasl
Copy link
Member

niklasl commented Jan 22, 2018

If further work on this is to be done (apart from ensuring the explicit none-term case works), could we consider this to be an opt-in feature for map containers; e.g. by declaring "@container": ["@language", "@none"]? (Although I understand the desire to get away from complex term selection rules, unless we're to evict some of the existing mechanics, we need to support it going forward.)

The "none-term" case is important, we've used it for the National Library data in Sweden (since, in library data in general, lacking language tags is the norm, alas.)

@gkellogg gkellogg deleted the maps-with-no-keys branch January 22, 2018 21:31
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 22, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 22, 2018
@gkellogg
Copy link
Member Author

@niklasl Can you provide an example which illustrates your concern. I'd hate to have to add @none as a container value, and it's not clear how that would fit anyway, as @none is effectively no container.

You can inhibit the behavior by defining another term which would be used for terms which you don't want to be in a language map, as the test case added does.

As I mentioned on the call, we could also vary based on processingMode/version, and only allow @none in a language map if version is 1.1 (or processingMode is json-ld-1.1). However, the ability to define a term that is matched with a higher priority than language maps for values without a language would seem to adequately address this issue for me.

gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 22, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 22, 2018
@niklasl
Copy link
Member

niklasl commented Jan 23, 2018

@gkellogg Indeed, if the behaviour is inhibited to handle the example I provided above, the critical part of the issue is solved, since anyone using language-maps in 1.1 can declare a "companion" term for the none-case.

Apart from the reasonably undesired full IRI fallback, there is still one case which works in 1.0 that isn't compatible with this IIUC. If we use @vocab, in 1.0, this will round-trip:

{
  "@context": {
   "@vocab": "http://www.w3.org/2000/01/rdf-schema#",
   "labelByLang": {"@id": "label", "@container": "@language"}
  },
  "labelByLang": {"en": "thing", "sv": "sak"},
  "label": "stuff"
}

Whereas with none-maps in 1.1, we now get only "labelByLang": {"en": "thing", "sv": "sak", "@none": "stuff"}.

The new behaviour is reasonably a better default (and I am very grateful for your work on implementing it!). It is technically is a breaking change though (as shown in this and the above example). I think we need to document it in tests, and probably add the processingMode >= 1.1 check as you suggest. And also add clear instructions for how to upgrade to 1.1, by stating that you now have to use a companion term for language-maps in order to single out the non-language-tagged values using distinct terms.

I'm not keen on my opt-in either, I just wanted to mention it in case we end up not being able to justify the break. (And if so, while @none isn't a reasonable container value (in that you cannot use @container: @none), technically, it could be, just a pointless one (a map with one key (@none) where all values for the term are put...). Hopefully that is moot of course.)

@gkellogg
Copy link
Member Author

See #571.

gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 25, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 25, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 25, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 25, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 26, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Jan 26, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Feb 10, 2018
gkellogg added a commit to digitalbazaar/jsonld.js that referenced this pull request Feb 10, 2018
davidlehn pushed a commit to digitalbazaar/jsonld.js that referenced this pull request Feb 28, 2018
davidlehn pushed a commit to digitalbazaar/jsonld.js that referenced this pull request Feb 28, 2018
davidlehn pushed a commit to digitalbazaar/jsonld.js that referenced this pull request Feb 28, 2018
davidlehn pushed a commit to digitalbazaar/jsonld.js that referenced this pull request Feb 28, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants