Language maps on literals with no language don't play well together #480

gkellogg · 2017-04-12T00:18:47Z

As described in this email, mixing literals that have a language with literals that do not in a language map creates an odd structure when compacting:

If a string is associated with a property defined as a languageMap, but
does not have a language associated with it, it creates two keys in the
JSON ... the langStrings go in one, and the non-langStrings in another.
This is unintuitive and exposes some of the weirdest weirdness of RDF
(langStrings) to unsuspecting JSON developers.

A proposal to discuss:

If compaction would result in an attempt to add a string without an
associated language into a LanguageMap, then the processor SHOULD assign
the undefined language code UND as the key in the array.

Thus,
_:x rdfs:label "Fish"@en, "Poisson"@fr, "51234" .

Would result in:

{
  "@id": "_:x",
  "label": {"en": "Fish", "fr": "Poisson", "UND": "51234"}
}

Rather than the current compaction result:

{
  "@id": "_:x",
  "label": {"en": "Fish", "fr": "Poisson"},
  "rdfs:label": "51234"
}
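
To make the difference between the two results concrete, here is a rough Python sketch. It is a toy model and not the JSON-LD compaction algorithm: the function name `compact_labels`, the `fallback_key` parameter, and the assumption of at most one value per language are all made up for illustration.

```python
from typing import Optional


def compact_labels(values: list, fallback_key: Optional[str] = None) -> dict:
    """Toy model of compacting rdfs:label values against a language-map term.

    `values` are expanded value objects, e.g. {"@value": "Fish", "@language": "en"}.
    With fallback_key=None, untagged strings fall out of the language map and
    land under the compact IRI "rdfs:label" (the current result shown above).
    With a fallback key such as "UND", they stay in the map (the proposal).
    """
    lang_map: dict = {}
    untagged: list = []
    for v in values:
        lang = v.get("@language")
        if lang is not None:
            lang_map[lang] = v["@value"]
        elif fallback_key is not None:
            lang_map[fallback_key] = v["@value"]
        else:
            untagged.append(v["@value"])
    result: dict = {"label": lang_map}
    if untagged:
        # A single leftover value is emitted as a bare string, several as a list.
        result["rdfs:label"] = untagged[0] if len(untagged) == 1 else untagged
    return result


values = [
    {"@value": "Fish", "@language": "en"},
    {"@value": "Poisson", "@language": "fr"},
    {"@value": "51234"},
]
current = compact_labels(values)
proposed = compact_labels(values, fallback_key="UND")
```

Here `current` has the two-key shape and `proposed` keeps everything under `label`.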

Notes:

PHP does not support "" as a key in a dictionary, hence UND as the key.
This does not propose an inverse expansion rule, in case someone has an
explicit @und langString [seems unlikely], where it should not become a
regular xsd:string

References:

digitalbazaar/jsonld.js#151
IIIF/api#755

gkellogg · 2017-04-12T00:26:34Z

Jakob Voß notes that "und" actually means "Unknown language", and "zxx" may be more appropriate, which means "No linguistic content/Not applicable".

Perhaps this (or something similar) would be a way to address this. When trying to use a language-mapped term with a literal with no language, the language tag "zxx" is used as a key. {"@value": "foo", "@language": "zxx"} should probably also work and perhaps just expand to {"@value": "foo"}.

cc/ @azaroth42

azaroth42 · 2017-04-12T00:30:43Z

Thanks for making the issue!

UND is "Undetermined" ... which, to me, captures both "q2341234"@UND where there is no linguistic content, and "fish"@UND where there is, but the language isn't known.

As such, as a fallback for when there isn't a language specified, it makes more sense to me to use UND than the more explicit ZXX which would be incorrect for the "fish" case.

azaroth42 · 2017-04-12T00:33:40Z

And, with all due respect to Jakob ... the I18N group agrees:

https://www.w3.org/International/questions/qa-no-language

workergnome · 2017-04-12T00:57:55Z

Just to clarify: this only applies if the property is explicitly marked as a languageMap in the context, correct?

gkellogg · 2017-04-12T02:33:06Z

@workergnome Yes, but we need to consider the results of expanding and removing the concept of a language-map, or compacting something that didn't originally have a language map.

But we could decide that, for JSON-LD, the use of a language-map term is sufficient to indicate that literal values without an explicit @type are considered to have an unknown language (und); in most cases, this is probably true.

gkellogg · 2017-04-13T21:24:55Z

Jakob Voß writes:

"" is a legal object key in JSON and PHP just happens to be one programming language with full support of JSON, so why design JSON-LD with focus on a particular choice of implementation in PHP? The language can deal with "" keys in JSON data pretty well:

<?php
$json = '{"":1}';
$data = json_decode($json, true);
echo $data[""]; # prints '1'
?>

If you prefer PHP objects over PHP arrays, the "" can internally be replaced by another special value, but that's a personal choice of implementation and out of the scope of JSON-LD.

Using "" for non-language strings looks like the cleanest and most obvious solution to me.
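
The same point can be checked outside PHP. As a minimal sketch using Python's standard `json` module (the document shape, with "" as the language-map key, is hypothetical and only follows the suggestion above):

```python
import json

# "" is an ordinary object key as far as the JSON parser is concerned.
doc = json.loads('{"label": {"en": "Fish", "": "51234"}}')

# The untagged string is reachable through the empty-string key.
untagged = doc["label"][""]

# Serializing again keeps the empty key intact.
round_tripped = json.dumps(doc, sort_keys=True)
```

This only shows that the key survives parsing and serialization; it says nothing about how convenient such a key is to work with in a given language.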

dlongley · 2017-04-14T15:00:26Z

Cross-posting from the mailing list in response to Jakob Voß's comments:

I don't remember all of the implementation trouble related to the use of empty strings in PHP, but I do remember it being more complex than is being hinted at here. I'm pretty sure that you can't use PHP arrays because you lose the ability to easily distinguish between arrays and objects, which is a requirement for proper implementation.

In any event, how difficult it is to implement a syntax's processing rules in common programming languages should absolutely be a factor in related design decisions -- otherwise adoption could be harmed in a significant way.

It would indeed be nice if there were no issue here, but, unfortunately, I'm not yet convinced that's true. My memory is that the two PHP implementers agreed that this was a pain point worth avoiding. If we can get a new implementer to step forward (or PRs to the existing implementations) that clearly demonstrate that this problem can be fully avoided without serious detriment to performance or significantly increased implementation complexity, then I'll agree that we should no longer consider it when making future design decisions.

dlongley · 2017-04-14T15:05:30Z

Related: https://bugs.php.net/bug.php?id=46600

It seems that this limitation in PHP may have finally been dealt with in June of last year -- we'll need to figure out which version of PHP has the fix (and how prevalent), if true.

dlongley · 2017-04-14T15:11:15Z

PHP 7.1 (the latest version) now supports empty string properties in objects:

http://php.net/manual/en/migration71.incompatible.php

Decoding an empty key now results in an empty property name, rather than _empty_ as a property name.

azaroth42 · 2017-04-14T15:58:41Z

Fantastic ... back to the list for a revised proposal...

dlongley · 2017-04-14T16:05:24Z

@azaroth42,

Fantastic ... back to the list for a revised proposal...

Keep in mind that hardly anyone is using 7.1 yet -- I suspect it may take a while for it to get sufficient adoption. So that's a consideration here. Long term, however, I think we can rid ourselves of this particular annoyance. :)

azaroth42 · 2017-04-14T16:06:54Z

True, but by the time JSON-LD 1.1 hits TR, hopefully that will have changed. And if we can push people in the right direction if they need a particular feature in a particular language, that seems like an acceptable situation to me.

gkellogg · 2017-04-14T18:02:52Z

I agree, if it’s a language limitation, which has since been solved, we shouldn’t restrict the spec, as it’s otherwise perfectly legitimate JSON. We’ll have a CG spec in a month or so (he said optimistically), but it will be more important in a WG timeframe, which is around another couple of years.

lanthaler · 2017-04-19T21:14:50Z

Leaving the PHP issue aside, this would fundamentally change how compaction works. Till now, it didn't implicitly introduce values. This would change that. To be consistent, we would need to do the same with other containers... such as @index.

The solution I normally recommend for this is to define an additional term like fallbackLabel.

azaroth42 · 2017-05-03T14:11:22Z

With the "" option, it would continue to not introduce values. It's just a significantly more convenient syntax than needing both label_with_language and label_without_language keys.

I do agree that we should consider consistency with other containers, however. Will work on that.

azaroth42 · 2017-05-03T15:34:22Z

After re-reading @index a couple of times, I'm sorry but I don't see how this applies?

@index containers explicitly persist through compaction and expansion, and there's no way to generate them from RDF directly. It seems to me like the only ramification is that we would allow "" as a key in an @index? Could you expand your comment a little please @lanthaler?

gkellogg · 2017-05-03T16:29:16Z

@lanthaler the idea is to not need to use different properties for such cases. If you have some dc:title properties, some with language, and some without, splitting these between different properties makes this less convenient, not more convenient for developers.

One possibility is to use the language tag @none as a stand-in for no language, which might look like the following:

{
  "title": {
    "en": "The Queen",
    "@none": "alternate value, without language"
  }
}

I would say that this is equivalent to zxx ("No linguistic content/Not applicable"), but more consistent with keyword use in JSON-LD (although this comes from framing).

Furthermore, to follow Postel's Law, we might treat @none, zxx, and und as equivalent when expanding a language map (and possibly a value object), so that each expands to a value object with no @language (or @type). Compacting to a language map would also consider literals without type or language and gather them under the @none key.
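
The lenient expansion floated here could look roughly like the following Python sketch. It is a toy model under stated assumptions, not the JSON-LD expansion algorithm: the function name `expand_language_map` and the constant `NO_LANGUAGE_KEYS` are hypothetical, and lowercasing and key-sorting are simplifications chosen to keep the output deterministic.

```python
# Assumption: the keys that the comment above suggests treating as "no language".
NO_LANGUAGE_KEYS = {"@none", "zxx", "und"}


def expand_language_map(lang_map: dict) -> list:
    """Expand a language map into value objects.

    Keys in NO_LANGUAGE_KEYS produce value objects without @language;
    every other key is used (lowercased) as the language tag.
    """
    values = []
    for key, value in sorted(lang_map.items()):
        # A map entry may hold one string or a list of strings.
        items = value if isinstance(value, list) else [value]
        for item in items:
            obj = {"@value": item}
            if key.lower() not in NO_LANGUAGE_KEYS:
                obj["@language"] = key.lower()
            values.append(obj)
    return values


title = {
    "en": "The Queen",
    "@none": "alternate value, without language",
}
expanded = expand_language_map(title)
```

Note the cost of treating zxx and und this way: a literal that really was tagged `und` in RDF would lose its tag on expansion, which is the open question about real-world data raised in this comment.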

We may want to consider what happens with RDF literals having zxx or und language tags, which I have never seen in real world data.

lanthaler · 2017-05-03T19:02:40Z

@lanthaler the idea is to not need to use different properties for such cases. If you have some dc:title properties, some with language, and some without, splitting these between different properties makes this less convenient, not more convenient for developers.

I'd argue that's not the case, at least not for the "" proposal. That feels very unnatural (to say the least) in basically every programming language. @none looks better but I feel that this is such a corner case with a not-too-bad workaround that I'd opt for consciously not solving it with syntactic sugar.

lanthaler · 2017-05-03T19:06:21Z

After re-reading @index a couple of times, I'm sorry but I don't see how this applies?

The proposal was to implicitly tag a string to have a language tag of "" or @none even though it wasn't there in the expanded form. To be consistent, we would need to do the same for index maps which I strongly think we should not do.

@index containers explicitly persist through compaction and expansion, and there's no way to generate them from RDF directly.

JSON-LD is a superset of RDF. We decided to guarantee lossless compaction/expansion but not roundtrips to RDF.

gkellogg · 2017-05-03T19:31:16Z

Empty-string issues aside, I disagree that using @none in an index or language map is unnatural; I think we need to get some votes from other parties. Please 👍 or 👎 this comment based on your agreement with the proposal.

PROPOSAL: Index Maps and Language Maps may include the @none key; When expanding, values do not receive an @index or @language entry. When compacting, value objects having neither @language, @index, or @type are included within the mapped values.

The alternative, which exists currently, is that such values cannot be serialized using such a term, and are serialized either using another matching term, or using an absolute IRI.

gkellogg · 2017-05-15T17:57:54Z

RESOLVED: Index Maps and Language Maps may include the @none key; When expanding, values do not receive an @index or @language entry. When compacting, value objects having neither @language, @index, or @type are included within the mapped values.
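
As a rough illustration of the compaction half of this resolution, the Python sketch below files value objects into a language map or an index map. It is a simplified model rather than the specified algorithm: `map_key` and `compact_to_map` are hypothetical names, and real term selection involves much more than this check.

```python
from typing import Optional


def map_key(value_object: dict, container: str) -> Optional[str]:
    """Return the map key for a value object, or None if it does not fit.

    `container` is "@language" or "@index".
    """
    if container in value_object:
        return value_object[container]
    # Per the resolution: objects with none of @language, @index, @type
    # are included in the map, here under the @none key.
    if not any(k in value_object for k in ("@language", "@index", "@type")):
        return "@none"
    return None


def compact_to_map(values: list, container: str) -> tuple:
    """Split values into (mapped, leftover); leftover needs another term or IRI."""
    mapped: dict = {}
    leftover: list = []
    for v in values:
        key = map_key(v, container)
        if key is None:
            leftover.append(v)
        else:
            mapped.setdefault(key, []).append(v["@value"])
    # Collapse single-item lists to a bare value, as compact form usually does.
    collapsed = {k: (vs[0] if len(vs) == 1 else vs) for k, vs in mapped.items()}
    return collapsed, leftover


values = [
    {"@value": "Fish", "@language": "en"},
    {"@value": "51234"},
    {"@value": "5", "@type": "xsd:integer"},
]
mapped, leftover = compact_to_map(values, "@language")
```

In this model the plain string lands under `@none`, while the typed literal is left over and would still be serialized with another matching term or an absolute IRI.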

niklasl · 2017-07-07T04:40:40Z

There might be a conflict here between language containers and the use of terms with an explicit language of null (i.e. string values). That is, for compaction in JSON-LD 1.0 you can define a language container term (e.g. labelByLang: {@id: rdfs:label, @container: @language}) and then a "companion term" catching the non-lang-tagged strings (e.g. labelString: {@id: rdfs:label, @language: null}). It would be great if this feature didn't interfere with that (it would be backwards incompatible otherwise).

gkellogg · 2018-01-20T19:31:10Z

Replaced by #569.

gkellogg added api spec-design syntax labels Apr 12, 2017

gkellogg added this to the JSON-LD 1.1 milestone Apr 12, 2017

gkellogg self-assigned this Apr 12, 2017

azaroth42 mentioned this issue May 4, 2017

Simplify language assignment? IIIF/api#755

Closed

gkellogg added a commit that referenced this issue Dec 5, 2017

Add inline ednotes for places that are affected by issue #480.

8c3826f

gkellogg added a commit that referenced this issue Jan 17, 2018

Adds indexing on @none, or an alias of @none for all types of map containers. Changes behavior for id maps to use `@none` instead of a blank node identifier. Fixes #480

42c89ee

gkellogg closed this as completed Jan 20, 2018

azaroth42 mentioned this issue Feb 3, 2018

Revisit empty string as term #584

Closed
