This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Spec questions/errors #12

Closed
anba opened this issue Jan 26, 2018 · 13 comments

@anba
Contributor

anba commented Jan 26, 2018

ApplyOptionsToTag

  • Step 3 should read:
    If tag matches the langtag production and does not match grandfathered,
    to cover the case of regular grandfathered language tags.
  • There are a few copy-paste errors in sub-steps of step 3 (wrong production names, wrong variable names).

FindExtension

Intl.Locale

  • I don't understand why ResolveLocale is used here. For example if you step through the relevant steps of ResolveLocale and BestFitMatcher, you'll realise that using ResolveLocale will always remove the script and region subtags, which doesn't really seem to be the expected result. 😉

Internal slots

  • all two- or three-character strings with code points in the range "a" through "z" doesn't cover all possible strings which can be generated from the language production.
  • Do we want to keep the same restrictions for "co-standard" and "co-search" which are applied for Intl.Collator?
  • Do we want to restrict the input for Intl.Locale if certain Unicode extensions values aren't supported in Intl.{Collator, NumberFormat, DateTimeFormat}?
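
To illustrate the first bullet above, here is a rough sketch of the `language` production from RFC 5646 as a regular expression, next to the "two or three characters" description. The names `languageProduction` and `twoOrThreeLetters` are made up for this example and are not part of the spec text.

```javascript
// Sketch of the RFC 5646 "language" production:
//   language = 2*3ALPHA ["-" extlang] / 4ALPHA / 5*8ALPHA
//   extlang  = 3ALPHA *2("-" 3ALPHA)
// `languageProduction` is a hypothetical name used only in this example.
const languageProduction =
  /^(?:[a-z]{2,3}(?:-[a-z]{3}(?:-[a-z]{3}){0,2})?|[a-z]{4}|[a-z]{5,8})$/i;

// The description quoted in the bullet: two or three characters in "a".."z".
const twoOrThreeLetters = /^[a-z]{2,3}$/;

// Print, for each candidate, whether it matches the production and
// whether it matches the narrower two-or-three-letter description.
for (const candidate of ["de", "zh-yue", "abcd", "abcdefgh"]) {
  console.log(
    candidate,
    languageProduction.test(candidate),
    twoOrThreeLetters.test(candidate)
  );
}
```

Strings with an extlang part, as well as the reserved four-letter and the five- to eight-letter forms, match the production but not the narrower description.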
@zbraniecki
Member

Do we want to restrict the input for Intl.Locale if certain Unicode extensions values aren't supported in Intl.{Collator, NumberFormat, DateTimeFormat}?

I don't think we want that. Intl.Locale is supposed to handle any BCP 47 language tag string, not just the ones we support in other formatters.

@anba
Contributor Author

anba commented Feb 6, 2018

#13 should fix all mentioned issues except for:

Step 3 should read:
If tag matches the langtag production and does not match grandfathered,
to cover the case of regular grandfathered language tags.

@littledan
Member

@anba My intention was to simply not support grandfathered language tags.

@anba
Contributor Author

anba commented Feb 6, 2018

Only irregular or also regular grandfathered language tags?

@littledan
Member

Oh sorry, I meant irregular grandfathered tags. So, the change you are suggesting seems important.

@anba
Contributor Author

anba commented Feb 8, 2018

Oh sorry, I meant irregular grandfathered tags. So, the change you are suggesting seems important.

Hmm, I'm a bit confused right now. The suggested change is only important if we also disallow regular grandfathered language tags.

@anba
Contributor Author

anba commented Feb 8, 2018

On second thought, the possible restriction should probably use "If tag matches neither the privateuse nor the grandfathered production", followed by "Assert: tag matches the langtag production", instead of "If tag matches the langtag production and does not match grandfathered", because the former more clearly states the two unsupported productions.
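
As a rough sketch of that check (not the spec algorithm itself), the condition could look like the following. The grandfathered list is the one from RFC 5646, section 2.1; the names `GRANDFATHERED`, `PRIVATEUSE` and `canApplyOptions` are hypothetical, and the input is assumed to already be a structurally valid language tag.

```javascript
// Grandfathered tags as listed in RFC 5646, section 2.1 (lowercased).
const GRANDFATHERED = new Set([
  // irregular
  "en-gb-oed", "i-ami", "i-bnn", "i-default", "i-enochian", "i-hak",
  "i-klingon", "i-lux", "i-mingo", "i-navajo", "i-pwn", "i-tao", "i-tay",
  "i-tsu", "sgn-be-fr", "sgn-be-nl", "sgn-ch-de",
  // regular
  "art-lojban", "cel-gaulish", "no-bok", "no-nyn", "zh-guoyu", "zh-hakka",
  "zh-min", "zh-min-nan", "zh-xiang",
]);

// privateuse = "x" 1*("-" (1*8alphanum))
const PRIVATEUSE = /^x(?:-[a-z0-9]{1,8})+$/i;

// Hypothetical helper: true when `tag` matches neither the privateuse nor
// the grandfathered production. Assumption: `tag` is already structurally
// valid, so any tag that passes must match the langtag production.
function canApplyOptions(tag) {
  return !PRIVATEUSE.test(tag) && !GRANDFATHERED.has(tag.toLowerCase());
}

console.log(canApplyOptions("de-Latn-DE"));  // true
console.log(canApplyOptions("cel-gaulish")); // false
console.log(canApplyOptions("x-private"));   // false
```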

@littledan
Member

I've tried to represent these questions on the agenda of the Intl call next Friday so we can get more input.

@anba
Contributor Author

anba commented Feb 10, 2018

There are at least two more issues with grandfathered tags:

1. If the input is a regular grandfathered tag without a modern replacement, appending Unicode extension sequences will create a non-valid language tag ("valid" in the sense of https://tools.ietf.org/html/rfc5646#section-2.2.9):

var loc = new Intl.Locale("cel-gaulish", {nu: "latn"});
print(loc.toString()); // "cel-gaulish-u-nu-latn"

Since cel-gaulish-u-nu-latn no longer matches the grandfathered production, it has to be interpreted as a langtag language tag, which in turn means gaulish has to be interpreted as a variant subtag. But gaulish is not a registered variant subtag, so the generated tag cel-gaulish-u-nu-latn is non-valid per RFC 5646.

While Intl.Locale in general doesn't care about non-valid language tags, we may still want to prevent generating non-valid language tags if the input itself was a valid language tag.
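
The reinterpretation described above can be sketched roughly as follows. `VARIANT_SYNTAX` follows the variant production of RFC 5646; `SAMPLE_REGISTERED_VARIANTS` is only a tiny, incomplete sample of registry entries picked for illustration, and `describeSecondSubtag` is a hypothetical helper, so a real validity check would need the full IANA registry.

```javascript
// Syntax of a variant subtag per RFC 5646: 5*8alphanum / (DIGIT 3alphanum)
const VARIANT_SYNTAX = /^(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3})$/i;

// A tiny, incomplete sample of registered variants; illustrative only.
const SAMPLE_REGISTERED_VARIANTS = new Set(["1901", "1996", "rozaj", "valencia"]);

// Hypothetical helper: look at the subtag that follows the language subtag
// and report whether it is shaped like a variant and whether it appears in
// the sample above.
function describeSecondSubtag(tag) {
  const [, second] = tag.toLowerCase().split("-");
  return {
    subtag: second,
    looksLikeVariant: VARIANT_SYNTAX.test(second),
    registeredInSample: SAMPLE_REGISTERED_VARIANTS.has(second),
  };
}

console.log(describeSecondSubtag("cel-gaulish-u-nu-latn"));
console.log(describeSecondSubtag("ca-valencia-u-nu-latn"));
```

Here gaulish has the shape of a variant subtag, which is why the generated tag is still well-formed even though it is not valid.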


2. ICU uses its own replacement for grandfathered tags without modern replacements (and en-GB-oed).

  • cel-gaulish → xtg-x-cel-gaulish
  • zh-min → nan-x-zh-min
  • en-GB-oed → en-GB-x-oed (Modern replacement available: en-GB-oxendict)
  • i-default → en-x-i-default
  • i-enochian → x-i-enochian
  • i-mingo → see-x-i-mingo

As with the other grandfathered-tag issues, this shouldn't matter much in practice, because I don't expect grandfathered tags to be used in normal code.

@zbraniecki
Member

Is there an option for us to not support grandfathered tags at all?

@srl295
Member

srl295 commented Mar 16, 2018

@anba

en-GB-oed → en-GB-x-oed (Modern replacement available: en-GB-oxendict)

Good catch. Filed IcuBug:13650

@littledan
Member

We discussed grandfathered tags at the last Intl meeting. The resolution, I believe, was to support regular and irregular grandfathered tags by mapping them each to modern tags, at the beginning of the process of parsing them. I believe this change was recommended for all of Intl, and not just Locale.

littledan added a commit that referenced this issue Jul 15, 2018
- Options which can't be applied to private use or grandfathered tags,
  whether script, region or Unicode extensions, trigger a RangeError
  in the constructor. (Closes #25 and #12)
- Insert additional calls to CanonicalizeLanguageTag for grandfathered
  tags, to enable options to be applied. (Closes #17)
- For grandfathered or private use tags, the entire tag is treated
  as the "language", both for the purposes of the
  Intl.Locale.prototype.language getter and for the language property
  of the options bag in the constructor.
  (Follows #25 (comment))
littledan added a commit that referenced this issue Jul 28, 2018
@littledan
Member

We've switched to Unicode BCP 47 Locale Identifiers, fixing the remaining issues.
