Normative: Require the latest available Unicode version instead of a fixed version number #620

Merged 1 commit on Jul 28, 2016

Conversation

mathiasbynens
Member

@mathiasbynens mathiasbynens commented Jun 23, 2016

As of June 21st, Unicode 9.0.0 is the latest version.

Update July 27: This PR has been updated to refer to the latest available Unicode version rather than v9.0.0 specifically, as per the July 27 meeting.

@bterlson
Member

Should probably do a review of the changes before committing to this. Due diligence and all of that. Can you comment on whether/which changes are relevant?

@domenic
Member

domenic commented Jun 23, 2016

No due diligence, I want my Unicode power symbol now!!

@mathiasbynens
Member Author

@bterlson Sure.

  • Space_Separator hasn’t changed between Unicode 8 and 9.
  • Unicode 8 has 109,830 ID_Start symbols; Unicode 9 has 117,007, i.e. 7,177 more.
  • Unicode 8 has 112,352 ID_Continue symbols; Unicode 9 has 119,691, i.e. 7,339 more.
  • Unicode 8 defines 1,245 simple case mappings (1,217 C + 28 S); Unicode 9 defines 1,325 (1,297 C + 28 S), i.e. 80 more.

Did I miss anything?
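
The simple case mappings (C + S) matter to ECMAScript because case-insensitive matching under the `u` flag canonicalizes characters with Unicode simple case folding. A minimal sketch of that effect, not taken from the PR; the helper name `foldsTogether` is made up for illustration:

```javascript
// Hypothetical helper: true when single character `a`, used as a pattern,
// matches `b` case-insensitively under the `u` flag (simple case folding).
function foldsTogether(a, b) {
  // No escaping is done here, so only pass plain letters or symbols.
  return new RegExp(`^${a}$`, 'iu').test(b);
}

const KELVIN = '\u212A'; // KELVIN SIGN, which simple-case-folds to U+006B "k"

console.log(foldsTogether(KELVIN, 'k')); // → true (both fold to "k")
console.log(/^\u212A$/i.test('k'));      // → false (legacy comparison without `u`)
```

New case mappings in a Unicode release can therefore change which strings an `iu` regular expression matches, which is why they are counted in the list above.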

@bterlson
Member

@mathiasbynens Additions don't seem worrying. Anything by way of "breaking changes" there? Removals from ID_Start/ID_Continue and the like?
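
To illustrate why removals would be breaking while additions are not: an identifier that parses today relies on its characters staying in ID_Start or ID_Continue. The sketch below is a simplified approximation, assuming an engine with Unicode property escapes (ES2018); it ignores `\u` escapes and reserved words, and `isIdentifierName` is a hypothetical name. Results for recently added characters depend on the Unicode version the engine ships.

```javascript
// Simplified IdentifierName check: first character is `$`, `_`, or ID_Start;
// the rest are `$`, ZWNJ, ZWJ, or ID_Continue.
const identifierName = /^[$_\p{ID_Start}][$\u200C\u200D\p{ID_Continue}]*$/u;

// Hypothetical helper wrapping the regular expression above.
function isIdentifierName(name) {
  return identifierName.test(name);
}

console.log(isIdentifierName('café')); // → true (letters are ID_Start and ID_Continue)
console.log(isIdentifierName('x1'));   // → true (digits are ID_Continue)
console.log(isIdentifierName('1x'));   // → false (digits are not ID_Start)
```

If a character were ever removed from these properties, names containing it would stop matching, and existing source code using them would stop parsing.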

@mathiasbynens
Member Author

mathiasbynens commented Jun 23, 2016

No removals from ID_Start, ID_Continue, or C or S case-fold mappings.

I’ve checked all of the above using the Unicode data files directly, but it can all be verified quite easily by running npm install unicode-8.0.0 unicode-9.0.0 and writing some quick Node.js scripts à la:

// Look for removals in `ID_Continue`:
const a = require('unicode-8.0.0/Binary_Property/ID_Continue/code-points.js');
const b = new Set(require('unicode-9.0.0/Binary_Property/ID_Continue/code-points.js'));
const diff = new Set(a.filter(x => !b.has(x)));
console.log(diff);
// → Set { }

Once we’ve established there are no removals, we can easily count the number of new symbols:

// Count the symbols added to `ID_Continue`:
const a = require('unicode-8.0.0/Binary_Property/ID_Continue/code-points.js').length;
const b = require('unicode-9.0.0/Binary_Property/ID_Continue/code-points.js').length;
console.log(b - a);
// → 7339

The same goes for other properties, e.g.:

// Look for removals in `S` case folding:
const a = Object.keys(require('unicode-8.0.0/Case_Folding/S/code-points.js'));
const b = new Set(Object.keys(require('unicode-9.0.0/Case_Folding/S/code-points.js')));
const diff = new Set(a.filter(x => !b.has(x)));
console.log(diff);
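
The removal and addition checks above can be folded into one reusable function. This is a sketch rather than anything from the PR: `diffCodePoints` is a hypothetical helper, and the sample arrays are made-up stand-ins for the arrays exported by the `code-points.js` modules.

```javascript
// Hypothetical helper: compare two arrays of code points and report which
// were removed and which were added between two Unicode versions.
function diffCodePoints(oldPoints, newPoints) {
  const oldSet = new Set(oldPoints);
  const newSet = new Set(newPoints);
  return {
    removed: oldPoints.filter(cp => !newSet.has(cp)),
    added: newPoints.filter(cp => !oldSet.has(cp)),
  };
}

// Made-up sample data; with the packages installed, pass the required
// `code-points.js` arrays instead.
const before = [0x41, 0x42, 0x43];
const after = [0x41, 0x42, 0x43, 0x1F600];

const { removed, added } = diffCodePoints(before, after);
console.log(removed.length); // → 0
console.log(added.map(cp => 'U+' + cp.toString(16).toUpperCase())); // → [ 'U+1F600' ]
```

An empty `removed` array corresponds to the "no removals" result, and `added.length` gives the number of new symbols.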

@bterlson
Member

Good point! I will explore some and report back.

@allenwb
Member

allenwb commented Jun 23, 2016

Unicode now seems to be on a yearly update schedule that slightly lags ECMA-262.

If our intent is to update these references every year, wouldn't it be better to use an open-ended reference to the current Unicode standard? In standards documents, a normative reference to another standard that does not include a specific version or date qualifier means the "current version".

@mathiasbynens
Member Author

@allenwb That’s what I proposed three years ago: https://bugs.ecmascript.org/show_bug.cgi?id=2071#c0

@rwaldron
Contributor

Whichever update process is used, Ecma-402 will need to be updated as well.

@bterlson
Member

@allenwb I think that is what we had consensus for as well. An open-ended reference seems fine but I was thinking a specific reference at least cued us to do the due diligence of looking for potential issues. I worry without that (and @mathiasbynens's expertise) we'd grow complacent :-P Happy to update to an open-ended reference though.

@littledan
Member

When I proposed upgrading to Unicode 8.0, I had no idea that an open-ended reference was a legal possibility for a spec, and I didn't realize that we had consensus on it. I thought the consensus was for annual bumps like this. I like Allen's idea. Let's keep doing the due diligence, but I don't think we need explicit bump commits to enforce that.

@bterlson
Member

bterlson commented Jun 24, 2016

Going through the notes I see that we had consensus for "8 or greater" which doesn't actually imply that we can use an unversioned reference. For now I won't take this PR, and will add to the agenda for Redmond that we discuss the unversioned reference.
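
For comparison, an "8 or greater" requirement is something a program can check against a concrete runtime, whereas "latest available" has no fixed number to test. The following is only a sketch: `meetsMinimumUnicode` is a hypothetical helper, and it assumes a Node.js build that reports its Unicode version in `process.versions.unicode` (other builds and runtimes may leave it undefined).

```javascript
// Hypothetical helper: does `version` (e.g. "9.0" or "9.0.0") satisfy a
// minimum major version such as 8? Returns false for missing input.
function meetsMinimumUnicode(version, minMajor) {
  const major = Number.parseInt(String(version).split('.')[0], 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Node.js builds with ICU expose their Unicode version here; it may be
// undefined in other builds or runtimes.
const runtimeVersion =
  typeof process !== 'undefined' && process.versions
    ? process.versions.unicode
    : undefined;

console.log(runtimeVersion, meetsMinimumUnicode(runtimeVersion, 8));
```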

@bterlson bterlson added the "needs consensus" label Jun 24, 2016
ljharb added a commit to tc39/agendas that referenced this pull request Jul 1, 2016
@mathiasbynens
Member Author

PR updated to refer to the latest available Unicode version, as per the July 27 meeting.

@mathiasbynens mathiasbynens changed the title from "Normative: Require Unicode 9.0.0" to "Normative: Require the latest available Unicode version instead of a fixed version number" Jul 28, 2016
@bterlson
Member

@mathiasbynens Looks great, thanks so much!

@bterlson bterlson merged commit 0eb8b2f into tc39:master Jul 28, 2016
@caridy
Contributor

caridy commented Aug 26, 2016

@mathiasbynens do you plan to update 402 as well?

@mathiasbynens
Member Author

@caridy tc39/ecma402#103

bterlson pushed a commit that referenced this pull request Sep 2, 2016
Instead of referring to a version snapshot, link to the latest version of UTR15.

Ref. #620.
caridy pushed a commit to tc39/ecma402 that referenced this pull request Sep 27, 2016
@mathiasbynens mathiasbynens removed the "needs consensus" label Jun 20, 2018
ljharb added a commit to ljharb/ecma262 that referenced this pull request Sep 17, 2019
 - 2016: the Unicode change affected what was considered whitespace (tc39#300 / 24dad16)
 - 2017: the latest version of Unicode is mandated (tc39#620)
 - 2018: changed tagged template literal objects to be cached per source location rather than per realm (tc39#890)
 - 2019: Atomics.wake was renamed to Atomics.notify (tc39#1220)
 - 2019: `await` was changed to require fewer ticks (tc39#1250)
Phanter5 added a commit to Phanter5/agendas that referenced this pull request Jul 5, 2024