Require Unicode 8.0.0 #300

littledan · 2016-01-19T21:32:47Z

Interpretation of some basic things like whitespace changed after
Unicode 5.1. This patch requires the latest Unicode standard.

mathiasbynens · 2016-01-19T22:52:10Z

Ref. https://bugs.ecmascript.org/show_bug.cgi?id=2071, especially https://bugs.ecmascript.org/show_bug.cgi?id=2071#c4.

littledan · 2016-01-19T23:37:10Z

A note with respect to that thread: Chrome runs on Windows 7, but it supports Unicode 8.0.0 (with a couple exceptions in V8 using 7.0, but a fix is in progress). I think we should be OK with saying that software which doesn't update to pretty recent Unicode versions isn't implementing the latest ECMAScript spec. @bterlson What do you think, as Microsoft?

bterlson · 2016-01-20T19:32:59Z

I am a fan of this personally, although I have concerns about how well Chakra will be able to support this, as we depend on the platform in many cases. Will dig into this more. In the meantime, this is a simple change, yet it is the kind of thing that gets discussed in committee. I bet we can get a quick sign-off without going through the normal proposal process.

littledan · 2016-02-08T13:39:01Z

At the January 2016 TC39 meeting, we reached consensus in support of this proposal. Is anything else needed to merge this? I fixed the merge conflict.

mathiasbynens · 2016-02-08T13:49:21Z

Relevant meeting notes: https://github.com/rwaldron/tc39-notes/blob/master/es7/2016-01/2016-01-26.md#unicode-fix-httpsgithubcomtc39ecma262pull300-de

It’s probably intentional, but just to make sure this is not being overlooked — this patch leaves the following section intact: https://tc39.github.io/ecma262/#sec-white-space

<p>ECMAScript implementations must recognize as <emu-nt><a href="#prod-WhiteSpace">WhiteSpace</a></emu-nt> code points listed in the “Separator, space” (Zs) category by Unicode 5.1. ECMAScript implementations may also recognize as <emu-nt><a href="#prod-WhiteSpace">WhiteSpace</a></emu-nt> additional category Zs code points from subsequent editions of the Unicode Standard.</p>

i.e. WhiteSpace is still based on Unicode 5.1.0 + Unicode 8 or later, meaning U+180E is considered whitespace. I’m not sure if this is a necessity for backwards compatibility.

littledan · 2016-02-08T15:47:21Z

Oh, I missed that section. Actually, this patch was partly motivated by getting U+180E out of whitespace! This was all discussed pretty explicitly at the meeting, so I'll upload a new patch with that modified.

mathiasbynens · 2016-02-08T19:23:03Z

👍

ECMAScript 6 required Unicode v5.1.0 `Zs` symbols to be recognized as whitespace in addition to any `Zs` symbols in whatever Unicode version the engine implemented. Per tc39/ecma262#300 this is no longer the case in ES2016. 🎉 The only observable change is that U+180E is no longer considered whitespace.
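The difference between the two rules can be sketched as a pair of predicates. This is a minimal illustration, assuming an engine that supports Unicode property escapes (`\p{Zs}`); the function names are hypothetical, and hard-coding U+180E as the only code point that was `Zs` in Unicode 5.1 but not in later versions is a shortcut based on the statement above, not something taken from the spec text or any engine.

```javascript
// Code points listed by name in the WhiteSpace production:
// TAB, VT, FF, SP, NBSP, ZWNBSP. Line terminators are a separate production.
const NAMED_WHITE_SPACE = new Set([0x0009, 0x000B, 0x000C, 0x0020, 0x00A0, 0xFEFF]);

// ES2016 rule (sketch): named code points plus anything in category Zs
// according to the Unicode version the engine ships with.
function isWhiteSpaceES2016(codePoint) {
  return NAMED_WHITE_SPACE.has(codePoint) ||
    /^\p{Zs}$/u.test(String.fromCodePoint(codePoint));
}

// ES6 rule (sketch): everything above, plus anything that was Zs in Unicode 5.1.
// Assumption: U+180E is the only code point for which that makes a difference.
function isWhiteSpaceES6(codePoint) {
  return isWhiteSpaceES2016(codePoint) || codePoint === 0x180E;
}

console.log(isWhiteSpaceES2016(0x180E), isWhiteSpaceES6(0x180E));
```

On an engine whose Unicode data is version 6.3 or newer, the first call is expected to return `false` and the second `true`; for every other code point the two predicates should agree.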

bterlson · 2016-02-10T23:24:36Z

Committed as 24dad16. Thanks @littledan!

littledan · 2016-02-11T09:18:31Z

Thanks for the reviews and landing, everyone!

It’s now part of the ECMAScript spec: tc39/ecma262#300 Closes #28.

@anba

Ref. https://hashseed.blogspot.com/2014/08/in-ecma-262-5.html
Ref. tc39/ecma262#300 (comment)
Ref. mathiasbynens/regexpu-core@9b10d2a

Fix and add @anba’s U+180E tests

This character (U+180E MONGOLIAN VOWEL SEPARATOR) was recategorized from Zs to Cf in Unicode 6.3, and remains categorized as such in Unicode 9.0 (current target version). The current version of ECMAScript requires that only characters classed as Zs in the target version of Unicode be recognized as Whitespace. This change is now reflected in Test262, so making this change will improve our Test262 score rather than regress it.

Additionally, update comments about location of UnicodeData.txt and about the definition of Whitespace characters.

See: tc39/ecma262#300
See: tc39/test262@3a5a09e
See: mathiasbynens/regexpu-core@9b10d2a
Fixes chakra-core#2120

ES2016 and later require Unicode 8.0 or later, which does not consider U+180E to be whitespace, unlike Unicode versions before 6.3. See: tc39/ecma262#300

Node.js seems to have incorporated this change some time after 8.1.2. This commit simply removes the tests that expected U+180E to be whitespace.
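A quick way to see which behaviour a given runtime has is to probe `\s` and `String.prototype.trim` directly. The snippet below is only a sketch; the helper name `whitespaceReport` is made up for illustration, and U+00A0 (NO-BREAK SPACE) is included as a control that is whitespace under both the old and new rules.

```javascript
const MVS = "\u180E";  // MONGOLIAN VOWEL SEPARATOR
const NBSP = "\u00A0"; // NO-BREAK SPACE, whitespace in every edition

// Hypothetical helper: reports how the running engine treats one character.
function whitespaceReport(ch) {
  return {
    matchesBackslashS: /\s/.test(ch),        // RegExp whitespace class
    removedByTrim: ch.trim().length === 0,   // String.prototype.trim
  };
}

console.log("U+180E:", whitespaceReport(MVS));
console.log("U+00A0:", whitespaceReport(NBSP));
```

On an engine that follows ES2016 or later, both flags should be `false` for U+180E and `true` for U+00A0. An older engine would report `true` for both characters, which is what the removed tests were checking for.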

https://bugs.webkit.org/show_bug.cgi?id=191415

Reviewed by Saam Barati.

JSTests:

* ChakraCore/test/es5/regexSpace.baseline:
* ChakraCore/test/es6/unicode_whitespace.js:
Update tests to latest version. (See chakra-core/ChakraCore@7c097b6.)
* test262.yaml:
* test262/config.yaml:
* test262/expectations.yaml:
Update expectations.

Source/JavaScriptCore:

Mongolian Vowel Separator stopped being a valid whitespace character as of ES2016. (tc39/ecma262#300)

* parser/Lexer.h:
(JSC::Lexer<UChar>::isWhiteSpace):
* runtime/ParseInt.h:
(JSC::isStrWhiteSpace):
* yarr/create_regex_tables:

LayoutTests:

* js/ToNumber-expected.txt:
* js/parseFloat-expected.txt:
* js/script-tests/ToNumber.js:
* js/script-tests/parseFloat.js:
Update tests and expectations.
* sputnik/Conformance/09_Type_Conversion/9.3_ToNumber/9.3.1_ToNumber_from_String/S9.3.1_A2-expected.txt:
* sputnik/Conformance/09_Type_Conversion/9.3_ToNumber/9.3.1_ToNumber_from_String/S9.3.1_A3_T1-expected.txt:
* sputnik/Conformance/09_Type_Conversion/9.3_ToNumber/9.3.1_ToNumber_from_String/S9.3.1_A3_T2-expected.txt:
* sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.2/15.10.2.12_CharacterClassEscape/S15.10.2.12_A1_T1-expected.txt:
* sputnik/Conformance/15_Native_Objects/15.10_RegExp/15.10.2/15.10.2.12_CharacterClassEscape/S15.10.2.12_A2_T1-expected.txt:
* sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.2/15.1.2.2_parseInt/S15.1.2.2_A2_T10-expected.txt:
* sputnik/Conformance/15_Native_Objects/15.1_The_Global_Object/15.1.2/15.1.2.3_parseFloat/S15.1.2.3_A2_T10-expected.txt:
* sputnik/Unicode/Unicode_410/S15.10.2.12_A1_T6-expected.txt:
* sputnik/Unicode/Unicode_410/S15.10.2.12_A2_T6-expected.txt:
* sputnik/Unicode/Unicode_410/S7.2_A1.6_T1-expected.txt:
* sputnik/Unicode/Unicode_500/S15.10.2.12_A1_T6-expected.txt:
* sputnik/Unicode/Unicode_500/S15.10.2.12_A2_T6-expected.txt:
* sputnik/Unicode/Unicode_500/S7.2_A1.6_T1-expected.txt:
* sputnik/Unicode/Unicode_510/S15.10.2.12_A1_T6-expected.txt:
* sputnik/Unicode/Unicode_510/S15.10.2.12_A2_T6-expected.txt:
* sputnik/Unicode/Unicode_510/S7.2_A1.6_T1-expected.txt:
Let outdated sputnik checks fail.

git-svn-id: http://svn.webkit.org/repository/webkit/trunk@238004 268f45cc-cd09-0410-ab3c-d52691b4dbfc
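Judging by the updated ToNumber, parseInt and parseFloat expectations, the change also shows up in string-to-number conversion, where leading and trailing whitespace is skipped. The following is a small sketch of that effect; `numericReport` is a hypothetical helper, not part of WebKit or any test suite mentioned here.

```javascript
const SEPARATOR = "\u180E"; // MONGOLIAN VOWEL SEPARATOR
const NO_BREAK = "\u00A0";  // NO-BREAK SPACE, used as a control

// Hypothetical helper: wraps "42" in the given padding character and
// converts it with the three built-in string-to-number paths.
function numericReport(pad) {
  const text = pad + "42" + pad;
  return {
    number: Number(text),
    parseInt: parseInt(text, 10),
    parseFloat: parseFloat(text),
  };
}

console.log("U+180E padding:", numericReport(SEPARATOR));
console.log("U+00A0 padding:", numericReport(NO_BREAK));
```

With an ES2016-conforming engine, padding with U+180E should produce `NaN` from all three conversions, because the character is no longer skipped as whitespace, while padding with U+00A0 should still produce `42`.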

Summary: ES2016 updated the list of whitespace from Unicode 5.1 to Unicode 8. Between those versions, U+180E (MONGOLIAN VOWEL SEPARATOR) was moved from `Zs` to `Cf`, so it is no longer whitespace. tc39/ecma262#300 (comment)

Reviewed By: samwgoldman
Differential Revision: D15249134
fbshipit-source-id: 228560604fbde3567e86dfd281a141f930b0e347

…ns (tc39#1698)

- 2016: the Unicode change affected what was considered whitespace (tc39#300 / 24dad16)
- 2017: the latest version of Unicode is mandated (tc39#620)
- 2018: changed tagged template literal objects to be cached per source location rather than per realm (tc39#890)
- 2019: Atomics.wake was renamed to Atomics.notify (tc39#1220)
- 2019: `await` was changed to require fewer ticks (tc39#1250)

* Treat single `let` as Identifier in parsing ExpressionStatement
* The `let` contextual keyword must not contain Unicode escape sequences.
* U+180E had been changed to `Other, Format [Cf]` from `Separator, Space [Zs]`; see tc39/ecma262#300, chakra-core/ChakraCore#2120
* The `let` contextual keyword must not contain Unicode escape sequences
* We need to allocate new env on right side of for in-of head when there is lexical decl on left side

Signed-off-by: Seonghyun Kim <[email protected]>

bterlson added the needs consensus label Jan 20, 2016

mathiasbynens mentioned this pull request Feb 8, 2016

Remove Unicode version requirement whatwg/javascript#28

Closed

littledan force-pushed the unicode branch from 78129b7 to 40a921e Compare February 8, 2016 13:38

littledan force-pushed the unicode branch from 40a921e to 83ecc0b Compare February 8, 2016 15:48

rwaldron mentioned this pull request Feb 8, 2016

Align normative external reference with ES2016: Unicode 8.0.0 tc39/ecma402#72

Merged

Require Unicode 8.0.0

536f361

Interpretation of some basic things like whitespace changed after Unicode 5.1. This patch requires the latest Unicode standard.

littledan force-pushed the unicode branch from 83ecc0b to 536f361 Compare February 10, 2016 10:18

bterlson closed this Feb 10, 2016

mathiasbynens added a commit to whatwg/javascript that referenced this pull request Feb 11, 2016

Remove Unicode database version requirement

28eea7e

It’s now part of the ECMAScript spec: tc39/ecma262#300

mathiasbynens added a commit to whatwg/javascript that referenced this pull request Feb 11, 2016

Remove Unicode database version requirement

4f1a517

It’s now part of the ECMAScript spec: tc39/ecma262#300 Closes #28.

michaelficarra mentioned this pull request Feb 11, 2016

upgrade to Unicode 8.0.0 shapesecurity/shift-parser-js#263

Closed

mathiasbynens mentioned this pull request Feb 26, 2016

Fix a potentially failing smoke test thlorenz/redeyed#5

Closed

mathiasbynens mentioned this pull request Jun 29, 2016

Ensure U+180E is no longer considered whitespace tc39/test262#709

Closed

This was referenced Dec 5, 2016

[RegExp] Remove U+180E MONGOLIAN VOWEL SEPARATOR from Whitespace classification. chakra-core/ChakraCore#2120

Closed

Removed U+180E MONGOLIAN VOWEL SEPARATOR from Whitespace classification. chakra-core/ChakraCore#2121

Merged

Xotic750 mentioned this pull request Sep 30, 2017

Node 8.3 and 8.4 do not throw TypeError: parseInt(Symbol('')); es-shims/es5-shim#450

Closed

ljharb mentioned this pull request Sep 16, 2019

Editorial: add missing Annex E entries #1698

Merged

dbalde mentioned this pull request Jun 21, 2023

[Snyk] Security upgrade ecmarkup from 4.0.0 to 7.1.0 dbalde/ecma262#57

Open
