Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Node 18.13 ICU 72 Version Breaking Change Date/Time format #46123

Closed
CreativeTechGuy opened this issue Jan 6, 2023 · 24 comments
Closed

Node 18.13 ICU 72 Version Breaking Change Date/Time format #46123

CreativeTechGuy opened this issue Jan 6, 2023 · 24 comments

Comments

@CreativeTechGuy
Copy link

Version

18.13.0

Platform

N/A

Subsystem

No response

What steps will reproduce the bug?

Related #45171. The whitespace before AM/PM in a time is now U+202F rather than a space character.

How often does it reproduce? Is there a required condition?

100% reproduction on the affected Node versions

What is the expected behavior?

Same behavior as previous. A space character to separate instead of unicode whitespace.

What do you see instead?

A NARROW NO-BREAK SPACE (U+202f) character instead of a space character.

Additional information

This is a breaking change, especially for automated tests. Upon upgrading to 18.13 our CI builds started to fail when asserting that a datetime string was what we expected. We eventually tracked this down to the change mentioned above. I understand the change itself isn't a bug, but the introduction of this in a minor LTS version seems like a bug.

While it's partially user error for many cases of misusing these formatted strings, it's unexpected for this to change in a minor version and a very difficult problem to track down.

@bnoordhuis
Copy link
Member

Thanks for the report but it's been reported before and it's not considered a bug or backwards incompatible change in behavior. Tests that break because of this change are making bad assumptions about localized strings.

@bnoordhuis bnoordhuis closed this as not planned Won't fix, can't repro, duplicate, stale Jan 7, 2023
kazupon added a commit to intlify/vue-i18n that referenced this issue Jan 12, 2023
kazupon added a commit to intlify/vue-i18n that referenced this issue Jan 12, 2023
…Format components on SFC template and JSX/TSX (#1310)

* fix: support type inference of Translation, NumberFormat and DatetimeFormat components on SFC template and JSX/TSX

* fix: timezone setting

* fix: pinned node v18 minor version

* fix

* fix: disalbe node v18

related: nodejs/node#46123
related: nodejs/node#45171
@kleinfreund
Copy link

kleinfreund commented Jan 13, 2023

Tests that break because of this change are making bad assumptions about localized strings.

That may well be, but it doesn’t change the fact that this change easily breaks all sorts of tests. One notable example are snapshot tests that contain date strings formatted in this manner. They will break with this change and aside from the consideration whether snapshot tests are a good idea or not, there can’t be any doubt that there was no way to make a "bad assumption about localized strings" when utilizing them.

For now, I’m restricting Node.js 18 to >=18 <18.13.0 via the engines field (and 18.12 for the .nvmrc) in my project to workaround the broken tests. Once 18.13 is distributed sufficiently across systems (I don’t get it yet on Ubuntu), I’ll change that to >=18.13.0 and regenerate the snapshots.

@markcarroll
Copy link

Thanks for the report but it's been reported before and it's not considered a bug or backwards incompatible change in behavior. Tests that break because of this change are making bad assumptions about localized strings.

@bnoordhuis how is this not a breaking change?

The behavior that has been the same for years has suddenly started causing regular expression matchers to fail. The \s whitespace matcher in regex has not been updated to include this character. If that change had accompanied this update that would been more excusable, but making a change that breaks existing behavior like this should not be in a minor release IMO.

At the very least it would be helpful if this were noted in the release docs so that folks trying to track down the problem have an easier change of finding it.

@bnoordhuis
Copy link
Member

At the very least it would be helpful if this were noted in the release docs

It was. The update to ICU to 72.1 was called out as a notable change.

@CreativeTechGuy
Copy link
Author

It was. The update to ICU to 72.1 was called out as a notable change.

I hope you can understand how to someone who isn't familiar with the nodejs codebase or these core libraries that it's unclear about this change. I spent ~3 hours digging through trying to root cause this and it wasn't until I found a thread about Node 19 which referenced ICU that I even was pointed in that direction. Before a week ago, I never had even heard of ICU or had any clue what it did. And even now, after looking at their repo, it's not clear that the time format was changed.

The ICU 72.1 changelog just says that they updated Unicode 15, CLDR 42, time zone data version 2022e, etc. Which those are now a bunch more things that you'd need to dig into to find where exactly this was changed. Honestly, I still haven't found the exact commit in any repo which changed this exact thing.

The Node changelog says: "deps: update ICU to 72.1". It would take a lot for a developer to make sense of that and I'm not even sure the steps they'd need to take to determine from that line that the above change was a result of that.

Out of curiosity, where exactly did the change take place? And was it called out in the ICU changelog somewhere that I missed?

@bnoordhuis
Copy link
Member

The ICU update includes a CLDR update (from the Unicode consortium) which is the "human" representation of dates, numbers, etc. See #45945 (comment) for an example.

The "human" part is also why this isn't considered a breaking change or a regression. i18n is for humans, not machines.

@CreativeTechGuy
Copy link
Author

The "human" part is also why this isn't considered a breaking change or a regression. i18n is for humans, not machines.

Can you confirm if Node supports automated testing use-cases? Automated tests are machines which imitate humans to ensure that something works as expected for a human. It's not like we are trying to parse the strings, just compare them to a known value to ensure our code works as expected. It seems like most people facing problems due to this (on this thread and others) is a result of automated tests finding a difference due to the space character.

@bnoordhuis
Copy link
Member

I'm aware that's what people are running into and no, we don't consider that a bug. Tests shouldn't make assumptions about the format of localized outputs because those can change at a whim (or the Unicode Consortium's whim anyway.)

Automated tests are machines which imitate humans to ensure that something works as expected for a human.

Clearly not imitating humans well enough. In general we don't break down in the presence of a non-breaking space.

(Unmatched parentheses on the other hand...)

@kleinfreund
Copy link

The "human" part is also why this isn't considered a breaking change or a regression. i18n is for humans, not machines.

A software's tests are for humans, too. And they break now.

@bnoordhuis
Copy link
Member

If you think that kind of argument is going to sway our position: it won't. If you're just trying to get the last word in: please don't.

@gu-stav
Copy link

gu-stav commented Jan 14, 2023

While I understand your position, I just wanted to add this change started breaking snapshot tests for us at https://github.com/strapi/strapi. It came as a surprise in a minor version and took us some time to understand why test failures like the following started appearing because first we thought it might stem from the testing library (jest):

Screenshot 2023-01-14 at 11 17 04

I believe this could have been communicated better or moved into a major version. To run snapshot tests across different node versions now requires mocking every instance of Date.

kleinfreund pushed a commit to kumahq/kuma-gui that referenced this issue Jan 16, 2023
Requires Node.js v18.13 throughout the project to address a breaking change in v18.13 (see nodejs/node#46123).

Signed-off-by: Philipp Rudloff <[email protected]>
joaquimrocha added a commit to headlamp-k8s/headlamp that referenced this issue Jan 16, 2023
Node has changes how they format the dates when using "toLocaleString",
introducing a special character (U+202f) instead of a space.
Our first instinct was to lower the Node version and wait for a fix,
but according to [1], Node doesn't seem to be reverting it soon and
mentions one shouldn't use localized strings for testing...

So, in order to avoid this, this patch makes the dates show up in
ISO format.

[1] nodejs/node#46123
@hubertlepicki
Copy link

hubertlepicki commented Jan 20, 2023

@bnoordhuis this change is causing a lot of trouble, to a lot of folks. The bugs that surface are hard to debug.

Moreover, this change also introduces another incompatibility between the JavaScript of the browser, and the JavaScript of the server (or the test environment).

If you use (new Date()).toLocaleString(), or one of the related functions in your server-rendered (or test-rendered) templates, they will output something different than the same render done on the client, in the browser.

This thing broke our integration tests, where we took server-rendered HTML and compared it with client-rendered HTML, and I spent a day trying to debug the issue.

@targos
Copy link
Member

targos commented Jan 20, 2023

If you use (new Date()).toLocaleString(), or one of the related functions in your server-rendered (or test-rendered) templates, they will output something different than the same render done on the client, in the browser.

This is true for now, but only temporarily. V8 also updated ICU to version 72 and it will ship in Chrome/Edge 110.

@hubertlepicki
Copy link

@targos oh boy that will be interesting when this rolls out.

@buzinas
Copy link

buzinas commented Jan 27, 2023

You're the maintainer and you can decide whatever is best for the project whatsoever.

That said, if you write some code that produces an output on version v18.12 and the same code outputs a different thing on v18.13, it is a breaking change – it doesn't matter if it's for humans, machines or monkeys. There is no argument here.

NodeJS broke SEMVER and as a consequence, NodeJS is wasting thousands of hours of people's time because of that.

PS: And no, I'm not trying to sway your position (because it's clear that you don't know what a breaking change means), and I'm not trying to get the last word in either, just wanted to share my frustration with everyone reading this thread.

@bnoordhuis
Copy link
Member

A gentle warning: this is not the place to vent your frustration, that's what twitter is for. Comments like ^ add nothing, they just clog up people's inboxes.

@Morl99
Copy link

Morl99 commented Feb 2, 2023

Could you pin this thread so that people can find this easily? This hurts quite a bite, and I guess there is nothing to do but to communicate it the best way possible.

FirelightFlagboy added a commit to Scille/parsec-cloud that referenced this issue Feb 23, 2023
Currently our unit test started to fail because of a change introduced in node-18.13.0 (nodejs/node#46123).
So using a version before that change would correct the test that fail (#4071).

Changes
-------

- Change node-version in `ci.yml` to `18.12.0`
- Use github action `actions/setup-node` in `Web tests`
ethanmick added a commit to ethanmick/lets-build-clock that referenced this issue Feb 23, 2023
This was an annoying little bug. On certain versions of Node.js the
function `Date.prototype.toLocaleString()` does not use a regular
unicode space for the seperator, but rather a U+202F Narrow No-Break
Space.

```
> Buffer.from('8:30 AM')
<Buffer 38 3a 33 30 e2 80 af 41 4d>

// E280AF is Narrow No-Break Space!
// https://codepoints.net/U+202F

> Buffer.from('8:30 AM')
<Buffer 38 3a 33 30 20 41 4d>
```

This is only true for the Node.js side though, not the browser rendered
version. So when then client renders the string it uses a normal space,
resulting in a hydration error mismatch.

nodejs/node#46123

There are two ways to fix this.

1. Clean the string on the server side by removing the unusual
   whitespace.
2. Only render the time on the browser

While option 1 seems okay initially, it's open to new issues in the
future. The truth is that the code can't make assumptions about how the
string will appear and we are rendering it on two differnt platforms,
Node.js and the client browser. Different browsers may render the string
even more differently and there will still be a mismatch. V8 vs Gecko
for example.

Therefore, the best solution is to just render the final string on the
client. To do so, we move the only the time rendering component into
it's own file and dynamically import it. Also turn server side rendering
off.

This isn't ideal, but because of the complexity of strings and dates,
it's the only way to ensure there is no hydration mismatch.

Reported by Sivanth_P http://www.youtube.com/channel/UCXWFDIEFQwbIzxU-nkHV90w

Thanks!
ethanmick added a commit to ethanmick/lets-build-clock that referenced this issue Feb 23, 2023
This was an annoying little bug. On certain versions of Node.js the
function `Date.prototype.toLocaleString()` does not use a regular
unicode space for the seperator, but rather a U+202F Narrow No-Break
Space.

```
> Buffer.from('8:30 AM')
<Buffer 38 3a 33 30 e2 80 af 41 4d>

// E280AF is Narrow No-Break Space!
// https://codepoints.net/U+202F

> Buffer.from('8:30 AM')
<Buffer 38 3a 33 30 20 41 4d>
```

This is only true for the Node.js side though, not the browser rendered
version. So when then client renders the string it uses a normal space,
resulting in a hydration error mismatch.

nodejs/node#46123

There are two ways to fix this.

1. Clean the string on the server side by removing the unusual
   whitespace.
2. Only render the time on the browser

While option 1 seems okay initially, it's open to new issues in the
future. The truth is that the code can't make assumptions about how the
string will appear and we are rendering it on two differnt platforms,
Node.js and the client browser. Different browsers may render the string
even more differently and there will still be a mismatch. V8 vs Gecko
for example.

Therefore, the best solution is to just render the final string on the
client. To do so, we move the only the time rendering component into
it's own file and dynamically import it. Also turn server side rendering
off.

This isn't ideal, but because of the complexity of strings and dates,
it's the only way to ensure there is no hydration mismatch.

Reported by Sivanth_P http://www.youtube.com/channel/UCXWFDIEFQwbIzxU-nkHV90w

Thanks!
juanarbol pushed a commit that referenced this issue Mar 5, 2023
Original commit message:

    [intl] Revert date formatting behavior change from ICU 72

    Replace U+202F with U+0020 after formatting date. This lets websites
    continue to work without any changes.

    This matches Firefox behavior, according to
    https://bugzilla.mozilla.org/show_bug.cgi?id=1806042#c17.

    Bug: chromium:1414292, chromium:1401829, chromium:1392814
    Change-Id: I7c2b58414d0890f8705e737f903403dc54e5fe57
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/4237675
    Commit-Queue: Adam Klein <[email protected]>
    Reviewed-by: Shu-yu Guo <[email protected]>
    Cr-Commit-Position: refs/heads/main@{#85757}

Refs: v8/v8@90be99f
PR-URL: #46646
Refs: #46123
Reviewed-By: Jiawen Geng <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
targos added a commit that referenced this issue Mar 13, 2023
Original commit message:

    [intl] Revert date formatting behavior change from ICU 72

    Replace U+202F with U+0020 after formatting date. This lets websites
    continue to work without any changes.

    This matches Firefox behavior, according to
    https://bugzilla.mozilla.org/show_bug.cgi?id=1806042#c17.

    Bug: chromium:1414292, chromium:1401829, chromium:1392814
    Change-Id: I7c2b58414d0890f8705e737f903403dc54e5fe57
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/4237675
    Commit-Queue: Adam Klein <[email protected]>
    Reviewed-by: Shu-yu Guo <[email protected]>
    Cr-Commit-Position: refs/heads/main@{#85757}

Refs: v8/v8@90be99f
PR-URL: #46646
Refs: #46123
Reviewed-By: Jiawen Geng <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
juliangruber added a commit to filecoin-station/core that referenced this issue Mar 22, 2023
juliangruber added a commit to filecoin-station/core that referenced this issue Mar 22, 2023
* ci: add macOS

* fix test

nodejs/node#46123
BethGriggs pushed a commit that referenced this issue Mar 27, 2023
Original commit message:

    [intl] Revert date formatting behavior change from ICU 72

    Replace U+202F with U+0020 after formatting date. This lets websites
    continue to work without any changes.

    This matches Firefox behavior, according to
    https://bugzilla.mozilla.org/show_bug.cgi?id=1806042#c17.

    Bug: chromium:1414292, chromium:1401829, chromium:1392814
    Change-Id: I7c2b58414d0890f8705e737f903403dc54e5fe57
    Reviewed-on: https://chromium-review.googlesource.com/c/v8/v8/+/4237675
    Commit-Queue: Adam Klein <[email protected]>
    Reviewed-by: Shu-yu Guo <[email protected]>
    Cr-Commit-Position: refs/heads/main@{#85757}

Refs: v8/v8@90be99f
PR-URL: #46646
Refs: #46123
Reviewed-By: Jiawen Geng <[email protected]>
Reviewed-By: Ben Noordhuis <[email protected]>
Reviewed-By: Richard Lau <[email protected]>
Reviewed-By: Colin Ihrig <[email protected]>
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Ruben Bridgewater <[email protected]>
zburke added a commit to folio-org/stripes-components that referenced this issue Apr 26, 2023
ICU 72 is the worst ICU ever. EVER. When is whitespace not actually
whitespace? When it's narrow no-break space whitespace. So so unamusing.

See also: nodejs/node#46123
zburke added a commit to folio-org/stripes-components that referenced this issue Apr 26, 2023
* `webpack` was unlocked from `~5.68.0` in tip-of-master branches which
  are leveraged here in CI, so even though the correct value is in place
  in `devDependencies`, newer versions are sneaking in via the CI builds
  of `stripes-cli` and `eslint-config-stripes`.

* handle stupid Stupid STUPID whitespace changes in timestamps
  ICU 72 is the worst ICU ever. EVER. When is whitespace not actually
  whitespace? When it's narrow no-break space whitespace. So so unamusing.
  See also: nodejs/node#46123

Refs STCOM-1153
@meepybub
Copy link

meepybub commented May 8, 2023

Curious if this will rolled back seeing how many dev hours have been spent figuring out the difference between ' ' and ' '...

Also, what is the best practice workaround here? I need to make sure my localized, formatted date is rendered correctly. Add replaceAll(NNSB, ' ')? Or add replaceAll(' ', NNSB)? Or slice my strings up to the NNSB? Every commit above does it differently. There doesn't seem to be a single good way of doing it so I reckon the owner has his own recommendation.

Human machine dichotomy aside we all need automated unit tests for localization code, so this issue won't go away, likely ever.

@kleinfreund
Copy link

@meepybub This has been rolled back.

18.13 shipped with the breaking change and 18.15 shipped with the revert.

@ecki
Copy link

ecki commented Oct 24, 2023

(Funfact Java also had this discussion but they warned in an outreach https://inside.java/2023/03/28/quality-heads-up/)

@stabai
Copy link

stabai commented Oct 25, 2023

Also, what is the best practice workaround here? I need to make sure my localized, formatted date is rendered correctly. Add replaceAll(NNSB, ' ')? Or add replaceAll(' ', NNSB)? Or slice my strings up to the NNSB? Every commit above does it differently. There doesn't seem to be a single good way of doing it so I reckon the owner has his own recommendation.

Human machine dichotomy aside we all need automated unit tests for localization code, so this issue won't go away, likely ever.

This is the crux of why this seems like it should be important to people. The least bad workaround I see to this is to have your tests validate with regex and replace all the spaces in your expected string with \s. The idea that you would do this for all of your tests of this nature seems like an anti-pattern though. I don't think making tests resilient against this sort of behavioral change is a good idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests