-
Notifications
You must be signed in to change notification settings - Fork 29.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: deps: checkin a full ICU #21676
Conversation
Main discussion: #19214 |
linter works locally for me Edit: missed the "markdown linter not installed" thing.. |
I renamed the icu directory from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rubber stamp lgtm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I renamed the icu directory from
deps/icu-small
todeps/icu-full
to keep from there being any confusion about what is checked in. I guess it would reduce churn a bit to keep the same name and just usedeps/icu-small
for the full ICU data.
+1 for renaming, although you will also need to update tools/license-builder.sh
and doc/api/intl.md#options-for-building-nodejs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we just use the name deps/icu
? The full
part won't age well, and the checkout is not technically full since unnecessary files are removed.
tools/icu/README.md
Outdated
|
||
You are ready to check in the updated `deps/small-icu`. This is a big commit, | ||
You are ready to check in the updated `deps/full-icu`. This is a big commit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
icu-full
or just icu
, as I recommended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you do configure --with-icu-source=somewhere/ssomewhere/icu4c-62_1-src.tgz
then deps/icu
gets created with the contents of the specified tarball. so deps/icu
is already taken, and expected to be transient.
We could also just call it small-icu
- it is smaller (trimmed out unneeded stuff), just not small in data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, in that case I'd rather rename the transient directory if that's not too much work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ok, i misread this , and i implemented renaming the modified source to dep/icu-small
(even though it has all locales, just smaller source)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- will fix this up
56f5608
to
28ca326
Compare
@@ -38,26 +38,25 @@ in [BUILDING.md][]. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier in this file there's a section:
Node.js (and its underlying V8 engine) uses ICU to implement these features in native C/C++ code. However, some of them require a very large ICU data file in order to support all locales of the world. Because it is expected that most Node.js users will make use of only a small portion of ICU functionality, only a subset of the full ICU data set is provided by Node.js by default. Several options are provided for customizing and expanding the ICU data set either when building or running Node.js.
The underlined portion should probably be edited.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- will do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically, you could still build ICU with English only, but you'd have to do your own ICU build to get it. Such tools should live in ICU-land, not in Node.
That'd be my preference too, if it's not too much of a hassle. Which semver tag should we give this PR? I'm leaning towards |
@@ -78,7 +78,7 @@ UnhandledEngine::findBreaks( UText *text, | |||
int32_t /* startPos */, | |||
int32_t endPos, | |||
UVector32 &/*foundBreaks*/ ) const { | |||
UChar32 c = utext_current32(text); | |||
UChar32 c = utext_current32(text); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kind of odd that this ICU bump contains quite a lot of trailing whitespace changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happens every.single.time. I usually locally re-land with --fix-whitespace=all
and then push the landed version back to the branch, but i messed something up in the process this time.
upstream ICU still has trailing whitespace.
doc/api/intl.md
Outdated
If the `small-icu` option is used, one can still provide additional locale data | ||
at runtime so that the JS methods would work for all ICU locales. Assuming the | ||
If the `full-icu` option is used, one can still provide additional data | ||
at runtime. (This data does not override the built in data.) Assuming the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"built-in"
BUILDING.md
Outdated
```console | ||
> .\vcbuild full-icu download-all | ||
``` | ||
enabled by default. The `small-icu` option has been removed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The previously available small-icu
option has been removed"
You know, perhaps I should accept Also, in |
It seems this section also needs updating (particularly, the "Encodings Supported by Default (With ICU)" part): |
Sounds good. I'd make sure to print a warning that the option was removed and |
doc/api/intl.md
Outdated
### `small-icu` | ||
|
||
Support for `small-icu` (English only) has been removed. See ICU’s | ||
documentation for information on how to customize your ICU build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1310723
to
8732b12
Compare
ok. a lot of churn due to renaming |
- remove mechanism for 'small' icu - fix license-builder - remove a copy step in icu-generic - update docs Fixes: nodejs#19214
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, I wouldn't go with LFS, it would only introduce additional complexity, and the files aren't that big after all.
Is it worth to consider this as Node.js 11 feature? |
@vsemozhetbyt could be, but needs some consensus on direction. (NB this PR could be updated to 63.1 after #23244 lands ) |
What are the benefits of this over the current setup? AFAICT we could set |
Refs: #24039 |
@srl295 Is this still a thing that's being worked on? |
Closing due to long inactivity and it seems like there is preference for a different approach. @srl295 please reopen / open a new PR in case you would like to work on this again. |
🚧
Fixes: Building with full-icu by default #19214
To be based on #21452
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes