Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

src: fix generation of path objects in Windows #56696

Closed
wants to merge 2 commits into from

Conversation

yamachu
Copy link
Contributor

@yamachu yamachu commented Jan 22, 2025

fix: #56650
ref: #56657

This PR fixes a problem that caused a segmentation fault in module resolution when creating a require object with multibyte characters in a non-English environment.

Since the issue occurred when generating a std::filesystem::path object, some patches were applied to the changed areas in the following PR.

a7dad43

Since the issue was reproduced only in different locales, such as ja-JP on Windows, I rewrote the locale in the CI of coverage-windows to run the test.
https://github.com/yamachu/node/pull/2/files#diff-29094741d50149aa772b3e577ad509116bad722ad2de85689b6cb2c01e806a46

.github/workflows/coverage-windows.yml

+      - name: Change locale ja-JP for testing on SJIS environment
+        run: Set-WinSystemLocale -SystemLocale "ja-JP"
+      # to avoid configure, nobuild and noprojgen is needed
+      - name: Test on SJIS environment
+        run: ./vcbuild.bat nobuild noprojgen test-ci-js; node -e 'process.exit(0)'
+        env:
+          NODE_V8_COVERAGE: ./coverage/tmp

The previous PR did not solve the problem when using unicode, so the process was separated for windows and the path object was generated.

The part where std::filesystem::path is used can be fixed by changing this PR.

The results of the test conducted in a Japanese environment can be seen below.
https://github.com/yamachu/node/actions/runs/12903928231/job/35980111577#step:12:16844

@nodejs-github-bot nodejs-github-bot added c++ Issues and PRs that require attention from people who are familiar with C++. needs-ci PRs that need a full CI run. labels Jan 22, 2025
Copy link

codecov bot commented Jan 22, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.20%. Comparing base (d978610) to head (76bd5d1).
Report is 22 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #56696      +/-   ##
==========================================
- Coverage   89.21%   89.20%   -0.01%     
==========================================
  Files         662      662              
  Lines      191945   191978      +33     
  Branches    36948    36953       +5     
==========================================
+ Hits       171238   171258      +20     
- Misses      13549    13555       +6     
- Partials     7158     7165       +7     
Files with missing lines Coverage Δ
src/node_file.cc 77.23% <ø> (ø)
src/node_modules.cc 78.91% <100.00%> (+0.13%) ⬆️
src/util-inl.h 83.98% <ø> (ø)

... and 45 files with indirect coverage changes

Copy link
Member

@anonrig anonrig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A similar already landed last August: #54653

We should use ToU8StringView() rather than introducing a new API.

@yamachu
Copy link
Contributor Author

yamachu commented Jan 23, 2025

@anonrig

Thanks for the review!
Does this mean in short that I should use std::u8string?

However, there is a problem with the u8string approach.
As I show in the description, the test-require-unicode.js test fails when I run code using u8string in SJIS and en-US(en-US...? e.g. GHA Windows Runner default locale)environment on Windows (so affected ALL WINDOWS).

Therefore, I provided and used an API to handle wstrings without using ToU8StringView.

@anonrig
Copy link
Member

anonrig commented Jan 23, 2025

I recommend improving existing solution to fit both cases rather than implementing a new solution.

@yamachu
Copy link
Contributor Author

yamachu commented Jan 23, 2025

What are "both cases" presented here?

I am sure that ToU8StringView is not used in current codebase.
Since it is not being used, it does not currently solve the problem.
It is not a solution because I know that using it again will cause problems.

I don't think that trying hard to use u8string in path is a better way to go....

@yamachu
Copy link
Contributor Author

yamachu commented Jan 23, 2025

The base branch is still bf59539 , so the associated test seems to be fail. (experimental flag)
Should I rebase the branch and force push?

@lpinca
Copy link
Member

lpinca commented Jan 23, 2025

Should I rebase the branch and force push?

Yes, thank you.

@yamachu yamachu force-pushed the fix-windows-path-string branch from 0697f02 to cba8830 Compare January 23, 2025 12:57
@lpinca lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Jan 23, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jan 23, 2025
@nodejs-github-bot
Copy link
Collaborator

@yamachu
Copy link
Contributor Author

yamachu commented Jan 23, 2025

I had to rush to write my comments before work.
I apologize if any of my comments offended you.

I have rebased and force pushed the file, so please check with CI.

I would have liked to have been able to do this beforehand, but I looked over the entire code.
There I found the following helper functions.

node/src/node_file.cc

Lines 3149 to 3160 in d978610

std::wstring ConvertToWideString(const std::string& str) {
int size_needed = MultiByteToWideChar(
CP_UTF8, 0, &str[0], static_cast<int>(str.size()), nullptr, 0);
std::wstring wstrTo(size_needed, 0);
MultiByteToWideChar(CP_UTF8,
0,
&str[0],
static_cast<int>(str.size()),
&wstrTo[0],
size_needed);
return wstrTo;
}

This was exactly what I was looking for.
I rewrote what I wrote in this PR and added the CP settings and the test passed(on SJIS Windows environment).

@@ -326,6 +330,22 @@ const BindingData::PackageConfig* BindingData::TraverseParent(
return nullptr;
}

#ifdef _WIN32
std::wstring ConvertToWideString(const std::string& str) {
auto cp = GetACP();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section was copied from node_file.cc, but this GetACP is particularly important.
As shown in the description, when the Windows locale is changed and executed, if the original UTF-8 code is left here, the strings cannot be handled properly.

Copy link
Member

@lpinca lpinca Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this utility function can be moved to util-inl.h? so that the same implementation (taking into account GetACP) is used both here and in node_file.cc without duplication. I think this is also what @anonrig is requesting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do the same for node_fs, test-fs-cp.mjs will falil...

It is very strange, but I have found that if I run the failed path(utf/~~~) in addition to the test-require-~ , it can be handled correctly.
It may be that the pre or post-process (if there is one) regarding path generation in node_fs is having an effect.

Therefore, this time I confined the implementation within node_modules instead of applying it to the whole system.

The following is the error log when applied to the entire system.
https://github.com/yamachu/node/actions/runs/12941644792/job/36098035649?pr=3#step:7:5033

code diff
yamachu@0041793

Copy link
Member

@lpinca lpinca Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about making cp a function parameter and use CP_UTF8 in node_file.cc and GetACP() here. Does it make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see.
Do you mean to make these changes?

util-inl.h

#ifdef _WIN32
inline std::wstring ConvertToWideString(const std::string& str, UINT cp) {
  int size_needed = MultiByteToWideChar(
      cp, 0, &str[0], static_cast<int>(str.size()), nullptr, 0);
  std::wstring wstrTo(size_needed, 0);
  MultiByteToWideChar(
      cp, 0, &str[0], static_cast<int>(str.size()), &wstrTo[0], size_needed);
  return wstrTo;
}
#endif  // _WIN32

node_file.cc

#define BufferValueToPath(str)                                                 \
  std::filesystem::path(ConvertToWideString(str.ToString(), CP_UTF8))

I thought that this implementation would certainly have less impact and could be implemented with less code duplication.

Copy link
Member

@lpinca lpinca Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly. FWIW, I'm also ok with this as is.

Copy link
Contributor Author

@yamachu yamachu Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

apply 👍 : ef0ade1

@yamachu
Copy link
Contributor Author

yamachu commented Jan 23, 2025

Copy link
Member

@lpinca lpinca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RSLGTM

@lpinca
Copy link
Member

lpinca commented Jan 23, 2025

Can you please squash the last 3 commits into the second or everything into one?

@yamachu yamachu force-pushed the fix-windows-path-string branch from 24ac7b9 to ea943d8 Compare January 24, 2025 04:07
@yamachu
Copy link
Contributor Author

yamachu commented Jan 24, 2025

I squashed and added a test case for the newly found problem.

@yamachu yamachu force-pushed the fix-windows-path-string branch from ea943d8 to 4ddd7ae Compare January 24, 2025 09:33
@yamachu yamachu force-pushed the fix-windows-path-string branch from 4ddd7ae to ef0ade1 Compare January 24, 2025 10:05
@lpinca lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Jan 24, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jan 24, 2025
@nodejs-github-bot
Copy link
Collaborator

@lpinca
Copy link
Member

lpinca commented Jan 24, 2025

Can you please squash (last commit) again?

take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.
@yamachu yamachu force-pushed the fix-windows-path-string branch from 29a6c75 to 76bd5d1 Compare January 24, 2025 18:50
@yamachu
Copy link
Contributor Author

yamachu commented Jan 24, 2025

Squashed and force-pushed 👍

@lpinca lpinca added the request-ci Add this label to start a Jenkins CI on a PR. label Jan 24, 2025
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Jan 24, 2025
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@lpinca lpinca added commit-queue-rebase Add this label to allow the Commit Queue to land a PR in several commits. commit-queue Add this label to land a pull request using GitHub Actions. labels Jan 25, 2025
@nodejs-github-bot nodejs-github-bot added commit-queue-failed An error occurred while landing this pull request using GitHub Actions. and removed commit-queue Add this label to land a pull request using GitHub Actions. labels Jan 25, 2025
@nodejs-github-bot
Copy link
Collaborator

Commit Queue failed
- Loading data for nodejs/node/pull/56696
✔  Done loading data for nodejs/node/pull/56696
----------------------------------- PR info ------------------------------------
Title      src: fix generation of path objects in Windows (#56696)
Author     Yusuke Yamada <[email protected]> (@yamachu, first-time contributor)
Branch     yamachu:fix-windows-path-string -> nodejs:main
Labels     c++, needs-ci, commit-queue-rebase
Commits    2
 - test: added test that uses multibyte for path and resolves modules
 - src: fix to generate path from wchar_t via wstring.
Committers 1
 - yamachu <[email protected]>
PR-URL: https://github.com/nodejs/node/pull/56696
Fixes: https://github.com/nodejs/node/issues/56650
Refs: https://github.com/nodejs/node/pull/56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/56696
Fixes: https://github.com/nodejs/node/issues/56650
Refs: https://github.com/nodejs/node/pull/56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
--------------------------------------------------------------------------------
   ℹ  This PR was created on Wed, 22 Jan 2025 11:36:15 GMT
   ✔  Approvals: 3
   ✔  - James M Snell (@jasnell) (TSC): https://github.com/nodejs/node/pull/56696#pullrequestreview-2567707728
   ✔  - Yagiz Nizipli (@anonrig) (TSC): https://github.com/nodejs/node/pull/56696#pullrequestreview-2573055602
   ✔  - Luigi Pinca (@lpinca): https://github.com/nodejs/node/pull/56696#pullrequestreview-2573469880
   ✔  Last GitHub CI successful
   ℹ  Last Full PR CI on 2025-01-25T06:27:37Z: https://ci.nodejs.org/job/node-test-pull-request/64736/
- Querying data for job/node-test-pull-request/64736/
   ✔  Last Jenkins CI successful
--------------------------------------------------------------------------------
   ✔  No git cherry-pick in progress
   ✔  No git am in progress
   ✔  No git rebase in progress
--------------------------------------------------------------------------------
- Bringing origin/main up to date...
From https://github.com/nodejs/node
 * branch                  main       -> FETCH_HEAD
✔  origin/main is now up-to-date
- Downloading patch for 56696
From https://github.com/nodejs/node
 * branch                  refs/pull/56696/merge -> FETCH_HEAD
✔  Fetched commits as 4a5d2c7538b4..76bd5d181020
--------------------------------------------------------------------------------
[main a02abddc1f] test: added test that uses multibyte for path and resolves modules
 Author: yamachu <[email protected]>
 Date: Wed Jan 22 20:21:46 2025 +0900
 2 files changed, 27 insertions(+)
 create mode 100644 "test/fixtures/copy/utf/\346\226\260\345\273\272\346\226\207\344\273\266\345\244\271/experimental.json"
 create mode 100644 test/parallel/test-module-create-require-multibyte.js
Auto-merging src/node_modules.cc
[main beb75f0ce4] src: fix to generate path from wchar_t via wstring.
 Author: yamachu <[email protected]>
 Date: Wed Jan 22 20:24:38 2025 +0900
 3 files changed, 37 insertions(+), 18 deletions(-)
   ✔  Patches applied
There are 2 commits in the PR. Attempting autorebase.
Rebasing (2/4)
Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
test: added test that uses multibyte for path and resolves modules

PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>

[detached HEAD b6ab6e15c2] test: added test that uses multibyte for path and resolves modules
Author: yamachu <[email protected]>
Date: Wed Jan 22 20:21:46 2025 +0900
2 files changed, 27 insertions(+)
create mode 100644 "test/fixtures/copy/utf/\346\226\260\345\273\272\346\226\207\344\273\266\345\244\271/experimental.json"
create mode 100644 test/parallel/test-module-create-require-multibyte.js
Rebasing (3/4)
Rebasing (4/4)
Executing: git node land --amend --yes
--------------------------------- New Message ----------------------------------
src: fix to generate path from wchar_t via wstring.

take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>

[detached HEAD e0bf5cb157] src: fix to generate path from wchar_t via wstring.
Author: yamachu <[email protected]>
Date: Wed Jan 22 20:24:38 2025 +0900
3 files changed, 37 insertions(+), 18 deletions(-)
Successfully rebased and updated refs/heads/main.

✔ b6ab6e15c2a0558406cc78dfaab7c2515486c098
✔ 0:0 no Co-authored-by metadata co-authored-by-is-trailer
✔ 2:7 Valid fixes URL. fixes-url
✔ 0:0 blank line after title line-after-title
✔ 0:0 line-lengths are valid line-length
✔ 0:0 metadata is at end of message metadata-end
✔ 1:8 PR-URL is valid. pr-url
✔ 0:0 reviewers are valid reviewers
✔ 0:0 valid subsystems subsystem
✔ 0:0 Title is formatted correctly. title-format
⚠ 0:50 Title should be <= 50 columns. title-length
✖ e0bf5cb1579741228240a7b44b6cca0f490a2dbb
✔ 0:0 no Co-authored-by metadata co-authored-by-is-trailer
✔ 5:7 Valid fixes URL. fixes-url
✔ 0:0 blank line after title line-after-title
✔ 0:0 line-lengths are valid line-length
✔ 0:0 metadata is at end of message metadata-end
✔ 4:8 PR-URL is valid. pr-url
✔ 0:0 reviewers are valid reviewers
✔ 0:0 valid subsystems subsystem
⚠ 0:50 Title should be <= 50 columns. title-length
✖ 0:51 Do not use punctuation at end of title. title-format

ℹ Please fix the commit message and try again.
Please manually ammend the commit message, by running
git commit --amend
Once commit message is fixed, finish the landing command running
git node land --continue

https://github.com/nodejs/node/actions/runs/12962955769

lpinca pushed a commit that referenced this pull request Jan 25, 2025
PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
lpinca pushed a commit that referenced this pull request Jan 25, 2025
Take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
@lpinca
Copy link
Member

lpinca commented Jan 25, 2025

Landed in 97a3a82...1760024.

@lpinca lpinca closed this Jan 25, 2025
@yamachu yamachu deleted the fix-windows-path-string branch January 25, 2025 16:08
aduh95 pushed a commit that referenced this pull request Jan 27, 2025
PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
aduh95 pushed a commit that referenced this pull request Jan 27, 2025
Take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
sevenc-nanashi pushed a commit to sevenc-nanashi/node that referenced this pull request Jan 29, 2025
Take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: nodejs#56696
Fixes: nodejs#56650
Refs: nodejs#56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
sevenc-nanashi pushed a commit to sevenc-nanashi/node that referenced this pull request Jan 29, 2025
PR-URL: nodejs#56696
Fixes: nodejs#56650
Refs: nodejs#56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
aduh95 pushed a commit that referenced this pull request Jan 30, 2025
PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
aduh95 pushed a commit that referenced this pull request Jan 30, 2025
Take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
hvanness pushed a commit to hvanness/node that referenced this pull request Jan 30, 2025
PR-URL: nodejs#56696
Fixes: nodejs#56650
Refs: nodejs#56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
hvanness pushed a commit to hvanness/node that referenced this pull request Jan 30, 2025
Take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: nodejs#56696
Fixes: nodejs#56650
Refs: nodejs#56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
aduh95 pushed a commit that referenced this pull request Jan 31, 2025
PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
aduh95 pushed a commit that referenced this pull request Jan 31, 2025
Take a similar approach to node_file and allow the creation of paths
code point must be specified to convert from wchar_t to utf8.

PR-URL: #56696
Fixes: #56650
Refs: #56657
Reviewed-By: James M Snell <[email protected]>
Reviewed-By: Yagiz Nizipli <[email protected]>
Reviewed-By: Luigi Pinca <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c++ Issues and PRs that require attention from people who are familiar with C++. commit-queue-failed An error occurred while landing this pull request using GitHub Actions. commit-queue-rebase Add this label to allow the Commit Queue to land a PR in several commits. needs-ci PRs that need a full CI run.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Segmentation Fault When Passing Paths with Japanese Characters to createRequire in Node.js 22+
5 participants