path: fix unicode path problems in path.relative #27644
path/relative-win32.js n=100000 paths='C:\\\\|D:\\\\' *** -76.13 % ±3.75% ±5.05% ±6.71%
path/relative-win32.js n=100000 paths='C:\\\\foo\\\\bar\\\\baz|C:\\\\foo\\\\bar\\\\baz' 4.74 % ±5.08% ±6.76% ±8.80%
path/relative-win32.js n=100000 paths='C:\\\\foo\\\\BAR\\\\BAZ|C:\\\\foo\\\\bar\\\\baz' 0.01 % ±2.56% ±3.41% ±4.44%
path/relative-win32.js n=100000 paths='C:\\\\foo\\\\bar\\\\baz\\\\quux|C:\\\\' *** -70.69 % ±1.98% ±2.67% ±3.54%
path/relative-win32.js n=100000 paths='C:\\\\orandea\\\\test\\\\aaa|C:\\\\orandea\\\\impl\\\\bbb' *** -64.37 % ±1.53% ±2.04% ±2.68%
a040b78 to 2c08292
Hey @mscdex, I've removed the regex replace statements and made sure splits and joins are executed as lazily as possible. I've also executed benchmarks on my machine ( |
2c08292 to 7423125
I've kept the two necessary |
Hm, here’s a new benchmark CI run: https://ci.nodejs.org/view/Node.js%20benchmark/job/benchmark-node-micro-benchmarks/369/ I think some performance impact is okay if it’s a correctness issue, but of course it’s nice to keep that minimal… |
7423125 to 9f859a5
Hey @addaleax thank you for the benchmark. Results look better and I was able to remove another |
16:38:11 confidence improvement accuracy (*) (**) (***)
16:38:11 path/relative-win32.js n=100000 paths='C:\\\\|D:\\\\' -1.82 % ±4.66% ±6.20% ±8.07%
16:38:11 path/relative-win32.js n=100000 paths='C:\\\\foo\\\\bar\\\\baz|C:\\\\foo\\\\bar\\\\baz' 1.82 % ±4.78% ±6.37% ±8.29%
16:38:11 path/relative-win32.js n=100000 paths='C:\\\\foo\\\\BAR\\\\BAZ|C:\\\\foo\\\\bar\\\\baz' * -2.73 % ±2.50% ±3.33% ±4.33%
16:38:11 path/relative-win32.js n=100000 paths='C:\\\\foo\\\\bar\\\\baz\\\\quux|C:\\\\' *** -41.19 % ±1.45% ±1.93% ±2.51%
16:38:11 path/relative-win32.js n=100000 paths='C:\\\\orandea\\\\test\\\\aaa|C:\\\\orandea\\\\impl\\\\bbb' *** -37.46 % ±1.53% ±2.05% ±2.69%
16:38:11
16:38:11 Be aware that when doing many comparisons the risk of a false-positive
16:38:11 result increases. In this case there are 5 comparisons, you can thus
16:38:11 expect the following amount of false-positive results:
16:38:11 0.25 false positives, when considering a 5% risk acceptance (*, **, ***),
16:38:11 0.05 false positives, when considering a 1% risk acceptance (**, ***),
16:38:11 0.01 false positives, when considering a 0.1% risk acceptance (***) |
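The expected false-positive counts quoted in the log are the number of comparisons multiplied by the accepted risk level. A small sketch of that arithmetic follows; the function name `expectedFalsePositives` is hypothetical and this is not the benchmark tooling's own code.

```javascript
'use strict';

// Hypothetical helper: expected number of false positives when running
// `comparisons` tests, each at significance level `risk`.
function expectedFalsePositives(comparisons, risk) {
  return comparisons * risk;
}

const comparisons = 5; // five benchmark configurations in the log
const riskLevels = [0.05, 0.01, 0.001]; // (*), (**), (***)
const expected = riskLevels.map((risk) =>
  expectedFalsePositives(comparisons, risk));

// 5 * 0.05 = 0.25 and 5 * 0.01 = 0.05, matching the log directly.
// 5 * 0.001 = 0.005, which the log reports rounded to two decimals.
for (let i = 0; i < riskLevels.length; i++) {
  console.log(`risk ${riskLevels[i]}: ${expected[i]}`);
}
```

With only five comparisons, even the loosest threshold is expected to yield well under one spurious result, so the two `***` regressions are unlikely to be noise.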
I'm OK with a performance hit on edge cases if it's for correctness. We can always improve performance subsequently. |
can we have one or more reviews on this please? |
8ae28ff to 2935f72
This will need to be rebased in order to move forward |
9f859a5 to e89c814
@jasnell Done! |
e89c814 to 9acdfcd
Current benchmark results still show significant regressions:
|
Thank you all guys for your kindness and respect towards a first-time contributor! I'm afraid I currently don't have time to investigate any possible performance improvements (if there are any left). It's been more than a year and there's already a better fix in #27662 by @mscdex. So you should probably go with that. |
Closed as per comment above (#27644 (comment)). Feel free to reopen. |
This commit changes the way two paths are compared in path.relative.
Previously, each char code in the path strings was compared one by one,
which causes problems when the number of char codes in the lowercased
path string does not match the original (e.g. when the path contains
certain Unicode characters like 'İ'). It now splits the path strings by
backslash and compares the parts instead.
Fixes: #27534
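A minimal sketch of the problem and of the segment-based comparison described above. This is not the actual patch: the helper names `commonSegments` and `relativeBySegments` are hypothetical, and the sketch assumes both inputs are already-normalized absolute Windows paths on the same drive with no trailing backslash.

```javascript
'use strict';

// 'İ' (U+0130) lowercases to 'i' followed by U+0307 (combining dot above),
// so the lowercased string has more UTF-16 code units than the original.
// Indexes computed on the lowercased string then no longer line up with
// the original string.
const original = 'C:\\İ\\a';
const lowered = original.toLowerCase();
const lengthsDiffer = original.length !== lowered.length;

// Hypothetical helper: count the leading path segments two paths share,
// comparing whole lowercased segments instead of individual char codes.
function commonSegments(from, to) {
  const fromParts = from.toLowerCase().split('\\');
  const toParts = to.toLowerCase().split('\\');
  const limit = Math.min(fromParts.length, toParts.length);
  let i = 0;
  while (i < limit && fromParts[i] === toParts[i]) i++;
  return i;
}

// Hypothetical helper: build the relative path from the original-case
// segments, so no index from the lowercased string is ever reused.
function relativeBySegments(from, to) {
  const shared = commonSegments(from, to);
  const up = from.split('\\').length - shared;
  const down = to.split('\\').slice(shared);
  return [...Array(up).fill('..'), ...down].join('\\');
}

console.log(lengthsDiffer);
console.log(relativeBySegments('C:\\İ\\a', 'C:\\İ\\b'));
```

Because segment counts are unaffected by lowercasing (the backslash has no case mapping), the number of shared segments can be applied safely to the original strings.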
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes