Switch from std::regex to boost::regex #7762
|
I guess we should merge this after the |
thufschmitt
left a comment
Nice :)
Can you confirm that the closure size doesn't increase too much (with a nixpkgs that has the enableIcu patch)?
Other than that, I agree with Eelco: let's just wait for enableIcu to land in nixpkgs (can you link to the PR for that?), and then we can merge.
|
enableIcu is NixOS/nixpkgs#205166, but not backported to 22.11. Sizes: |
|
(answering #7336 (comment) here since it's where the @SuperSandro2000 that's a very valid point. @yorickvP do you think you could benchmark this quickly to see whether it has an impact? Something like |
|
boost::regex: std::regex: The above also tests linking time. If compilation speed is a consideration, I'd suggest going with PCRE ;). |
|
I think a 2s difference is quite OK. I benchmarked the time of a full |
|
The difference would be much bigger if boost were fully removed, which is a design decision completely out of scope for this PR. See https://www.factorio.com/blog/post/fff-206#:~:text=step%203%20-%20getting%20rid%20of%20boost for an example and some stats. |
|
@SuperSandro2000 Factorio looks like it was using a lot more boost than Nix. It might be worth it to remove boost::lexical_cast from util.hh or boost::format from logging, but the rest of the boost usage seems to be very contained and not likely to matter much for compile times. I don't predict more than a 10% compile time increase if we replace it. |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/tweag-nix-dev-update-44/25546/1 |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2023-04-10-nix-team-meeting-minutes-47/27357/1 |
|
Any reason why this never got merged? |
|
Needed to wait for enableIcu to land on the boost dependency, which I think is the case now. |
|
Revisited in Nix team meeting:
@yorickvP since the preconditions are met, can you rebase? |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2023-10-27-nix-team-meeting-minutes-98/34695/1 |
d06b144 to d2f5e26
|
Rebased! |
|
Thanks, @yorickvP! Set to auto-merge since it was already approved in principle. |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/2023-11-27-nix-team-meeting-minutes-107/36112/1 |
|
This broke After: |
NixOS/nix#7762 switched `nix` to use the `boost::regex` implementation, which expects that all literal braces are escaped. That is, the regex `\{crate}` no longer parses.

We modify `lib.escapeRegex` to accommodate this change. Fortunately, the old regex implementation doesn't mind if closing braces are escaped as well, so this is backwards compatible and we don't need to worry about version-gating it.

Without this patch:

Old regex implementation:

    $ nix repl nixpkgs
    Welcome to Nix 2.18.1. Type :? for help.

    Loading installable 'flake:nixpkgs#'...
    Added 5 variables.
    nix-repl> builtins.match (lib.escapeRegex "{}") "{}"
    [ ]

`boost::regex` implementation:

    $ ~/nix/result/bin/nix repl nixpkgs
    Welcome to Nix 2.20.0pre20231130_dirty. Type :? for help.

    Loading installable 'flake:nixpkgs#'...
    Added 5 variables.
    nix-repl> builtins.match (lib.escapeRegex "{}") "{}"
    error:
           … while calling the 'match' builtin

             at «string»:1:1:

                1| builtins.match (lib.escapeRegex "{}") "{}"
                 | ^

           error: invalid regular expression '\{}'
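The idea behind the `lib.escapeRegex` change (escape closing braces as well as opening ones) can be illustrated with a small C++ sketch. This is not the nixpkgs implementation: `escapeRegexSketch`, `matchesItself`, and `demoEscapeRegex` are hypothetical names, and the sketch assumes the default ECMAScript grammar of `std::regex` purely to stay portable, which is not necessarily the grammar Nix uses for `builtins.match`.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <string_view>

// Hypothetical helper (not the nixpkgs code): put a backslash in front of
// every character a regex engine may treat as special, including BOTH braces.
std::string escapeRegexSketch(std::string_view literal)
{
    constexpr std::string_view special = R"(\^$.|?*+()[]{})";
    std::string out;
    out.reserve(literal.size() * 2);
    for (char c : literal) {
        if (special.find(c) != std::string_view::npos)
            out += '\\';
        out += c;
    }
    return out;
}

// True if `literal`, once escaped, matches itself as a whole string.
// Assumption: default ECMAScript grammar, chosen only for portability.
bool matchesItself(std::string_view literal)
{
    std::regex re(escapeRegexSketch(literal));
    return std::regex_match(std::string(literal), re);
}

// Usage: the closing brace is escaped too, and the pattern still matches.
void demoEscapeRegex()
{
    assert(escapeRegexSketch("{}") == R"(\{\})");
    assert(matchesItself("{crate}"));
}
```

Escaping both braces unconditionally keeps the helper independent of which engine ends up compiling the pattern, which is the backwards-compatibility argument made in the commit message above.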
|
This should be reverted; I opened #9508 to do that. |
|
Thanks for the revert, @infinisil! Could this have been caught earlier? Is there a nixpkgs lib test suite that we can use? (We already test that the output of |
|
We have |
|
Hrm, sad. I hope it just needs |
Oh, thanks, I missed that one. |
|
This avoids C++'s standard library regexes, which aren't the same across platforms, and have many other issues, like using stack so much that they stack overflow when processing a lot of data.

To avoid backwards and forward compatibility issues, regexes are processed using a function converting libstdc++ regexes into Boost regexes, escaping characters that Boost needs to have escaped, and rejecting features that Boost has and libstdc++ doesn't.

Related context:

- Original failed attempt to use `boost::regex` in CppNix, failed due to boost icu dependency being large (disabling ICU is no longer necessary because linking ICU requires using a different header file, `boost/regex/icu.hpp`): NixOS/nix#3826
- An attempt to use PCRE, rejected due to providing less backwards compatibility with `std::regex` than `boost::regex`: NixOS/nix#7336
- Second attempt to use `boost::regex`, failed due to `}` regex failing to compile (dealt with by writing a wrapper that parses a regular expression and escapes `}` characters): NixOS/nix#7762

Closes #34.
Closes #476.

Change-Id: Ieb0eb9e270a93e4c7eed412ba4f9f96cb00a5fa4
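The conversion step described above (escaping characters that Boost needs to have escaped) might look roughly like the following. This is a minimal sketch, not the actual wrapper: `escapeStrayClosingBraces` and `demoEscapeStrayBraces` are hypothetical names, it only handles the `}` case mentioned in the related context, it treats bracket expressions in a simplified way, and it omits the part that rejects Boost-only features.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical sketch: copy a pattern, adding a backslash before any `}`
// that does not close an interval opened by an unescaped `{`.
std::string escapeStrayClosingBraces(std::string_view pattern)
{
    std::string out;
    out.reserve(pattern.size() + 4);
    bool inInterval = false; // after an unescaped `{`, before its `}`
    std::size_t i = 0;
    while (i < pattern.size()) {
        char c = pattern[i];
        if (c == '\\' && i + 1 < pattern.size()) {
            // Already-escaped character: copy the pair untouched.
            out += c;
            out += pattern[i + 1];
            i += 2;
            continue;
        }
        if (c == '[') {
            // Simplification: copy a bracket expression verbatim. A `]`
            // directly after `[` or `[^` is a member, not the terminator.
            std::size_t j = i + 1;
            if (j < pattern.size() && pattern[j] == '^') ++j;
            if (j < pattern.size() && pattern[j] == ']') ++j;
            while (j < pattern.size() && pattern[j] != ']') ++j;
            if (j < pattern.size()) ++j; // include the closing `]`
            out.append(pattern.substr(i, j - i));
            i = j;
            continue;
        }
        if (c == '{') {
            inInterval = true;
        } else if (c == '}') {
            if (inInterval)
                inInterval = false; // closes a real interval: keep as-is
            else
                out += '\\';        // stray brace: escape it
        }
        out += c;
        ++i;
    }
    return out;
}

// Usage: intervals are left alone, stray closing braces are escaped.
void demoEscapeStrayBraces()
{
    assert(escapeStrayClosingBraces("a{2,3}") == "a{2,3}");
    assert(escapeStrayClosingBraces(R"(\{})") == R"(\{\})");
}
```

Rewriting the pattern before it reaches the engine keeps existing expressions such as `\{}` working without asking users to change them, which appears to be the compatibility goal stated in the commit message.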
Motivation
Requested in #7336
Context
Fixes #2147, fixes #4758
See also #3826.
s/std::regex/regex/g, added aliases to enable switching back and forth in the future.
Checklist for maintainers
Maintainers: tick if completed or explain if not relevant
- tests/**.sh
- src/*/tests
- tests/nixos/*