-
Notifications
You must be signed in to change notification settings - Fork 195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[wgsl-in] Overhaul number lexing / parsing #1863
Conversation
I wish edit: filed lifthrasiir/hexf#22 |
@teoxoy Have you profiled the effect of using regular expressions to parse numeric literals? You can get some big shaders from the sources given in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for improving the number parsing!
We are generally trying hard to keep the code simple and dependencies to the minimum. So involving regex
should come with a good justification. I suppose in case of your PR the justification looks like this:
- it fixes a bunch of cases
- WGSL spec uses regular expressions already, so we can encode it more directly
If that sounds right to you, please consider building regexes in a non-lazy way.
Regarding the perf impact, I've done some benchmarking using the criterion benchmarks we have in the repo and the regex approach seems to be around 60% slower. Tbh I didn't expect the difference to be this large...
Indeed, however the perf impact is concerning. I'll experiment with the manual state machine we had before and see how it goes. |
@teoxoy One comment in my review (I'll finish it soon) was, I wonder if the regexps ending in |
The Progress on the manual state machine: it's giving me a headache 😅. Do you guys know of other ways that we could optimize the regex approach? |
tl;dr: I don't think so, unfortunately. I think it would have to entail using some sort of build-time code generator. We've generally shied away from those because of the build-time impact, since that affects developers constantly ( I've also noticed a tendency for projects to move away from code generators in favor of hand-written parsers. This is pretty disappointing to me. The reason seems to be that generators have to get a whole range of things Just Right to be competitive with hand-written code:
Generators let you write your code at a higher level, but coders have this "that's hard, but I could do it" mindset that undervalues that. Generators generate more correct code, but the use of safe Rust makes this less crucial than it used to be. |
3441dd6
to
ff6f54b
Compare
I rewrote the number parser and removed the regex dependency. The perf impact of this PR is now in the range of -2% to -5% from my limited benchmarking. I think it should be good to go now. On another note, we now only have 2 instances where we need to create a new
I think instead of pushing for a new API for hexf (i.e. |
ping @jimblandy (since it seems I can't re-request a review) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty intricate. Thanks for taking this on!
I have a few changes I'd like to see and then re-review.
@jimblandy thanks for the review and suggestions - they were great! I addressed the comments, simplified some things further and added a few more docs (I know you like those 😉). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really nice. Just a few more points I'd like to see addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just the one remaining change (in a reply to your comment) about the return type of parse
.
@teoxoy Since I had misunderstood some bits about the behavior of |
@jimblandy feel free to push the change; I think the reasoning for it makes sense 👍 |
Bring the lexer's parsing of numeric literals in line with the WGSL specification as of 86a23b83 (2022-05-10).
b9ee4c8
to
8d03f0f
Compare
Updated number literal format and behavior to latest spec version.
Reduced the number errors to the following errors:
(the span of the invalid error can't be more accurate than the first character since we don't know where it ends)
fixes #1843
gets us closer to gpuweb/gpuweb#2227
Related
gpuweb/gpuweb#2762
gpuweb/gpuweb#2769