refactor(es/parser): Split parser into also-lex/parse-only#10399
refactor(es/parser): Split parser into also-lex/parse-only#10399kdy1 merged 84 commits intoswc-project:mainfrom
Conversation
🦋 Changeset detectedLatest commit: 3db38aa The changes in this PR will be included in the next version bump. Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
CodSpeed Performance ReportMerging #10399 will create unknown performance changesComparing Summary
Benchmarks breakdown
|
2c80e1a to
c2b1b12
Compare
|
The performance gains demonstrate a compelling case for implementation. Let me explain my idea:
@kdy1 Do you agree with these changes? If so, I will fix CI and make it ready for review. |
|
I have seen the two lexers and two parsers in |
Yes, I think so. Perhaps we can find a better way to integrate it. |
|
Why do we need two separate parser/lexer/token types? |
The Additionally, the
Since the lexical analysis implementations differ,
Using a single parser to handle different lexical logic introduces unnecessary development complexity. While I've separated them for now, I believe there should be a better way to consolidate this logic. |
I'm not sure why |
I guess this is the root cause of the splitting. But I'm not too clear about it. Could you please provide more detailed examples? |
|
|
Since both of |
I understood, but I think the gain is too small if we have to maintain two tokens/lexers/parsers. My guess is that #10397 is the sweet point, but not sure. |
The
I suggest considering landing this change for three reasons:
|
I'm not too clear about
Do you mean copy the new Lexer and Parser, optimize on top of the new ones, and replace the original ones once they are completed? If not, why can't do so? |
Consider the tokenization of
The primary purpose of (Alternatively, we might consider eventually removing
I'm having difficulty understanding the intended meaning here. Could you please clarify? |
Learnt, and I agree this is a very good idea.
I see. Does this mean
Like |
Certainly not - the current token and parser state combination is sufficient for producing a correct AST. This approach aligns with many established parser implementations, including TypeScript's.
I agree with this perspective in principle, but I don't believe creating a separate crate would actually reduce the workload. My primary goal is to optimize To summarize these changes:
|
|
Ok, I got the idea and I like it. Would it be possible to use a single struct with const generic or generic to make maintenance easy? I'm worried about the maintenance burden. |
Agree. I expect finally we should eliminate all duplication and get a clear lexer and a clear parser (just like now, and faster), but we should still keep the code in the process as maintainable as possible, which also bring less burden for bug fixing. |
|
If it's inevitable at the moment, I think we can merge this PR as is and refactor these with other PRs, but I think we should at least have concrete plan for future deduplication. |
I am also worried about the maintenance cost: At present, it seems troublesome to abstract
What do you think? |
|
It sounds acceptable for now, but I want to hear opinion from @CPunisher, too. |
### What? Update swc_core to v26 ### Why? swc-project/swc#10399 made the parser 7% faster
This constitutes a substantial code revision. We should first evaluate whether it yields measurable performance enhancements.