feat(parser): GritQL parser #1998
✅ Deploy Preview for biomejs ready!
CodSpeed Performance Report: merging #1998 will not alter performance.
I will review this eventually. Can you help us and explain how we should review the PR? How does error recovery work? What's the encoding of grit files? Do we handle UTF-16 strings?
Yeah, I'm finishing up the precedence rules, then I'll document it a bit more. AFAIK, it's all UTF-8 only :)
Okay, precedence rules seem to work. I've updated the PR description to ask for guidance so I can move it over the finish line.
It's really weird that grit files are UTF-8, while JavaScript is WTF-16 (https://simonsapin.github.io/wtf-8/#ill-formed-utf-16). This means that we have already found the first limitation for possible extensions/plugins for Biome.
I don't fully understand this comment. JavaScript doesn't specify an explicit encoding for JavaScript files, although I've yet to encounter the first one in my life that wasn't encoded with UTF-8. JavaScript uses UTF-16 (WTF-16) for its internal string object representation, but that only affects runtimes, not parsers.
For the record, we never explicitly chose an encoding for .grit files either. I'm not sure why this would be a limitation.
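To make the encoding point concrete, here is a small Rust sketch that is not taken from the PR. It assumes a parser front end that decodes file bytes as UTF-8, and it shows that ill-formed UTF-16 (a lone surrogate) can only come from a runtime string value, not from the source text of an escape such as `"\uD800"`. Both function names are hypothetical.

```rust
/// Decodes raw file bytes the way a UTF-8-only parser front end might.
/// Hypothetical helper for illustration; not part of Biome.
fn decode_source(bytes: &[u8]) -> Result<&str, std::str::Utf8Error> {
    std::str::from_utf8(bytes)
}

/// Converts UTF-16 code units (the representation a JS runtime uses for
/// string values) into a Rust `String`. Returns `None` when the code units
/// are ill-formed, for example when they contain a lone surrogate.
fn from_js_code_units(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}
```

The source text of a JavaScript literal like `"\uD800"` is plain ASCII, so it decodes fine as UTF-8; only the runtime value it evaluates to is ill-formed UTF-16.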
The PR is very long and it will take a while to look at it. I wish it had been broken down into smaller pieces.
I left some comments; here is a summary of the things you should look into:
- Diagnostics: we can do better, and we can provide better messages to help users fix possible grammar errors. Some errors can leverage the internal APIs to provide better UI in those messages.
- Lists: you need to use the infra provided by `biome_parser`, e.g. `ParseNodeList` or `ParseSeparatedList`. The CSS parser uses it, for example.
- Bogus: unfortunately I couldn't review much around parsing, but it seems that we aren't using the bogus nodes defined in the grammar. An example is `GritBogusDefinition`, which I would have expected to find in `definitions.rs` or wherever we parse the definitions (I didn't find `GRIT_BOGUS_DEFINITION`).
- Error cases: I think we should add more error cases, especially cases that prove the error recovery works.
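The bogus-node point can be illustrated with a toy parser. This is a minimal sketch, not Biome's implementation: `Token`, `Node`, and `parse_name_list` are invented for illustration and do not use `biome_parser`'s `ParseNodeList` or `ParseSeparatedList`. It shows the general idea of wrapping unparseable tokens in a bogus node and resuming at the next separator, so that later list items still parse.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Ident(String),
    Comma,
    Junk(String),
}

#[derive(Debug, PartialEq)]
enum Node {
    Name(String),
    /// Wraps the tokens the parser could not make sense of.
    Bogus(Vec<Token>),
}

/// Parses `name, name, ...`. A run of tokens starting with junk becomes a
/// single `Node::Bogus` plus one diagnostic, and parsing resumes at the next
/// comma. Separators are skipped without validation to keep the sketch short.
fn parse_name_list(tokens: &[Token]) -> (Vec<Node>, Vec<String>) {
    let mut nodes = Vec::new();
    let mut diagnostics = Vec::new();
    let mut i = 0;
    while i < tokens.len() {
        match &tokens[i] {
            Token::Ident(name) => {
                nodes.push(Node::Name(name.clone()));
                i += 1;
            }
            Token::Comma => i += 1,
            Token::Junk(_) => {
                // Recovery: swallow everything up to the next separator.
                let start = i;
                while i < tokens.len() && tokens[i] != Token::Comma {
                    i += 1;
                }
                nodes.push(Node::Bogus(tokens[start..i].to_vec()));
                diagnostics.push(format!("expected a name at token {start}"));
            }
        }
    }
    (nodes, diagnostics)
}
```

The property a reviewer would look for in snapshot tests is the same as here: the invalid region is contained in one bogus node, and the items after it are still parsed normally.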
    diagnostics: Vec<ParseDiagnostic>,
}

impl<'src> Lexer<'src> {
We should use the `Lexer` trait that is provided by `biome_parser`.
Using it would help us make all the lexers consistent, or at least give them all the same methods.
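As a rough illustration of why a shared trait keeps lexers uniform, here is a hedged sketch. `SharedLexer` and `GritLikeLexer` are hypothetical names, and the method set is invented; the actual trait in `biome_parser` has a different shape. The point is only that each lexer implements a few required methods and inherits the rest as default methods.

```rust
/// Hypothetical shared interface; the real `biome_parser` trait differs.
trait SharedLexer<'src> {
    // Required methods: the only per-language plumbing.
    fn source(&self) -> &'src str;
    fn position(&self) -> usize;
    fn advance(&mut self, bytes: usize);

    // Default methods: every implementor gets these for free.
    fn is_eof(&self) -> bool {
        self.position() >= self.source().len()
    }

    fn current_byte(&self) -> Option<u8> {
        self.source().as_bytes().get(self.position()).copied()
    }

    /// Skips ASCII whitespace and returns how many bytes were skipped.
    fn skip_whitespace(&mut self) -> usize {
        let start = self.position();
        while matches!(self.current_byte(), Some(b) if b.is_ascii_whitespace()) {
            self.advance(1);
        }
        self.position() - start
    }
}

/// A stand-in for a language-specific lexer.
struct GritLikeLexer<'src> {
    source: &'src str,
    position: usize,
}

impl<'src> SharedLexer<'src> for GritLikeLexer<'src> {
    fn source(&self) -> &'src str {
        self.source
    }
    fn position(&self) -> usize {
        self.position
    }
    fn advance(&mut self, bytes: usize) {
        self.position += bytes;
    }
}
```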
I'm noticing a few more methods in the JS lexer that can be removed as well, since they match what's inside the generic lexer. Hope you don't mind if I move them in this PR.
I'd prefer not. This PR is already big, as I said before, and it would be best to ease the job of the reviewers instead of making it harder.
You can raise a PR against this branch or main.
crates/biome_grit_parser/tests/grit_test_suite/err/missing_version.grit.snap
I will definitely try to add some more test cases, especially around error recovery, but I think I've addressed almost all review comments. Thanks a ton!
crates/biome_grit_parser/tests/grit_test_suite/err/incorrect_string_prefix.grit.snap
Alright, there might be more improvements for error handling down the line, but as you said, this PR is already too big, so I'm going to leave it at this. I think it's generally in quite decent shape; just let me know if you think there are still any blockers.
crates/biome_grit_parser/tests/grit_test_suite/err/dotdotdotdot_clause.grit.snap
This example doesn't have any bogus node, which means we aren't able to recover the parser.
Considering the grammar, I would expect at least a top-level bogus `GritBogusPattern`.
It does have a missing required node though, and there's also a diagnostic, so what kind of recovery would be necessary here? I thought recovery means advancing the parser to a point where it can continue parsing again, but that seems unnecessary here, because it already sees an `else` token and picks up from there. But I'm happy to fix it in a follow-up PR if I understand what I'm missing :)
> It does have a missing required node though

Completely missed that :) sorry
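The distinction discussed in this thread, a missing required node with a diagnostic versus a bogus node, can be sketched with a toy parser. This is an assumption-laden illustration, not the PR's code: `IfElse`, `required`, `expect_keyword`, and `parse_if_else` are hypothetical, and the token-slice input is a simplification. When the pattern after the condition is absent, the parser leaves that slot empty, records a diagnostic, and carries on because the next token is already the `else` it expects. No tokens need to be skipped, so no bogus node is produced.

```rust
#[derive(Debug, PartialEq)]
struct IfElse {
    condition: Option<String>,
    /// `None` models a missing required node.
    then_pattern: Option<String>,
    else_pattern: Option<String>,
}

fn is_keyword(token: &str) -> bool {
    matches!(token, "if" | "else")
}

/// Consumes one non-keyword token, or reports it as missing without advancing.
fn required(
    tokens: &[&str],
    pos: &mut usize,
    what: &str,
    diagnostics: &mut Vec<String>,
) -> Option<String> {
    match tokens.get(*pos) {
        Some(token) if !is_keyword(token) => {
            *pos += 1;
            Some((*token).to_string())
        }
        _ => {
            diagnostics.push(format!("expected {what}"));
            None
        }
    }
}

/// Consumes the given keyword, or reports it as missing without advancing.
fn expect_keyword(tokens: &[&str], pos: &mut usize, keyword: &str, diagnostics: &mut Vec<String>) {
    if tokens.get(*pos).copied() == Some(keyword) {
        *pos += 1;
    } else {
        diagnostics.push(format!("expected `{keyword}`"));
    }
}

fn parse_if_else(tokens: &[&str]) -> (IfElse, Vec<String>) {
    let mut diagnostics = Vec::new();
    let mut pos = 0;
    expect_keyword(tokens, &mut pos, "if", &mut diagnostics);
    let condition = required(tokens, &mut pos, "a condition", &mut diagnostics);
    let then_pattern = required(tokens, &mut pos, "a pattern after the condition", &mut diagnostics);
    expect_keyword(tokens, &mut pos, "else", &mut diagnostics);
    let else_pattern = required(tokens, &mut pos, "a pattern after `else`", &mut diagnostics);
    (IfElse { condition, then_pattern, else_pattern }, diagnostics)
}
```

For the input `if $x else $y`, the result has an empty `then_pattern`, exactly one diagnostic, and a fully parsed `else` branch.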
Summary
Implements the GritQL parser based on the merged grammar.
Most of the parser is a pretty straightforward implementation following the grammar. Things of note:
- […] (`parse_math_pattern()`).
- In `CONTRIBUTING.md` it is suggested to use `.ok()` when parsing nodes that may be absent, while I had already used `let _ =` for this purpose. Just let me know if you'd like me to change it :)
- […] `ParseBlockBody` might be applicable here. Please advise on the best approach.

TODO:
Test Plan
Tests are added.
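On the `.ok()` versus `let _ =` question raised in the summary, the following sketch shows that the two forms behave the same when the result of parsing an optional node is deliberately ignored. It assumes a `#[must_use]` result type with an `ok()` method; the `ParsedSyntax` and `Parser` defined here are simplified stand-ins written for this example, not Biome's actual types, and `parse_version` is a hypothetical rule.

```rust
/// Simplified stand-in for a parse result. `#[must_use]` makes the compiler
/// warn when a caller silently drops it.
#[must_use]
#[derive(Debug, PartialEq)]
enum ParsedSyntax {
    Absent,
    Present(&'static str),
}

impl ParsedSyntax {
    /// Converts to an `Option`, which callers may ignore without a warning.
    fn ok(self) -> Option<&'static str> {
        match self {
            ParsedSyntax::Absent => None,
            ParsedSyntax::Present(kind) => Some(kind),
        }
    }
}

struct Parser {
    events: Vec<&'static str>,
}

impl Parser {
    /// Hypothetical rule for an optional node: absent input is not an error.
    fn parse_version(&mut self, input: &str) -> ParsedSyntax {
        if input.starts_with("engine") {
            self.events.push("version");
            ParsedSyntax::Present("version")
        } else {
            ParsedSyntax::Absent
        }
    }
}

/// Style suggested by the contribution guide: discard via `.ok()`.
fn parse_root_with_ok(p: &mut Parser, input: &str) {
    p.parse_version(input).ok();
    p.events.push("root");
}

/// Style used in the PR: discard via `let _ =`.
fn parse_root_with_let(p: &mut Parser, input: &str) {
    let _ = p.parse_version(input);
    p.events.push("root");
}
```

Both functions record the same events, so the choice between them is one of codebase convention rather than behavior.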