tree-sitter-grammars: Enable grammar testing#391520
tree-sitter-grammars: Enable grammar testing#391520adfaure wants to merge 8 commits intoNixOS:stagingfrom
Conversation
ambroisie
left a comment
There was a problem hiding this comment.
I'm surprised the emacs plug-in doesn't seem to do the same as nvim-treesitter (separately locked grammars, specific to the plug-in).
Should we worry about reporting a grammar as broken "just" because some its tests aren't passing, if it builds and actually parses its inputs? I'm kind of surprised at the number of grammars which turned up broken = true here.
There was a problem hiding this comment.
| isBroken = grammar ? isBroken && grammar.isBroken; | |
| isBroken = grammar ? isBroken or false; |
There was a problem hiding this comment.
isBroken is always false indeed.
isBroken = grammar ? isBroken;There was a problem hiding this comment.
Your suggestion raises an error. I am not sure why...
nix-repl> attr = {a=true; b=false;}
nix-repl> (attr ? c) or false
error: undefined variable 'or'
at «string»:1:1:
1| (attr ? c) or false
| ^
nix-repl> attr.a or false
trueMaybe instead: broken = grammar.broken or false.
There was a problem hiding this comment.
Might as well rename it to broken and inherit it here IMO.
There was a problem hiding this comment.
isBroken is always false (since it is generated by the update script), the name indicates that is flagged as broken.
I don't mind changing it to broken.
The reason behind this choice is to avoid updating all the JSON files without a new and unnecessary attribute.
...cations/editors/emacs/elisp-packages/manual-packages/tree-sitter-langs/default-grammars.json
Outdated
Show resolved
Hide resolved
...plications/editors/emacs/elisp-packages/manual-packages/tree-sitter-langs/update-defaults.py
Outdated
Show resolved
Hide resolved
...plications/editors/emacs/elisp-packages/manual-packages/tree-sitter-langs/update-defaults.py
Outdated
Show resolved
Hide resolved
|
Honestly, I can't say. That's why it is important to have feedback. What I can say (working on this PR for almost a year now), is that the recent bump from tree-sitter 0.24 to 0.25 has increased the number of broken grammar. The lack of testing is probably at cause here. |
Very much appreciate the effort to add some testing. |
There was a problem hiding this comment.
I think this could just use a variable instead of having to run the check twice.
There was a problem hiding this comment.
Wow, good catch. Thank you.
a81917f to
1439f9f
Compare
|
We should keep the commit history clean- can we squash the fixups? |
I split everything into atomics commit for the review. Indeed, the goal is to merge, but should I do it know? Please be mindful of my time, git manipulations like that take time and can be error-prone. |
|
The absence of versioning in the treesitter ecosystem is maddening. Running some tests help maintainers. I wont merge this since I can't maintain my tree-sitter ecosystem for more than a week so I dont feel qualified but merging this now, at the beginning of a nixos cycle is better than closer to 25.11. |
4b2ba95 to
727a4e2
Compare
|
I don't want to block this, just want some clarity - what will be the effects of this PR on consumers? I went through and made a list of the broken grammars after this: Is the intention to fix some of the more high profile ones in staging, or just push this out quickly to unstable, and hope that upstream issues are resolved? I worry that some of the more high-profile grammars being marked as broken based on failing tests (especially if they seem to be working fine in practice) will leave their consumers to come into this PR to yell at us for merging this. Not against this PR, to be clear! I just want to mention this kind of thing before we get an angry mob mentioning it after-the-fact. |
|
I proposed to @adfaure to make the tests optional and provide a visible way how to run them regardless. Or maybe turn tests on by default but make it possible to ignore them, something like that. |
Hi, I agree with your concerns. As @fricklerhandwerk stated, I will try to find a way to add tests without everything's falling apart. I also want to mention, that when we opened this PR, tree-sitter 25 was just released. My first step tomorrow will be to update the grammars and check if there is any improvement. |
|
Lists in Nix allows for any element to have a different type, correct? My idea is to use an attribute value with key |
| for g in grammars: | ||
| exists = check_grammar_exists(nixpkgs, g) | ||
| if exists: | ||
| existing.append(fmt_grammar(g)) | ||
| broken = check_grammar_broken(nixpkgs, g) | ||
| if not broken: | ||
| existing.append(fmt_grammar(g)) | ||
| else: | ||
| sys.stderr.write("Grammar is broken: " + fmt_grammar(g) + "\n") | ||
| sys.stderr.flush() | ||
| else: | ||
| sys.stderr.write("Missing grammar: " + fmt_grammar(g) + "\n") | ||
| sys.stderr.flush() |
There was a problem hiding this comment.
I propose a rewrite:
for g in grammars:
grammar = fmt_grammar(g)
if not check_grammar_exists(nixpkgs, g):
print(f"Missing grammar: {grammar}", file=sys.stderr)
continue
if check_grammar_broken(nixpkgs, g):
print(f"Broken grammar: {grammar}", file=sys.stderr)
continue
existing.append(grammar)There was a problem hiding this comment.
Hi, thanks.
This script is only for Emacs plugins, so I think that would add the grammars tagged as broken back into the emacs ecosystem.
Which is something that I might have forgotten to handle without your feedback.
PerchunPak
left a comment
There was a problem hiding this comment.
Is the intention to fix some of the more high profile ones in staging, or just push this out quickly to unstable, and hope that upstream issues are resolved?
We can't break a build even for a single grammar, because many use nvim-treesitter.withAllGrammars. Even if we mark some grammars as broken and exclude it from withAllGrammars, we will start to see bug reports like "grammar X is not found", which basically breaks workflow for the end user as it is an extremely annoying error (you have to press enter on each cursor move)
|
@PerchunPak completely agree not breaking things, even temporarilly, is important (in general). It's worth noting that From what I've seen working on #408414 the current state of grammar sources is not... great. That PR is ready for review |
I just wanted to stress that breaking even a single grammar is unacceptable
Nope, |
|
Hi, I understand the problem. First, I want to mention, is that this PR has been released shortly after a new major version of tree sitter (24 to 25), leading to many broken grammars. Currently, I am trying to update grammars to see if there is any improvement (I believe there will). However, I am unable to build the updating scripts since it takes forever to build on my laptop The second changes proposed by @fricklerhandwerk (as an answer to #391520 (comment)) is instead of having a @kimburgess your help will be greatly appreciated indeed if your PR is merged before, thanks ! |
| src, | ||
| location ? null, | ||
| generate ? false, | ||
| broken ? false, |
There was a problem hiding this comment.
seems like it is brokenTests rather than the whole grammar ? might be best to add overrides for those grammars in an overrides.nix rather than automate it. Among the advantages is that we dont introduce a new setting broken, it would be doCheck = false like for every other package, and it encodes it just in overrides.nix rather than the json + nix.
|
now might be a good time to rebase/ attempt a merge if you have the time. |
Especially now that #408414 has been merged :) |
Description of changes
Context
While working on #320783 with @fricklerhandwerk. We used
tree-sitter-grammarsto build Python bindings, we explored ways to test these grammars. We found that the best approach is to use Tree-sitter's dedicated testing framework (documentation).This PR introduces a check phase using
tree-sitter test.Changes
Activate Grammar Testing
Adds a test phase and an
isBrokenattribute.Runs the tests during the update phase and sets the
isBrokenattribute if the tests fail.Update the Emacs Script
Emacs has an automated mechanism for updating grammars. This script is updated to use the
isBrokenflag.Things done
nix.conf? (See Nix manual)sandbox = relaxedsandbox = truenix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)Add a 👍 reaction to pull requests you find important.