It's very common for users to want to match `TODO` in addition to `TODO:` (ending in a colon). To do that, users had to match over all text nodes, and since every word was a text node, that was slow. So the main improvement here is to stop exposing text nodes and instead expose any uppercase word as a node. Unless you write all of your comments very angry in SCREAMING case, performance should improve.
Changes
This should improve performance but will yield more false positives. I think it's better if users explicitly match the tag names they want (`TODO`, `NOTE`, etc.), so it shouldn't be a problem in practice.
`:`.
`TODO (something):`.
The `(user)` node was renamed to `(annotation)`.
`#10` or `!10`. If people want, I can introduce a way to match those things, maybe a `(reference)` node `/[@#!]([a-zA-Z0-9)/`.
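To make the `(reference)` idea concrete, here is a minimal JavaScript sketch. The pattern is an assumption about the intent (a `@`, `#` or `!` sigil followed by one or more alphanumerics), not the expression from this PR, and `findReferences` is a hypothetical helper that is not part of the parser.

```javascript
// Assumed pattern: a sigil (@, # or !) followed by one or more alphanumerics.
// This is a guess at the intent, not the expression used in the PR.
const REFERENCE = /[@#!][a-zA-Z0-9]+/g;

// Hypothetical helper: returns every reference-like token in a comment.
function findReferences(comment) {
  // String.prototype.match with a global regex returns null when nothing matches.
  return comment.match(REFERENCE) ?? [];
}

console.log(findReferences("TODO: see #10 and !10, ask @user"));
```

A pattern this loose would also pick up things like the `@example` part of an email address, so a real rule would probably need to be stricter about what may precede the sigil.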
While this new approach makes the parser less flexible without the text nodes, it makes matching faster: the parsing itself was already fast, and matching over text nodes was the slow operation. Another approach I considered is to parse the keywords directly, but that is even less flexible, since any new keyword would need to be added in the parser instead of in a query.
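The trade-off can be illustrated with a small JavaScript sketch. It is a simplified model that uses plain string splitting instead of tree-sitter, and all function names are hypothetical. It shows how exposing only uppercase words shrinks the set of candidates a query has to test, while the keyword list stays on the query side.

```javascript
// Simplified model, not the parser's actual code.
// Old model: every word is a node, so a tag query must test each one.
function allWordCandidates(comment) {
  return comment.split(/\s+/).filter(Boolean);
}

// New model: only fully uppercase words (optionally ending in ":") are exposed.
function uppercaseCandidates(comment) {
  return allWordCandidates(comment).filter((w) => /^[A-Z]+:?$/.test(w));
}

// The keyword set lives with the user's query, so adding a keyword
// does not require changing the parser.
function matchTags(candidates, keywords) {
  return candidates
    .map((w) => w.replace(/:$/, ""))
    .filter((w) => keywords.has(w));
}

const comment = "TODO: refactor this, see NOTE above and the API docs";
const candidates = uppercaseCandidates(comment);
const tags = matchTags(candidates, new Set(["TODO", "NOTE"]));

console.log(allWordCandidates(comment).length); // all words in the comment
console.log(candidates); // uppercase candidates, including the false positive "API"
console.log(tags); // tags the user explicitly asked for
```

Here `API` is the kind of false positive mentioned above: it becomes a candidate because it is uppercase, but it is dropped once the user matches explicit tag names.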
Other notes
I tried moving everything to JS in f2318ba, but it was slow due to the use of `conflicts`.
Original comment:
Moving everything to JS means losing a little bit of control over the exact matching of the tag name, resulting in matches for:
`TODO:nospace` (previously, a space was required after `:`).
`TODO(\n note \n):` (`\n` is a literal new line).
I think it's fine for those to match; I don't think you see many of those in the wild. If they are a problem, we can try adjusting the rules, or go back to having that small piece of logic in the external scanner.
UPDATE: Did a benchmark, and it's SLOWWWW: from 36 ms to 1264 ms (about 35x slower) when parsing a file with 10K lines full of comments. This is probably due to the conflicts that tree-sitter needs to resolve...