Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extremely expensive highlighting patterns #240

Open
mattmassicotte opened this issue Oct 7, 2022 · 22 comments
Open

Extremely expensive highlighting patterns #240

mattmassicotte opened this issue Oct 7, 2022 · 22 comments

Comments

@mattmassicotte
Copy link
Contributor

In general, swift's highlights.scm is really expensive for the tree-sitter runtime to validate. It takes many seconds with compiler optimizations off. In comparison, Go's highlights, which are comparable in complexity, to my eyes anyways, take less than 1 second.

There are two patterns in particular that are bad.

(call_expression ; foo.bar.baz(): highlight the baz()
  (navigation_expression
    (navigation_suffix (simple_identifier) @function)))
(ternary_expression
  ["?" ":"] @conditional)

I'm not sure I understand why, particularly the second. But, I figured I'd open an issue because this parser's highlights are such an outliner. Perhaps it has something to do with the grammar?

@alex-pinkus
Copy link
Owner

alex-pinkus commented Oct 8, 2022

Yes, this parser is definitely at the high end for query parsing. I'd previously investigated this (and found improvements) in tree-sitter/tree-sitter#1578 and tree-sitter/tree-sitter#1589, but haven't revisited to see if there are simplifications to the grammar.

The Go parser isn't an entirely fair comparison, I think, because that grammar.js is half the size; this probably results in far fewer nodes. But the closest comparison I'm aware of is elixir, which was still quite a bit faster than Swift when I last ran my benchmarks. (Elixir's grammar is similar in size to Go's, but I would need to dig up my benchmarking infrastructure to really compare the two)

Thank you for isolating it down to just those patterns -- this makes me wonder if it has something to do with the tricks I do with aliased nodes to disambiguate function calls properly in the presence of all of the other language constructs. I will try to find some time to spin up my benchmarks again and see what can be optimized.

Out of curiosity, why are you using a build with compiler optimizations turned off? How commonly do you do this, and is it feasible to simply use the build with compiler optimizations?

@mattmassicotte
Copy link
Contributor Author

Holy hell the analysis and work you put into those optimization bugs is amazing.

I cannot take credit for finding those particular patterns. I just suggested to someone else they start commenting out patterns to see if some are to blame, and they found them.

Soooo, if you incorporate a parser using SPM, it inherits a lot of the build settings for whatever your final target will be. If you use this in a standard Xcode environment, it will be debug (with zero optimizations) for debug builds. This is not ideal, I know. Pre-built binaries make sense for many reasons, so I'm trying to make that happen now. But, it's a lower-priority thing for me at the moment.

@alex-pinkus
Copy link
Owner

Actually, it just occurred to me that the ABI 14 improvements aren't going to kick in here, since with-generated-files branch doesn't build for ABI 14. That's going to be a huge slowdown here. That would be pretty easy low-hanging fruit...

@mattmassicotte
Copy link
Contributor Author

That would be awesome!

@mattmassicotte
Copy link
Contributor Author

Ok, I found one that is possibly an infinite loop in ts_query__analyze_patterns. I would normally file this with the runtime, since that's where it belongs. But, the maintainers have become pretty unresponsive.

(function_declaration
  "func" @structure.anchor
  body: (function_body
    "{" @structure.open
    "}" @structure.close))

Any ideas? It's related to the "body" pattern. Removing that allows this to terminate.

@mattmassicotte
Copy link
Contributor Author

It wasn't really fair to put this on you anyways. I've opened an issue with the runtime, and I should now experiment to see if ABI 13 makes a difference. But, not sure I can get to that in the near-term.

tree-sitter/tree-sitter#1916

@alex-pinkus
Copy link
Owner

Whether or not it's fair, it's certainly an interesting problem :)

Interestingly, my desktop can run ts_query__analyze_patterns on this query in around 50 seconds as long as I'm not trying to debug it. There are definitely a lot of expensive sorted insertions happening; we end up with an AnalysisStateSet containing 29932 items at one point. That number seems to be the intrinsic result of something structural in the grammar, since it's identical whether I'm using ABI 13 or 14 (both of which take the same amount of time).

To me there are two questions:

  1. What is it about my grammar that's causing so many candidate states to be in-scope for that body: query?
  2. Is there a way to optimize that code so that it isn't doing millions of sorted insertions to service these states?

@mattmassicotte
Copy link
Contributor Author

Huh. Ok, well it's great to know that it isn't an ABI regression.

@augustwester
Copy link

Hey @alex-pinkus, thanks for sharing this project 🙏

I was wondering if there's an update on this issue? I came here when I noticed that highlighting a tiny Swift script takes 1.52s on my machine, whereas highlighting a similarly sized Python script takes 0.02s. (I am by no means an expert on parsing and syntax highlighting, so I hope this is not an unfair comparison.)

@mattmassicotte
Copy link
Contributor Author

mattmassicotte commented May 24, 2023

@augustwester Does highlighting take that long, or is it preparing the highlighting query? This bug is about the latter.

Also, are you running in Debug? Tree-sitter's parsers and runtime are extremely sensitive to compiler optimizations.

@augustwester
Copy link

@mattmassicotte It's building the query! I'm new to tree-sitter, so I wouldn't know how to verify this for the tree-sitter CLI, but I'm currently using SwiftTreeSitter in my app and—from there—have noticed that building the query is very slow (upwards of 8s compared to 7ms for tree-sitter-go).

I'm not sure which optimization level I'm using when running tree-sitter-swift from the tree-sitter CLI. (How do I check this?) All I can tell you is that setting the Clang optimization flag to -Os for my app results in no speed-ups.

@mattmassicotte
Copy link
Contributor Author

@augustwester Ok, so then, yes, this is what this bug is all about.

I'm the author of SwiftTreeSitter - glad you are finding it useful! One thing I do want to do is update to the very latest tree-sitter release in SwiftTreeSitter. I don't have a good reason to believe that will address this, but I also don't want to fall to far behind and who knows?

SPM packages build with the same level as your final target. So if you are just running your app, it's almost certainly debug. I'm actually not that familiar with the CLI, so I don't actually know what it's doing or how it works. But, I've found I need to modify some of the contents of highlights.scm to reduce the preparation time to something reasonable. The details for that are up above. But, you should still be prepared for query prep to be a slow process, I'm afraid. Go isn't a great comparison because the grammar is vastly simpler.

@augustwester
Copy link

@mattmassicotte That's awesome, I'm finding it very useful 👍

Fair enough about Go, I don't really have much insight into what makes a grammar complex or not. Maybe C# would be a better comparison? In that case, SwiftTreeSitter builds the query in 1.2s vs. 8s for Swift.

Anyway, I don't wanna beat a dead horse here. Just wanted to follow up, since tree-sitter-swift is an outlier in this respect.

@mattmassicotte
Copy link
Contributor Author

Definitely worth following up. This class of problems (expensive query prep time) is something I'd really love for the tree-sitter maintainers to take a look at. tree-sitter/tree-sitter#1916 is the bug I filed there.

@wSedlacek
Copy link

I have just started learnig Swift and I have been setting it up with Neovim.
I noticed that when I install treesitter for neovim it makes adding new lines feel laggy.
I believe the issue described here is what I am experencing.

Is there work around for this?

@mattmassicotte
Copy link
Contributor Author

This bug is really about the initial query processing - a one-time cost. So, if you are seeing a long initial delay before indentation calculations work, then yes, this could be the issue. But if you are seeing persistent lag even after correct behavior this is probably not related.

@wSedlacek
Copy link

wSedlacek commented Dec 16, 2023

It’s hard to describe exactly what the lag feels like but the cases I notice it are

a. When first opening a swift file is the longest. Even with a file only 134 lines but takes a good 500ms+ to open.
b. When using o and O to add new lines. (This moves things up and down), this one has a huge variance. When I spam it and u to undo it’s not every insertion that feels slow but like every 5th one or something. It’s like there is some expiring cache or something going on.

Interesting when scroling in switching back to the buffer from another buffer it feels fine.

A note I have quite a few plugins that use treesitter including indent-blankline, nvim-ufo, guess-indent, nvim-treesitter-textobjects.
I have tried disabling some of these but I haven’t been to find any particular one (or combination) that is making this problem occur (particular in case b)

I really do prefer the highlighting (and other features) I get when using treesitter so I am really hoping there is something that can be done to fix this.

@mattmassicotte
Copy link
Contributor Author

A could be caused by this issue. I don't think any of the other phenomena could. But, I have found that tree-sitter also has many other performance issues which are parser-dependent. Simple grammars go faster, and Swift is definitely not a simple grammar.

Regardless, these are definitely issues on the tree-sitter runtime side.

@wSedlacek
Copy link

wSedlacek commented Dec 17, 2023

Intersting I was able to run :TSBufDisable indent and the slowness completely went away.
If there was a way to disable this by default (I would rather not use an autocmd as then the file would still be slow to open) then that would be a reasonable work around for me.

Update! Got it without a autocmd. Setting up nvim-treesitter with this config gits it disabled only on swift. File open times are sitll a bit slow but at least the lag is gone when editing. Amazing!

opts = {
  indent = { enable = true, disable = { “swift", "text" } },
}

@mattmassicotte
Copy link
Contributor Author

Ok so then this is entirely related to the performance of the indentation system. Would take more work to understand where the problem is coming from, but I think we can be pretty confident it is not related to this bug.

@sainttttt
Copy link

Just wondering, has there been any progress on this issue? Disabling indenting fixed the issue of there being lag on a n initial newline entry, but if I say for example hold down enter, it lags a lot when entering newlines, even with syntax and indenting turned off. However with tree-sitter-swift totally uninstalled it's extremely smooth

@mattmassicotte
Copy link
Contributor Author

Just wondering, has there been any progress on this issue?

No. But, I'm pretty sure we have established that this issue isn't at all a factor in poor indentation performance. My gut is that this is a problem within the client editor itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants