-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TOML syntax highliting breaks with strings longer than 1024 characters #6402
Comments
It's not. This is a deliberate limitation of the old prettylights syntax highlighting engine which provides the highlighting for files that use Textmate-compatible grammars, like TOML. This was implemented waaaay back in 2014 for performance reasons as very few people will realistically read such long lines on GitHub so there's little point in consuming resources highlighting them. This limit isn't going to change as the old highlighting engineer is now in maintenance mode as we slowly migrate to a new tree-sitter based highlighter. I also doubt many people realistically read such long lines on GitHub beyond a cursory glance.
Almost. That is likely to be hitting the other deliberate decision of limiting the stack depth to 256 which should be very rarely hit and when it is, it's almost certainly due to an error in the grammar. Closing as will not fix. |
I opened this issue because this caused a code review to fail. Politiwatch/privacyspy#154 (comment) In privacyspy, we use long strings in our TOML files, as these are complete extracts from privacy policies, and legal documents don't tend to be short. In the code review that I linked, the reviewer thought that because the TOML string wasn't highlighted, the string had invalid syntax, so he asked for that to be revised again. I then had to manually check that this was actually an issue with github.
That's precisely why this should be fixed. This is not a new feature, it's a bug which was caused by an undocumented behavior. And it's blocking a key functionality of Github: code reviews. I can't imagine that increasing a limit set in 2014 almost 10 years later would be very computationally expensive. If the fix is as simple as increasing a number, then this shouldn't take long to fix. But if the limit truly cannot be increased, then a visible warning should be displayed that the syntax-highlighting failed, if possible with an explanation of why, and a link to its related discussion/issue. This limit should also be documented somewhere where users having the same issue will easily find it, to avoid unnecesary spending in support-peoples salary, as they'll be the first ones who will have to deal with the confused users, victims of this undocumented behavior. One example of an easily findable page title could be: "Github Syntax Highlighting Docs". There, these hidden limits would be explained. This issue would remain open until those changes are implemented, as to avoid duplicate issues in the future. What do you think @lildude ? |
This isn't a bug. It's a deliberately implementation decision.
And there's the rub. You're assuming it's a simple fix. Removing such a limitation may not be trivial and may require way more work elsewhere within the main GitHub code base which is working on the knowledge of this limitation. We won't find this out until this change is made and tested, however the old highlighter is no longer getting funding so no active development, even for actual bugs, is happening. I've raised an issue with the team that "supports" the old highlighter, but it's not likely to see any action as is the case with the issue I opened for #5069 which is an actual bug. This behaviour has also been like this for nearly 10 years without issue, until now 😁. That said, the other solution is to change your code and introduce a line break in these long lines. I see several other fields in that PR contain line breaks.
This also isn't likely to happen for this old highlighter as it's on its way out and would require funding for someone to trawl through the code and find and document all of these limitations. I'm not part of the team that maintains the syntax highlighting engines, but I believe all these limitations will be gone once we've completed the move to the new engine which uses tree-sitter grammars. |
Describe the bug
The toml syntax-highliting breaks in github if the line is 1024 (or more) characters long. Example: Line 1022 in this file: https://github.com/ItsIgnacioPortal/testrepo/blob/c638d9977ac4b9f1b2e1f8ed282a344167705d71/biglines.toml#L1022
Expected behaviour
That line 1022 be highlighted in the same color as line 1021 was.
Related discussion
N/A
Additional notes
The line in which the syntax-highlighter broke, is 1024 characters long (1024 = 2^10). This doesn't seem like a coincidence.
In this repository I've included a python script that can generate toml files with long strings, so that this issue can be tested more easily.
This might be related to #6227
The text was updated successfully, but these errors were encountered: