Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LSP syntax highlighting #814

Closed
kirawi opened this issue Oct 5, 2021 · 20 comments
Closed

LSP syntax highlighting #814

kirawi opened this issue Oct 5, 2021 · 20 comments
Assignees
Labels
A-language-server Area: Language server client A-plugin Area: Plugin system C-enhancement Category: Improvements

Comments

@kirawi
Copy link
Member

kirawi commented Oct 5, 2021

This will likely require us to refactor the current syntax highlighting system to accommodate other syntax highlighting methods.

@kirawi kirawi added C-enhancement Category: Improvements A-language-server Area: Language server client E-hard Call for participation: Experience needed to fix: Hard / a lot labels Oct 5, 2021
@kirawi kirawi self-assigned this Oct 8, 2021
@kirawi kirawi removed the E-hard Call for participation: Experience needed to fix: Hard / a lot label Oct 9, 2021
@archseer
Copy link
Member

I'm not a big fan of the LSP highlighting spec. It requires a lot of back and forth traffic with the LSP to do highlighting in real time. VSCode should adapt tree-sitter already..

@kirawi
Copy link
Member Author

kirawi commented Oct 17, 2021

I plan to make it optional in config.toml or languages.toml. There are some cases in which tree-sitter isn't enough.

@pickfire
Copy link
Contributor

tree-sitter is limited such that it does not have context of the language, like similar returns highlight or similar .await highlights.

@kirawi kirawi added the A-plugin Area: Plugin system label Dec 7, 2021
@poliorcetics
Copy link
Contributor

Also, for Rust, unsafe highlighting is nice in VSCode and what I miss the most in helix

@jakenvac
Copy link
Contributor

jakenvac commented Jul 4, 2022

I'm not a big fan of the LSP highlighting spec. It requires a lot of back and forth traffic with the LSP to do highlighting in real time. VSCode should adapt tree-sitter already..

Does this mean this won't ever be a first class feature of helix? I'm not really fussed either way. I only ask because I'm working on a tool to convert themes between several text editors formats and this would factor into the implementation longer term.

@LouisGariepy
Copy link

LouisGariepy commented Aug 31, 2022

While LSP is heavy, tree-sitter does not provide comparable highlighting. One solution could be to have LSP as an opt-in, with tree-sitter as the default.

As petty as it sounds, the lack of semantic highlighting is one of the very few things that hold me from daily driving Helix. Being able to jump in and see the same theming quality as I'm used to is a must for me, and I imagine other prospective users as well.

@pickfire
Copy link
Contributor

pickfire commented Sep 1, 2022

@LouisGariepy IIRC in the release notes I remember there is space-h select the same symbols in the function, that way it can be similar like part of the syntax highlight feature.

@LouisGariepy
Copy link

LouisGariepy commented Sep 1, 2022

Could a maintainer point out the conditions for an acceptable solution then? If you simply don't want this feature in Helix (which is totally fair!), maybe you could close or mark as "wontfix" to avoid confusion.

To be clear, I'm asking from the perspective of someone who'd like to contribute to this effort. I just don't want to waste my (and your) time with a feature that will never be merged.

@jakenvac
Copy link
Contributor

jakenvac commented Sep 6, 2022

@LouisGariepy

Being able to jump in and see the same theming quality as I'm used to is a must for me, and I imagine other prospective users as well.

I'd be interested to see a side by side comparison with the theme/languages you use if you don't mind? In my experience, it's the opposite. I quite often find tree sitter highlighting being better than the alternatives.

@univerz
Copy link

univerz commented Sep 6, 2022

I'd be interested to see a side by side comparison with the theme/languages you use if you don't mind?

i miss mutable variables (in rust) with bold modifier.

@archseer
Copy link
Member

archseer commented Sep 6, 2022

That's doable with a custom locals.scm query.

@poliorcetics
Copy link
Contributor

That's doable with a custom locals.scm query.

Is it though ? They're talking about highlighting the variable as mutable each time it's used, not just at the definition point.

I honestly don't know if this is possible, if it is it seems like a quick win to add it to helix

@kirawi
Copy link
Member Author

kirawi commented Sep 6, 2022

Yeah, that's doable with a locals.scm. It will track the use of an identifier throughout the scope and apply the color it was highlighted with at the definition.

@LouisGariepy
Copy link

LouisGariepy commented Sep 6, 2022

Edit for future readers: Please read archseer's response to my comment, which gives a more correct explanation of what is and what is not possible to do with tree-sitter Vs LSP highlighting. My comment contains some inaccuracies, but I'll leave it as is for posterity.


@jakeHL

Alright! I jotted down some basic code to see the potential differences. I tried to emphasize differences due to "limitations", not stylistic ones. There might be more that I'm missing for complex code bases, this is just basic stuff.

Here goes:

Public items are distinct from private ones.

Note that this holds for any item that can be made public, not just functions.

public_item

public_item_helix

Mutable variables and operators are distinct from immutable ones.

Mutable functions/method calls are likewise distinct.

mutable_variables

mutable_variables_helix

Variables in format strings are recognized as such (distinct from the rest of the string).

format_variables

format_variables_helix

Doc comments are distinct from "regular" comments.

Might be hard to see (stylistic choice), but the doc comments are lighter in color.

doc_comments

doc_comments_helix

Generic types are distinct from their trait bounds.

trait_bounds

trait_bounds_helix

Macro rules binding arguments are distinct from constant patterns.

$e and $es are distinct in color compared to the rest of the macro_rules syntax.

macro_rules

macro_rules_helix

Attribute syntax is recognized properly.

attributes

attributes_helix

Imports (uses) are properly highlighted according to their types.

imports

imports_helix

Trait methods are distinct from non-trait methods.

Notice how the trait method is italicized.

trait_methods

trait_methods_helix

Enum variants are distinct from structs and function calls

enum_variants

enum_variants_helix

Conclusion

While some of these features might be possible to implement in tree-sitter, I'd like for us to think of the practicality of doing so. I have put many hours into trying to make my "perfect" tree-sitter theme, and I had to give up because I simply could not find a decent/possible way to make some features work. I had a hard time finding documentation and examples of advanced features.

For end users, this results in two things:

  1. Being unable to use the plethora of LSP themes that already exist.
  2. The themes that do exist in tree-sitter form do not have as extensive highlighting capabilities.

Alternatively, if some of these features are known to be possible in tree-sitter, maybe they should be implemented in one of the default themes, so that users can reuse the same logic without having to reinvent the wheel.

@kirawi
Copy link
Member Author

kirawi commented Sep 6, 2022

Most of these are currently achievable by modifying the queries except for imports and accurate type highlighting. With #1252 however, I think that would be addressed as well.

There are certainly limitations to tree-sitter, such as not being able to highlight markdown in doc comments, but maybe that's something that could be worked on in the future. Language-specific features such as trait methods probably do need LSP highlighting.

@kirawi
Copy link
Member Author

kirawi commented Sep 6, 2022

It looks like you deleted your last comment, but I think you have a misunderstanding of how tree-sitter works. From how I understand it, tree-sitter a parser generator with an AST geared towards syntax highlighting. However, you also need to write queries to capture AST nodes such that they can be highlighting in themes.

You can check the tree-sitter playground to see some of what these ASTs look like.

See:
https://tree-sitter.github.io/tree-sitter/syntax-highlighting#queries
https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries

@jakenvac
Copy link
Contributor

jakenvac commented Sep 7, 2022

@LouisGariepy Thanks for that rundown. I think you've demonstrated why it would be a useful feature.

I do think that the main thing it gives us that tree sitter doesn't is type related highlighting. I'm pretty sure with enough time you could create tree-sitter queries for anything that is purely syntax focused.

Here's an example of doc comments & string interpolation already working in Typescript in helix. I might be using a modified Everforest theme, I can't remember.

Screen Shot 2022-09-07 at 11 10 59

@archseer
Copy link
Member

archseer commented Sep 7, 2022

Yeah, tree-sitter grammars are parsers that build an in-memory AST of the file. Then the highlighting queries are like selectors that tag specific parts of the tree with span names, and themes can hook into these spans to highlight them as they wish.


Public items are distinct from private ones.

This is as simple as adding a highlighting query that themes can hook into.

Mutable variables and operators are distinct from immutable ones.

Can be done with a locals query.

Variables in format strings are recognized as such (distinct from the rest of the string).

Probably needs a change in the parser, but it could be injected into such strings and the embedded parts interpolated. It works in other languages.

Doc comments are distinct from "regular" comments.

This is something I've been waiting for tree-sitter/tree-sitter-rust#128

Generic types are distinct from their trait bounds.

This is as simple as adding a highlighting query that themes can hook into.

Macro rules binding arguments are distinct from constant patterns.

Not sure yet since macros produce a token tree but probably possible.

Attribute syntax is recognized properly.

Fixed in 301f5d7

Imports (uses) are properly highlighted according to their types.
Trait methods are distinct from non-trait methods.

These two are to do so yeah, it could be LSP specific.

Enum variants are distinct from structs and function calls

Probably possible with locals.


I wish LSP provided these highlights as an augmented layer on top of regular highlighting instead of completely overriding highlighting and churning out tons and tons of spans per keypress.

Anyway, for this to be implemented, a vec of produced spans would have to sit somewhere on the document, that would then convert to a HighlightIter and hook into our regular rendering mechanism without changes.

@archseer
Copy link
Member

archseer commented Sep 7, 2022

I do think that the main thing it gives us that tree sitter doesn't is type related highlighting.

I think in combination with stack graphs we wouldn't just get go to definition support but also a way to determine the item type since we can peek at the definition

@kirawi
Copy link
Member Author

kirawi commented Jan 3, 2023

A discussion on how LSP syntax highlighting can be gracefully implemented was discussed on Matrix:

Tree-sitter will be the default highlighter, while range-based LSP syntax highlighting will be merged asynchronously. Additionally, we would compute the correct highlights for locals based on what the LSP syntax highlighter sends.

@kirawi kirawi converted this issue into discussion #5589 Jan 19, 2023

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
A-language-server Area: Language server client A-plugin Area: Plugin system C-enhancement Category: Improvements
Projects
None yet
Development

No branches or pull requests

7 participants