Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve highlighting in basic IEx snippets #17

Closed
wants to merge 1 commit into from
Closed

Conversation

jonatanklosko
Copy link
Member

@jonatanklosko jonatanklosko commented Dec 17, 2021

Code with IEx snippets is usually not a valid Elixir syntax (as soon as we introduce blocks), however tree-sitter parses most of code just fine. Consequently, with a few rules we can significantly improve highlighting.

IEx snippets are mostly used in Markdown like this:

iex(1)> if true do
...(1)>   1
...(1)> end

Before

image

image

image

After

image

image

image

As seen on the last snippet, more complex (especially nested) code still has glitches, but that's expected given the syntax is invalid.

Currently the syntaxes above are invalid because end is parsed as identifier and the actual block end is missing. Once tree-sitter/tree-sitter#246 is in place, we would be able to make sure reserved keywords are never parsed as identifiers, which would most likely make the blocks parse fine (the parsing error would be in a different node), and consequently the highlighting would also work for nested blocks.


@patrickt it would be neat if there was a token type that gets CSS user-select: none;, so that the code can be copied without the prefixes, like here. Ruby has similar prefixes, so maybe it could be used there too. That said, this may be out of the scope given it is based on invalid syntax in the first place, but just throwing an idea :)

cc @maxbrunsfeld in case you have any thoughts regarding this use case

@the-mikedavis
Copy link
Member

I wonder if it makes sense to do this with injections instead? I.e. having a separate tree-sitter-iex grammar that does a combined-injection of all the code on the right-hand-side of the > as elixir?

I imagine using injections would make it a little less glitchy when there are big nested blocks as you say, but it already seems to work pretty well so idk if it's worth the overhead of maintaining separate grammars

@jonatanklosko
Copy link
Member Author

jonatanklosko commented Dec 17, 2021

As far as I understand the injection would need a single specific node, while the iex>|...> prefix nodes are interspersed with the elixir code. So we would have a number of nodes with Elixir fragments, but those fragments would only make sense parsed together. Let me know if I'm missing some other way of doing this :)

@the-mikedavis
Copy link
Member

the-mikedavis commented Dec 17, 2021

I think you can use combined injections to parse all nodes together

injection.combined - indicates that all of the matching nodes in the tree should have their content parsed as one nested document.

(from the docs here)

This was something I was attempting to use with the heex grammar (see helix-editor/helix#881, although the editor does not yet support combined injections), so if you had a snippet like

<main class="container">
  <%= if true do %>
    <p>Hello, tree-sitter!</p>
  <% end %>
</main>

you could parse the nodes on lines 2 and 4 together. I think this also works with erb templates (for example the do and end pair here: https://github.com/joruri/zplugin4-content-docin/blob/18154c650033688d423391dc50a68d9f8c621f77/app/views/zplugin/content/docin/admin/front/index.html.erb#L13-L20)

edit: that heex pr has the combined injection query commented out here: https://github.com/helix-editor/helix/pull/881/files#diff-dfd9ae7d40dd28d7f3fe8d34cb496744db11e59a89c4a1a51b1e4e1e48d58f32

@jonatanklosko
Copy link
Member Author

jonatanklosko commented Dec 17, 2021

Ohh awesome, that's totally how embeddings work too. Sounds like the right solution then! @josevalim so I was wrong, we can actually do it :D

@jonatanklosko
Copy link
Member Author

One thing to note is that we will need to annotate markdown snippets with iex and not ex/elixir. This is a good thing, except that existing snippets won't automatically get the updated highlighting.

@josevalim
Copy link
Member

I think that’s fine. I believe makeup for Elixir handles both elixir and iex right?

@jonatanklosko
Copy link
Member Author

@josevalim correct!

@the-mikedavis
Copy link
Member

the-mikedavis commented Jan 8, 2022

I tried out the combined-injections approach and it works pretty well!

And the grammar comes out to be super small, only 28 lines for now: https://github.com/the-mikedavis/tree-sitter-iex/blob/6fef34bc1d530de496bb527876bcc8d5ed997aab/grammar.js

The highlights above are in the helix editor but the highlights and injections queries in the repo work with tree-sitter highlight too (and so I would assume with GitHub)

@jonatanklosko
Copy link
Member Author

@the-mikedavis thanks for pushing this forward! Closing in favour of the new grammar :D

@jonatanklosko jonatanklosko deleted the jk-iex branch January 8, 2022 15:08
@the-mikedavis
Copy link
Member

@jonatanklosko should tree-sitter-iex be moved under elixir-lang? I think it would make sense to have tree-sitter-{elixir,iex,eex} all under elixir-lang and tree-sitter-{heex,surface} under phoenixframework, but maybe there should be an elixir-trees organization instead since there are so many? sorry I think I'm rehashing some of the discussion in #2

@jonatanklosko
Copy link
Member Author

@the-mikedavis moving sounds good to me! Either grouping works, I'll defer to @josevalim :)

@josevalim
Copy link
Member

Maybe it should be part of this repository instead?

@jonatanklosko
Copy link
Member Author

I'd keep the repositories separate. This aligns with what tree-sitter does and I think editor integrations point to GitHub repos and they may not necessarily support sub-directories (I may be wrong here).

@the-mikedavis
Copy link
Member

I've seen some other tree-sitters that have multiple grammars in one repo (like the ocaml grammar, see ocaml/ and interface/), so that's an option, but I think it's usually only done when the grammar.js files depend on one another and/or the grammars want to share highlight/injection queries

Plus as @jonatanklosko says, it's less straightforward to integrate with editors (although I think possible, like ocaml in helix)

@josevalim
Copy link
Member

Alright, let's move it in then. Can you please add me as admin to the project? Then I think can move it. :) Or try to transfer it to me or to the org. Thanks!

@the-mikedavis
Copy link
Member

hmm I'm not actually sure how to elevate you to admin? I know where the buttons are for organization stuff but it seems to be different for personal repositories 🤔

@the-mikedavis
Copy link
Member

I think I can transfer ownership to you and then you can transfer it to elixir-lang? Shall we try that?

@josevalim
Copy link
Member

@the-mikedavis let's try that, yes! and thank you!

@the-mikedavis
Copy link
Member

ok it's sent! thanks 🚀

@josevalim
Copy link
Member

@the-mikedavis merged and I have invited you to the tree-sitter team! ❤️ 💯 However, if you prefer to not be involved, let me know or feel free to remove yourself. Thank you!

@the-mikedavis
Copy link
Member

thanks @josevalim, I'm honored, and I'd love to be involved!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants