Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

inject language based on file extension & shebang #3970

Merged
merged 6 commits into from
Apr 28, 2023

Conversation

nrdxp
Copy link
Contributor

@nrdxp nrdxp commented Sep 25, 2022

Nodes can now be captured with "injection.filename". If this capture contains a valid file extension known to Helix, then the content will be highlighted as that language. In addition, a new "injection.shebang" is added which injects based on a valid shebang.

For consistency, both implementations use the same logic as used to detect the document filetype as a whole.

I'm not sure if the following is the most efficient way to capture the built in configuration, or if it can be passed in from somewhere else:
https://github.com/nrdxp/helix/blob/7f61de137f07cf5897d5a3240b1ece98febc0eb0/helix-core/src/syntax.rs#L1836

Also, it would probably be nice to extend this to user defined languages as well. 65bc9da

Also includes an additional query for Nix so one can test this behavior themselves. A few screens with that query:
image
image

This PR also includes a new document briefly describing language injeciton.

@the-mikedavis
Copy link
Member

I think I would prefer injection.filename for this since the language is computed by the extension but the captured value is the full filename

@nrdxp
Copy link
Contributor Author

nrdxp commented Sep 26, 2022

I think I would prefer injection.filename for this since the language is computed by the extension but the captured value is the full filename

good point, I'll push a fix

@nrdxp nrdxp force-pushed the injection-extension branch 2 times, most recently from b1876f9 to 996a43c Compare September 26, 2022 01:01
@nrdxp
Copy link
Contributor Author

nrdxp commented Sep 26, 2022

Updated to "injection.filename" and also addressed the clippy lints

@the-mikedavis the-mikedavis added A-tree-sitter Area: Tree-sitter S-waiting-on-review Status: Awaiting review from a maintainer. labels Sep 26, 2022
@the-mikedavis the-mikedavis added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from a maintainer. labels Sep 27, 2022
@nrdxp
Copy link
Contributor Author

nrdxp commented Sep 27, 2022

There appears to be one little bug I would like to get to the bottom of before merging. Mainly, if the capture happens more than once in a file, then the last value of injection.filename overrides all the other string contents highlighting. Not sure why this isn't an issue with injection.language since I didn't modify any of the highlighting logic.

To illustrate:

{
  f = writeText "foo.sh" ''
    echo $bar
  '';
  g = writeText "foo.py" ''
    def foo():
        print("hello")
  '';
}

Will highlight both strings as python, however, if I change "foo.py" to just "foo" then both are highlighted as haskell.

So essentially, the value is being applied to every string captured globally. This appears to have something to do with "injection.combined" since if I turn it off, this doesn't occur, but the downside is that highlighting after interpolation isn't correct either.

edit
After review the original issue thread for combined injections it would appear this is intentional. I'm still a bit confused because this does not happen with my comment style injection queries which also use injection.combined. So I wonder if I am perhaps misusing injections.combined and instead my query should somehow be improved to be more specific perhaps?

Also, if we really wanted to, we could make the implementation even simpler by just allowing "injection.language" to fallback to filetype detection if language name detection fails with this simple diff:

  diff --git a/helix-core/src/syntax.rs b/helix-core/src/syntax.rs
  index e0a984d2..f7fb8e7a 100644
  --- a/helix-core/src/syntax.rs
  +++ b/helix-core/src/syntax.rs
  @@ -539,6 +539,12 @@ pub fn language_configuration_for_injection_string(
               let configuration = &self.language_configs[i];
               return Some(configuration.clone());
           }
  +
  +        let path = Path::new(string);
  +
  +        if let Some(config) = self.language_config_for_file_name(path) {
  +            return Some(config);
  +        }
           None
       }

And that would be all that is required.

@the-mikedavis
Copy link
Member

injection.combined is for combining all nodes captured with injection.content together. There isn't currently any scoping for it - it applies for the whole document. I don't think the comment highlighting or this should be using it now that I think about it. Does something break if we remove it?

I think injection.language should be separate from this

@the-mikedavis
Copy link
Member

the downside is that highlighting after interpolation isn't correct either

Ah sorry, reading is hard 😅

To cover this you most likely want to use injection.include-children instead of injection.combined

@the-mikedavis the-mikedavis added S-waiting-on-review Status: Awaiting review from a maintainer. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Sep 28, 2022
@nrdxp
Copy link
Contributor Author

nrdxp commented Sep 28, 2022

@the-mikedavis, thanks for the feedback, I'll try to fix the queries and then it should be ready. Also, fyi I changed the implementation just a little bit. I think it's definitely better. It should also be more efficient since we don't have to traverse the languages once to get the name and then again to get the config.

Instead I introduced a new enum InjectionCapture, to represent each @injection capture. This should make it easier to extend again in the future if we wanted to.

edit
"include-children" doesn't seem to function the way I need. Say for example in this nix string containing bash:

''
  echo "${bar}"
  echo baz
''

The second echo will not highlight correctly, because the result is not combined, even with include-children meaning the bash highlighter mistakenly thinks the string after the interpolation is the opening of a shell quote instead of the closing. In other words it thinks the quotes are unbalanced when they are not.

Perhaps we need a new capture, something like "injection.combine-siblings", that does the samething as "injection.combined" but not globally, only within the scope of the current query. Or maybe "injection.combined" could just be modified to do the same, if it doesn't break anybodies use-case. I'm not sure where a global capture like that would be useful where a scoped one would not, but maybe there is.

Either way though, it might be best to implement in a seperate PR. I think it's rare enough to define two scripts of a diffferent language in one file that merging this wouldn't be horrible in its current state.

@nrdxp nrdxp changed the title inject language based on file extension inject language based on file extension & shebang Sep 28, 2022
@nrdxp
Copy link
Contributor Author

nrdxp commented Sep 28, 2022

My last refactor made it fairly trivial to add another commit to highlight based on a captured shebang. In addition, I wrote up a brief doc explaining language injection queries.

book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
helix-core/src/syntax.rs Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
@nrdxp nrdxp force-pushed the injection-extension branch 2 times, most recently from b33d5a9 to 0deaa68 Compare September 30, 2022 17:42
@nrdxp
Copy link
Contributor Author

nrdxp commented Sep 30, 2022

@the-mikedavis, I address your comments and cleaned up the commit history. Also, the PRs I submitted to upstream tree-sitter-nix were just merged today, so I added a small commit to bump it and included some small query fixes.

@nrdxp nrdxp force-pushed the injection-extension branch 2 times, most recently from b1affae to b751a21 Compare September 30, 2022 17:56
@pascalkuthe pascalkuthe added C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from a maintainer. labels Apr 14, 2023
Nodes can now be captured with "injection.filename". If this capture
contains a valid file extension known to Helix, then the content will
be highlighted as that language.
Nodes can now be captured with "injection.shebang". If this capture
contains a valid shebang line known to Helix, then the content will
be highlighted as the language the shebang calls for.
@nrdxp
Copy link
Contributor Author

nrdxp commented Apr 26, 2023

Thanks for the thorough review! I addressed all of your comments, and hopefully it is ready now 🤞

@pascalkuthe pascalkuthe added S-waiting-on-review Status: Awaiting review from a maintainer. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 26, 2023
pascalkuthe
pascalkuthe previously approved these changes Apr 26, 2023
Copy link
Member

@pascalkuthe pascalkuthe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now

@pascalkuthe pascalkuthe added this to the 23.04 milestone Apr 26, 2023
Copy link
Contributor

@David-Else David-Else left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are few spelling and typing errors I noticed.

book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
book/src/guides/injection.md Outdated Show resolved Hide resolved
The `@` is now highlighted properly on either side of the function arg.

Also, extending the phases with `buildPhase = prev.buildPhase + ''''`
is now highlighted properly.

Fix highlighting of `''$` style escapes (requires tree-sitter-nix bump)

Fix `inherit` highlighting.
Split out injection pair logic into its own method to make the overall
flow easier to follow.

Also transform the top-level function into a method on a
HighlightConfiguration.
@nrdxp
Copy link
Contributor Author

nrdxp commented Apr 26, 2023

Thank you @David-Else. Looks like we really need a spellchecker for Helix as well 😅

@David-Else
Copy link
Contributor

Thank you @David-Else. Looks like we really need a spellchecker for Helix as well sweat_smile

You can use https://github.com/helix-editor/helix/wiki/How-to-install-the-default-language-servers#ltex-ls , for Markdown, it is excellent!

@archseer archseer merged commit 9c6c63a into helix-editor:master Apr 28, 2023
Triton171 pushed a commit to Triton171/helix that referenced this pull request Jun 18, 2023
* inject language based on file extension

Nodes can now be captured with "injection.filename". If this capture
contains a valid file extension known to Helix, then the content will
be highlighted as that language.

* inject language by shebang

Nodes can now be captured with "injection.shebang". If this capture
contains a valid shebang line known to Helix, then the content will
be highlighted as the language the shebang calls for.

* add documentation for language injection

* nix: fix highlights

The `@` is now highlighted properly on either side of the function arg.

Also, extending the phases with `buildPhase = prev.buildPhase + ''''`
is now highlighted properly.

Fix highlighting of `''$` style escapes (requires tree-sitter-nix bump)

Fix `inherit` highlighting.

* simplify injection_for_match

Split out injection pair logic into its own method to make the overall
flow easier to follow.

Also transform the top-level function into a method on a
HighlightConfiguration.

* markdown: add shebang injection query
wes-adams pushed a commit to wes-adams/helix that referenced this pull request Jul 4, 2023
* inject language based on file extension

Nodes can now be captured with "injection.filename". If this capture
contains a valid file extension known to Helix, then the content will
be highlighted as that language.

* inject language by shebang

Nodes can now be captured with "injection.shebang". If this capture
contains a valid shebang line known to Helix, then the content will
be highlighted as the language the shebang calls for.

* add documentation for language injection

* nix: fix highlights

The `@` is now highlighted properly on either side of the function arg.

Also, extending the phases with `buildPhase = prev.buildPhase + ''''`
is now highlighted properly.

Fix highlighting of `''$` style escapes (requires tree-sitter-nix bump)

Fix `inherit` highlighting.

* simplify injection_for_match

Split out injection pair logic into its own method to make the overall
flow easier to follow.

Also transform the top-level function into a method on a
HighlightConfiguration.

* markdown: add shebang injection query
smortime pushed a commit to smortime/helix that referenced this pull request Jul 10, 2024
* inject language based on file extension

Nodes can now be captured with "injection.filename". If this capture
contains a valid file extension known to Helix, then the content will
be highlighted as that language.

* inject language by shebang

Nodes can now be captured with "injection.shebang". If this capture
contains a valid shebang line known to Helix, then the content will
be highlighted as the language the shebang calls for.

* add documentation for language injection

* nix: fix highlights

The `@` is now highlighted properly on either side of the function arg.

Also, extending the phases with `buildPhase = prev.buildPhase + ''''`
is now highlighted properly.

Fix highlighting of `''$` style escapes (requires tree-sitter-nix bump)

Fix `inherit` highlighting.

* simplify injection_for_match

Split out injection pair logic into its own method to make the overall
flow easier to follow.

Also transform the top-level function into a method on a
HighlightConfiguration.

* markdown: add shebang injection query
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tree-sitter Area: Tree-sitter C-enhancement Category: Improvements E-medium Call for participation: Experience needed to fix: Medium / intermediate S-waiting-on-review Status: Awaiting review from a maintainer.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants