-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add tags.scm queries #18
Conversation
oh whoops I didn't see #17, I'll try these queries against that branch and see if it changes anything edit: looks good, these queries are producing the same results on both branches |
This is awesome @the-mikedavis! Quick question: do arities play a role in those queries? Should they? If they do, then we need to consider |
I believe arities are not considered, I think that wouldn't play well with some languages like javascript where arities can be flexible. It seems to me like it's very similar to I haven't dug far enough into stack-graphs and precise navigation to be sure, but that's probably something we'll have to keep in mind for writing those stack-graph queries 👍 |
@the-mikedavis perhaps one option is to include the arity in the token name: |
Ah I like that idea. Afaik you don't have any control over modifying that token with how |
Hey @the-mikedavis, fantastic!
This is definitely a tree-sitter bug, it's easy to verify using the highlight queries. # Running highlighting test passes
npx tree-sitter test --filter test/highlight
# Running the query tests with the corresponding queries doesn't work
npx tree-sitter query --test queries/highlights.scm test/highlight/**/* Do you know how github/semantic fits into the picture? It seems necessary to add Elixir bindings there, but having another look I see that tag matching does the same as the corresponding tag query, so I'm wondering how these two relate. |
queries/tags.scm
Outdated
right: (identifier) @name) @reference.call | ||
|
||
; * modules | ||
(alias) @name @reference.call |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we intentionally use @reference.call
for all references or could we use @reference.module
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly the Python queries have neither of those for classes/modules and yet it works 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! Yep I tried out @reference.module
initially but I thought it was not working. I just gave it another try and the above output changes to:
Mint.MixProject | module def (0, 10) - (0, 25) `defmodule Mint.MixProject do`
Mix.Project | module ref (1, 6) - (1, 17) `use Mix.Project`
project | function def (6, 6) - (6, 13) `def project do`
Mix | module ref (11, 23) - (11, 26) `start_permanent: Mix.env() == :prod,`
env | call ref (11, 27) - (11, 30) `start_permanent: Mix.env() == :prod,`
elixirc_paths | call ref (12, 21) - (12, 34) `elixirc_paths: elixirc_paths(Mix.env()),`
Mix | module ref (12, 35) - (12, 38) `elixirc_paths: elixirc_paths(Mix.env()),`
env | call ref (12, 39) - (12, 42) `elixirc_paths: elixirc_paths(Mix.env()),`
deps | call ref (13, 12) - (13, 16) `deps: deps(),`
package | call ref (29, 15) - (29, 22) `package: package(),`
application | function def (46, 6) - (46, 17) `def application do`
Mint.Application | module ref (49, 12) - (49, 28) `mod: {Mint.Application, []}`
package | function def (53, 7) - (53, 14) `defp package do`
elixirc_paths | function def (60, 7) - (60, 20) `defp elixirc_paths(:test), do: ["lib", "test/support"]`
elixirc_paths | function def (61, 7) - (61, 20) `defp elixirc_paths(_env), do: ["lib"]`
deps | function def (64, 7) - (64, 11) `defp deps do`
which looks happy.
I see in some other queries now that there is a little variance like in these c-sharp tags.scm: https://github.com/tree-sitter/tree-sitter-c-sharp/blob/3104df21065af0f3d51e05a96cd0e2ff16a6f982/queries/tags.scm, which also makes me think that what is currently being tagged as @definition.implementation
should actually be @reference.interface
for code like
defimpl Jason.Encoder, for: List do
On the other hand, the ruby query has this line:
[(identifier) (constant)] @name @reference.call
which tags modules/classes as @reference.call
s though and I've seen module references work in ruby code (for example Appsignal), so I'm not sure what the right move is. Maybe all definitions/references are pooled together when performing the search? Or maybe it just helps with ranking?
I think I'll update this line and the @reference.interface
as well 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, (constant)
is the inner node for both class and module, I missed initially!
If the Jason.Encoder
alias appears anywhere else we won't know if it's a module or a protocol. I think we should just use module
for protocols as well (which is actually true), unless I'm missing some benefits of tagging them separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah yeah I think that makes sense, I could see things being ambiguous if you do a @derive {Jason.Encoder, ..}
and search doesn't match up that @reference.module
with the @definition.interface
I'll cut out all that interface stuff in preference of modules 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, definition.function
is not consistent with reference.call
(as opposed to reference.function
), but we're probably missing some information here, so let's wait ^^
Based on this documentation, I think semantic is not in the navigation picture, it seems to all be driven by tree-sitter and tags.scm, but we'll need a GitHubber to weigh in to be sure |
Oh wow, maybe the "Used for code navigation on github.com" in Let's wait for some insights then :) |
@the-mikedavis @jonatanklosko Correct: semantic is no longer part of the tags pipeline (we switched to I’ll take this patch for a spin locally with our tagging backend. I can’t make guarantees about if or when it’ll land—we have to figure out, product/documentation-wise, how to pull in this functionality without making it part of our core supported languages. But I’m super excited about this work; thank you all! |
I can confirm that this patch works for our internal code-nav systems. Again, I can’t make promises as to when it will land—but this patch should be merged, in my opinion! |
@patrickt fantastic news, thanks a lot! Fingers crossed for the integration to happen :D @the-mikedavis awesome job! |
Elixir code navigation is now active on GitHub! |
@patrickt fantastic, thanks again for all the integration work! @the-mikedavis beautiful job with the tags 🐱 |
tags.scm
powers GitHub code navigation for languages that use tree-sitters.For example with ruby...
You can see class/method declarations:
And see the references (where those classes/methods end up being used):
These queries ended up taking inspiration from other
tags.scm
in the@tree-sitter
org repos, mostly python and ruby.I don't think there's a framework in place for testing these queries, but I think it wouldn't be too much trouble to write a small shell script that diffs the output ofWe should be able to use tree-sitter#1547 to test these queries if/when merged.tree-sitter tags fixture.ex
against a known solution for some fixtures. There is the ability to test queries like so:tree-sitter query --test query/tags.scm test/query/fixture.ex
(added intree-sitter#775
) but the precedence rules seem to be backward between querying and determining tags (e.g. module/function definitions end up as@reference.call
s in the query output but are correctly identified bytree-sitter tags
). I can look at making a PR to tree-sitter that adds a flag for testingtree-sitter tags
or maybe somehow reverses precedence intree-sitter query
.A sample output of
tree-sitter tags path/to/mint/mix.exs
with this Mint mix.exs:So the queries tag function/module definitions and usages of functions/modules. It's notable that modules are tagged as
@reference.call
when not beingdefmodule
d: this lines up with the behavior of the ruby queries and allows you to jump to definitions of modules, it's not saying that those modules are function invocations. These queries aren't smart enough to figure out some language semantics like resolvingalias
ed modules. For "precise navigation" which respects language semantics, we'd need to write some stack-graph construction queries (see this blog post), but stack-graphs are very new and it may be wise to wait until they're matured. Also the python queries aren't open-sourced afaict so I haven't really figured out the tricks yet 😅. Once I figure all that out, I'll try sending in some stack-graphs queries :)@patrickt is this something I can trouble you with setting up if/once it's merged in (and maybe lend some review eyes)? I appreciate all your help with the integration stuff :)