Ensure get_range works for all syntax nodes parsed using Sourceror.parse_string #104
The Elixir tree-sitter implementation's corpus is likely a better representation of "all valid syntax" than Sourceror's source. It's licensed under Apache 2.0 and MIT (some parts of the corpus are MIT according to the NOTICE in the repo root), so it would certainly be possible to vendor it in.
Ready for review!
Thanks for all the fixes! The PR looks good to merge to me 👍
# This range currently ends on column 5, though it should be column 6,
# and appears to be a limitation of the parser, which does not include
# any metadata about the parens. That is, this currently holds:
#
#     Sourceror.parse_string!("& &1") == Sourceror.parse_string!("&(&1)")
#
# assert to_range(~S"&(&1)") == %{
#   start: [line: 1, column: 1],
#   end: [line: 1, column: 6]
# }
Maybe we can propose a change to the Elixir parser? Something similar used to happen for qualified module aliases like
Foo
.
Bar
which is a super contrived example but still valid. We added the :last metadata to allow range calculation on them.
Good idea! I can post an issue.
I think this may actually just be a bug. If you look at the underlying AST, & is just a call and should have a :closing metadata when parens are present:
iex(1)> Code.string_to_quoted!("foo(bar)", token_metadata: true)
{:foo, [closing: [line: 1], line: 1], [{:bar, [line: 1], nil}]}
iex(2)> Code.string_to_quoted!("foo bar", token_metadata: true)
{:foo, [line: 1], [{:bar, [line: 1], nil}]}
iex(3)> Code.string_to_quoted!("&(&1)", token_metadata: true)
{:&, [line: 1], [{:&, [line: 1], [1]}]}
iex(4)> Code.string_to_quoted!("& &1", token_metadata: true)
{:&, [line: 1], [{:&, [line: 1], [1]}]}
test "should not raise on any three-element tuple parsed by parse_string" do
  for relative_path <- Path.wildcard("lib/*/**.ex") do
    assert_can_get_ranges(relative_path)
  end
end
This is a cool trick, and while it works for now, it might be annoying for future PRs that unintentionally add code that breaks range calculation.
I think your idea of bringing the examples from the elixir tree-sitter corpus sounds good, so we can add that in another PR 👍
> I think your idea of bringing the examples from the elixir tree-sitter corpus sounds good, so we can add that in another PR 👍
Sounds good! So just to be clear: leave this for now, and then we can strip it out when we pull in the better corpus of syntax examples?
Yup!
  Sourceror.get_range(quoted)
rescue
  e ->
    flunk("""
TIL about ExUnit.Assertions.flunk/1, this is cool!
Fixes #103 and some other issues I found.
The goal here is that any syntax node parsed by Sourceror.parse_string should be passable to get_range without raising. I had the idea that we could walk the Sourceror codebase, call get_range on every syntax node, and ensure that it doesn't raise -- worked great and found a handful of bugs!
Summary of fixes:
'foo', 'foo#{bar}'
fn -> :ok end
&1
{:., _, args}
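For readers who want to try the "walk every node and make sure nothing raises" idea outside the test suite, here is a minimal sketch. It is not the helper from this PR: the module name `RangeCheck` and the function `failing_nodes/2` are hypothetical, and it parses with the standard library's `Code.string_to_quoted!/2` so that it runs without any dependency. In the PR itself the parser would be `Sourceror.parse_string!/1` and the function under test would be `Sourceror.get_range/1`.

```elixir
defmodule RangeCheck do
  # Hypothetical helper (not part of Sourceror): returns a list of
  # {node, exception} pairs for every three-element tuple node on which
  # `range_fun` raises. An empty list means no node raised.
  def failing_nodes(source, range_fun)
      when is_binary(source) and is_function(range_fun, 1) do
    # Assumption: the stdlib parser stands in for Sourceror.parse_string!/1.
    quoted = Code.string_to_quoted!(source, columns: true, token_metadata: true)

    {_ast, failures} =
      Macro.prewalk(quoted, [], fn
        # Only three-element tuples are treated as syntax nodes here.
        {_form, _meta, _args} = node, acc ->
          try do
            range_fun.(node)
            {node, acc}
          rescue
            error -> {node, [{node, error} | acc]}
          end

        # Literals, lists and two-element tuples are passed through untouched.
        other, acc ->
          {other, acc}
      end)

    Enum.reverse(failures)
  end
end
```

With Sourceror available, something like `RangeCheck.failing_nodes(File.read!(path), &Sourceror.get_range/1)` could be called per file, and a non-empty result passed to `flunk/1`. Collecting failures instead of stopping at the first one makes it easier to see every offending node in a single run.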