Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code: :closing metadata missing from unary operators using parens #12919

Closed
zachallaun opened this issue Sep 10, 2023 · 15 comments
Closed

Code: :closing metadata missing from unary operators using parens #12919

zachallaun opened this issue Sep 10, 2023 · 15 comments

Comments

@zachallaun
Copy link
Contributor

Elixir and Erlang/OTP versions

Latest main

Erlang/OTP 26 [erts-14.0.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [jit:ns]

Elixir 1.16.0-dev (0d139c1) (compiled with Erlang/OTP 26)

Operating system

Ubuntu 22.04

Current behavior

iex(1)> Code.string_to_quoted!("foo bar", token_metadata: true, columns: true)
{:foo, [line: 1, column: 1], [{:bar, [line: 1, column: 5], nil}]}

iex(2)> Code.string_to_quoted!("foo(bar)", token_metadata: true, columns: true)
{:foo, [closing: [line: 1, column: 8], line: 1, column: 1],
 [{:bar, [line: 1, column: 5], nil}]}

iex(3)> Code.string_to_quoted!("& &1", token_metadata: true, columns: true)
{:&, [line: 1, column: 1], [{:&, [line: 1, column: 3], [1]}]}

iex(4)> Code.string_to_quoted!("&(&1)", token_metadata: true, columns: true)
{:&, [line: 1, column: 1], [{:&, [line: 1, column: 3], [1]}]}

Expected behavior

When token_metadata: true, columns: true is present, the :& node should contain :closing metadata:

iex> Code.string_to_quoted!("&(&1)", token_metadata: true, columns: true)
{:&, [closing: [line: 1, column: 5], line: 1, column: 1], [{:&, [line: 1, column: 3], [1]}]}
@zachallaun
Copy link
Contributor Author

As I began investigating a "fix", I realized that this is more to do with & being a unary operator:

iex> Code.string_to_quoted!("-(1 + 1)", token_metadata: true, columns: true)
{:-, [line: 1, column: 1], [{:+, [line: 1, column: 5], [1, 1]}]}

@zachallaun zachallaun changed the title :closing metadata missing from captures with parens (&(&1)) :closing metadata missing from unary operators using parens Sep 10, 2023
@zachallaun zachallaun changed the title :closing metadata missing from unary operators using parens Code: :closing metadata missing from unary operators using parens Sep 10, 2023
@josevalim
Copy link
Member

You are right it is because it is an op, but we can go ahead and add this information either to capture or to all unary ops. Whatever seems more achievable. Feel free to ask if you need help :)

@zachallaun
Copy link
Contributor Author

Sounds good! Looking at the parser, it looks like it should be just as easy to add for all unary ops as for capture in particular, so I'll go that route. :)

@zachallaun
Copy link
Contributor Author

Reading the parser on my phone and just wanted to confirm I'm on the right track:

Planning to change this and the following 2 lines from:

matched_expr -> unary_op_eol matched_expr : build_unary_op('$1', '$2').

%% change to the form:

matched_expr -> unary_op_eol open_paren matched_expr close_paren : build_unary_op('$1', '$3', '$4')

and then add a build_unary_op/3 that, if token_metadata is set, extracts the closing meta from the end token.

Sound about right?

@zachallaun
Copy link
Contributor Author

zachallaun commented Sep 11, 2023

After trying this (and a handful of other variations), it does not seem to do the trick 😅

diff --git a/lib/elixir/src/elixir_parser.yrl b/lib/elixir/src/elixir_parser.yrl
index d5341acd6..bf9add5d5 100644
--- a/lib/elixir/src/elixir_parser.yrl
+++ b/lib/elixir/src/elixir_parser.yrl
@@ -142,8 +142,11 @@ expr -> unmatched_expr : '$1'.
 %% if calls without parentheses are do blocks in particular
 %% segments and act accordingly.
 matched_expr -> matched_expr matched_op_expr : build_op('$1', '$2').
+matched_expr -> unary_op_eol open_paren expr close_paren : build_unary_op('$1', '$3', '$4').
 matched_expr -> unary_op_eol matched_expr : build_unary_op('$1', '$2').
+matched_expr -> at_op_eol open_paren expr close_paren : build_unary_op('$1', '$3', '$4').
 matched_expr -> at_op_eol matched_expr : build_unary_op('$1', '$2').
+matched_expr -> capture_op_eol open_paren expr close_paren : build_unary_op('$1', '$3', '$4').
 matched_expr -> capture_op_eol matched_expr : build_unary_op('$1', '$2').
 matched_expr -> no_parens_one_expr : '$1'.
 matched_expr -> no_parens_zero_expr : '$1'.
@@ -757,6 +760,9 @@ build_unary_op({_Kind, {Line, Column, _}, '//'}, Expr) ->
 build_unary_op({_Kind, Location, Op}, Expr) ->
   {Op, meta_from_location(Location), [Expr]}.
 
+build_unary_op({_Kind, _Location, Op} = Begin, Expr, End) ->
+  {Op, meta_from_token_with_closing(Begin, End), [Expr]}.
+
 build_nullary_op({_Kind, Location, Op}) ->
   {Op, meta_from_location(Location), []}.

Any guidance would be greatly appreciated!

@josevalim
Copy link
Member

Yeah, the precedence is going to handle parens earlier. Unfortunately it seems there is no easy way to capture the parens here. :( Something else we could try is adding a new string_to_quoted option called unwrap_single_expression_blocks: false. That's because &(1; 2) is actually &__block__([1, 2]). Therefore, &(1) is also meant to be a block, but we unwrap it as it has a single expression. Thoughts?

@zachallaun
Copy link
Contributor Author

(Ping @doorgan who may also have some thoughts on this!)

Just want to make sure I understand your suggestion @josevalim -- so is the idea is that this new option would do the following:

iex> Code.string_to_quoted!("&(1)",
...>   token_metadata: true,
...>   columns: true,
...>   unwrap_single_expression_blocks: false
...> )
{:&,
 [line: 1, column: 1],
 [
   {:__block__,
    [closing: [line: 1, column: 4], line: 1, column: 3],
    [1]}
 ]}

If so, am I correct that, in this particular case, you'd get the same result if you did:

Code.string_to_quoted!("&(1)",
  token_metadata: true,
  columns: true,
  literal_encoder: &{:ok, {:__block__, &2, [&1]}}
)

because the actual underlying change is to add paren closing metadata when applicable to blocks, and that's the metadata that is passed in to the literal encoder?

@josevalim
Copy link
Member

josevalim commented Sep 11, 2023

Correct, in this case it ends up being the same but not if you do &(x). And if you use both literal encoder and unwrap=false, you may see nested blocks.

@doorgan
Copy link
Contributor

doorgan commented Sep 13, 2023

Instead of adding an option to prevent unwrapping blocks, what is it that prevents us from moving the block metadata to the capture op node on unwrapping? It seems to me that we'll need to update the unwrapping code anyways so maybe we can avoid a new option?

@josevalim
Copy link
Member

That would be our first goal (and we would do it automatically) but the way the parser works makes it impossible without a potential large refactoring.

@josevalim
Copy link
Member

@doorgan so is the new option we suggest here enough or should we close this for now?

@doorgan
Copy link
Contributor

doorgan commented Sep 20, 2023

@josevalim the new option would be enough!

@josevalim
Copy link
Member

Fantastic. @zachallaun, please let me know if you have any question. It probably makes sense to validate the implementation from the sourceror side too. :) Thank you all.

@josevalim
Copy link
Member

I will go ahead and close this for now, but I will be very glad to accept a PR if any of you believe this is useful. :)

@zachallaun
Copy link
Contributor Author

Thanks @josevalim! Somehow I missed your comment from 3 weeks ago -- apologies! Agreed that it's reasonable to close this for now, and hopefully I can get around to implementing soon and will open a PR then. 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants