Improved Macro and Macro.Env functions for language servers #13361

Closed
4 of 5 tasks
josevalim opened this issue Feb 22, 2024 · 11 comments

Comments

@josevalim
Member

josevalim commented Feb 22, 2024

In order for language servers to mimic functionality found in the Elixir compiler, we want to expose more compiler facing functionality (as per #12645 (comment)).

There are at least 5 functions necessary:

  1. Macro expansion - for non-qualified calls, we need to consider imports or local macros. For remote calls, the module must have been required beforehand. Currently Macro.expand/2 can perform this, but it will also perform tracing and additional functionality, which may not be desired. Language servers may also want to add special annotations to the AST so they can distinguish generated nodes from non-generated ones. We still need to explore which approach is best here: a function in Macro or one in Macro.Env.

  2. Reading and writing variables - whenever a variable is defined, we must store it in Macro.Env. Currently there is no API for adding variables to Macro.Env, only for reading them. Variables may be defined anywhere in a pattern, so we could add a function that traverses a pattern and updates Macro.Env with all of its variables. However, language servers most likely need to traverse patterns themselves, to expand macros and collect variable definitions, so providing a function such as Macro.Env.put_var/2 should be enough. A related topic is the variable context: for a variable of shape {name, meta, context}, the actual context is Keyword.get(meta, :counter, context). Today Macro.Env already expects the actual context, but we may want to encapsulate that. Another related topic is the variable version: Macro.Env does not have a field that lets us properly version variables, and we may need to change that (as versions will likely play a role in the type system).

  3. Requires - requires are the simplest to implement. Macro.Env.store_require(env, meta, module) should be all that is necessary. The meta is required to handle the :defined annotation. Returns {:ok, env} or {:error, reason}. We use store_ instead of put_ since it returns ok/error.

  4. Aliases - aliases are relatively straightforward too. Two functions will be made available: Macro.Env.store_alias(env, meta, module), where the alias is inferred, and Macro.Env.store_alias(env, meta, module, alias). Both return {:ok, env} or {:error, reason}.

  5. Imports - imports are done as Macro.Env.store_import(env, meta, module, opts \\ []). Returns {:ok, env} or {:error, reason}.
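For readers trying this out today, here is a minimal sketch of items 3 to 5. It assumes Elixir 1.17 or later, where, as far as I can tell, the functions shipped under the names `Macro.Env.define_require/4`, `Macro.Env.define_alias/4` and `Macro.Env.define_import/4` rather than the `store_*` names proposed above. `Foo.Bar` and `Baz` are made-up names used only for illustration.

```elixir
# Run as a script, for example `elixir sketch.exs` (assumes Elixir >= 1.17).
env = __ENV__
meta = [line: 1]

# `trace: false` disables compiler tracers and the lexical tracker,
# which is what a tool that only inspects code usually wants.
{:ok, env} = Macro.Env.define_require(env, meta, Integer, trace: false)

# Foo.Bar is a hypothetical module; it is aliased explicitly as Baz.
{:ok, env} = Macro.Env.define_alias(env, meta, Foo.Bar, as: Baz, trace: false)

# Import only List.flatten/1 into the environment.
{:ok, env} = Macro.Env.define_import(env, meta, List, only: [flatten: 1], trace: false)

required = Macro.Env.required?(env, Integer)
expanded = Macro.Env.expand_alias(env, [], [:Baz], trace: false)
imported = Macro.Env.lookup_import(env, {:flatten, 1})

IO.inspect({required, expanded, imported})
```

Each call returns an updated environment, so a language server can thread `env` through its own traversal of the buffer.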

@mhanberg
Contributor

I can work on this, unless you planned on doing it yourself.

Thanks for putting this together 💪.

@mhanberg
Contributor

Also, we might need a function for module attributes.

@josevalim
Member Author

Module attributes are not part of the compiler. They are handled by the @ macro, which mostly uses public APIs from Module that you can leverage. The challenge will be exactly in compiling a module just enough so functionality like module attributes works, while storing AST information. One potential idea is to execute the code as usual, but replace some of the Kernel macros by LanguageServer.Kernel macros, which store additional information, and then fall back to @.

```elixir
defmodule LS.Kernel do
  defmacro defmodule(name, do: block) do
    # augmented version of defmodule
    # for example, you can augment the block by adding
    # some lines that fetch relevant information from the
    # module and then raises (so the module is not effectively
    # defined).
  end

  defmacro @(expr) do
    # preprocess and store relevant information
    quote do
      Kernel.@(unquote(expr))
    end
  end
end
```

The best way to collect this information is definitely open to debate and we may need new functionality. @lukaszsamson, how does elixir_sense handle this? Does it rely exclusively on pre-traversal of the AST? Or does it still execute the module body?

@josevalim
Member Author

> I can work on this, unless you planned on doing it yourself.

Thank you for the offer. I'd like to tackle this one, I will use it as an opportunity to refactor parts of the compiler. I will put it as high priority on my list. :)

@lukaszsamson
Contributor

> One potential idea is to execute the code as usual, but replace some of the Kernel macros by LanguageServer.Kernel macros, which store additional information

That's a neat idea. Two questions here. First: where should those augmented macros reside, in the Elixir stdlib or in LSPs?

> how does elixir_sense handle this? Does it rely exclusively on pre-traversal of the AST? Or does it still execute the module body?

elixir_sense relies on AST traversal. It opens scopes on defmodule, def, etc. calls in the pre callback and closes them in the post callback.
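To make the pre/post idea concrete, here is a small sketch built on `Macro.traverse/4`. It is not elixir_sense's actual code: `ScopeSketch` is a hypothetical module that only tracks `defmodule` scopes, and it only handles module names written as plain aliases.

```elixir
defmodule ScopeSketch do
  # Hypothetical helper: returns the fully qualified name of every module
  # defined in the AST, using a stack of open scopes.
  def modules(ast) do
    {_ast, %{found: found}} =
      Macro.traverse(ast, %{stack: [], found: []}, &pre/2, &post/2)

    Enum.reverse(found)
  end

  # Pre callback: entering a defmodule opens a scope.
  defp pre({:defmodule, _meta, [{:__aliases__, _, parts} | _]} = node, acc) do
    stack = [parts | acc.stack]
    name = stack |> Enum.reverse() |> List.flatten() |> Module.concat()
    {node, %{acc | stack: stack, found: [name | acc.found]}}
  end

  defp pre(node, acc), do: {node, acc}

  # Post callback: leaving the same defmodule closes the scope.
  defp post({:defmodule, _meta, [{:__aliases__, _, _} | _]} = node, acc) do
    {node, %{acc | stack: tl(acc.stack)}}
  end

  defp post(node, acc), do: {node, acc}
end

ast =
  Code.string_to_quoted!("""
  defmodule A do
    defmodule B do
    end
  end

  defmodule C do
  end
  """)

found = ScopeSketch.modules(ast)
IO.inspect(found)
```

Nothing in the buffer is executed here, which is why this style avoids the "already compiled" class of errors but cannot see anything produced by macros.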

The only place where some code is actually executed is in the use macro expander, and that currently introduces as many problems as it solves. It ties AST traversal to the compilation of required modules, which leads to errors like:

(ArgumentError) could not call Module.get_attribute/2 because the module X is already compiled

@josevalim
Member Author

josevalim commented Feb 22, 2024

> That's a neat idea. Two questions here. First: where should those augmented macros reside, in the Elixir stdlib or in LSPs?

LSPs.

> The only place where some code is actually executed is in the use macro expander, and that currently introduces as many problems as it solves. It ties AST traversal to the compilation of required modules, which leads to errors like

If you override @, you can implement your own reader and writer, and then they would work on use too.


Another option is to do something akin to this:

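```elixir
try do
  Process.put(:elixir_module_ast_to_expand, module_body)

  # Module.create/3 needs a third argument with the location (or a Macro.Env)
  Module.create(
    name,
    quote do
      LSP.expand_module()
    end,
    Macro.Env.location(__ENV__)
  )
catch
  :done ->
    Process.delete(:elixir_module_ast_to_expand)
    :ok
end
```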

where LSP.expand_module() will be something like this:

```elixir
def expand_module() do
  body = Process.delete(:elixir_module_ast_to_expand)
  # expand body as usual
  throw(:done)
end
```

The issue with this approach, however, is that if the module is being compiled at the same time, you will get errors. So you may need to add locks around compilation (which can slow everything down). Maybe we can add a feature to Elixir to allow some sort of module preview without conflicts. I will explore this a bit.

@lukaszsamson
Contributor

Second question related to the first approach. How would that monkey patching work with already compiled modules? Let's say we have a module with use Ecto.Schema and Ecto.Schema is already compiled. How would we make the macros called by schema dispatch to the overloaded ones?

Let's envision next steps as well. Suppose I wanted to build a dedicated LSP for phoenix. I'd need to override Router macros like scope et al.

Maybe a generic tracer that intercepts macro expansion would be an alternative. With API like

def on_macro_expand(ast, env, ast_expanded)

@josevalim
Member Author

josevalim commented Feb 22, 2024

You are right. The overrides would only work for the immediate macros, so it only has limited use. We should probably scratch it for now.

> Maybe a generic tracer that intercepts macro expansion would be an alternative. With API like

Unfortunately this can be used to introduce global modification of Elixir programs, so it is a no-no. For now, let's assume we will continue to perform manual expansion, but let's provide Module.draft(name, fn -> ... end) so you can mirror more module functionality, such as module attributes. WDYT?

@josevalim
Member Author

Hi everyone, I have implemented almost all of the functionality here, except for variable handling. I will postpone that until we have more use cases, especially because variable handling may change in the future as we better integrate with types.

I have also expanded the scope of this feature: I believe it can also be used by those who want to build their own languages on top of Elixir. For example, Nx could be built on top of it instead of relying on the double-compile step it does today. To do this, however, we would need the variable handling AND an API to raise compiler errors with diagnostics (we already have one to emit warnings, via IO.warn/2).

This week I want to build a mini-compiler, which you can use to build the buffer environment and leave it here as a proof of concept. Then I will close this issue. :)

@mhanberg
Contributor

Amazing! I will try this out as soon as possible.

@josevalim
Member Author

Here we go, this is a proof of concept showing how you can implement a mini-compiler for either Elixir (or even a sub-language) using the building blocks above: https://gist.github.com/josevalim/3007fdbc5d56d79f15adedf7821620f3

The example is focused on the language server use case and it has a lot of comments. There is a state variable which you can use to capture any AST information that you want. And it shows how you can intercept some macros (such as defmodule). You folks also know how to reach out to me if you have questions. Enjoy!

/cc @jonatanklosko @jackalcooper
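For a feel of the overall shape without opening the gist, here is a much smaller sketch in the same spirit. It is not the code from the gist: `MiniExpander` is a hypothetical module that threads a state map and a `Macro.Env` through top-level expressions, intercepts `alias`, and records which module each remote call resolves to. It assumes Elixir 1.17 or later for `Macro.Env.define_alias/4` and `Macro.Env.expand_alias/4`.

```elixir
defmodule MiniExpander do
  # Hypothetical sketch: returns {module, function, arity} for each
  # top-level remote call, resolving aliases defined earlier in the buffer.
  def run(ast, env) do
    {state, _env} = expand(ast, %{remote_calls: []}, env)
    Enum.reverse(state.remote_calls)
  end

  # A block threads state and env through its expressions in order.
  defp expand({:__block__, _meta, exprs}, state, env) do
    Enum.reduce(exprs, {state, env}, fn expr, {state, env} ->
      expand(expr, state, env)
    end)
  end

  # `alias Foo.Bar` and `alias Foo.Bar, as: B` update the environment.
  defp expand({:alias, meta, [{:__aliases__, _, parts} | rest]}, state, env) do
    opts =
      case rest do
        [[as: {:__aliases__, _, as_parts}]] -> [as: Module.concat(as_parts)]
        [] -> []
      end

    {:ok, env} =
      Macro.Env.define_alias(env, meta, Module.concat(parts), opts ++ [trace: false])

    {state, env}
  end

  # Remote calls such as `B.run(1)` are recorded in the state.
  # Arguments are not traversed in this sketch.
  defp expand({{:., _, [{:__aliases__, meta, parts}, fun]}, _, args}, state, env) do
    module =
      case Macro.Env.expand_alias(env, meta, parts, trace: false) do
        {:alias, module} -> module
        :error -> Module.concat(parts)
      end

    call = {module, fun, length(args)}
    {%{state | remote_calls: [call | state.remote_calls]}, env}
  end

  # Everything else is ignored.
  defp expand(_other, state, env), do: {state, env}
end

ast =
  Code.string_to_quoted!("""
  alias Foo.Bar, as: B
  B.run(1)
  String.upcase("x")
  """)

calls = MiniExpander.run(ast, __ENV__)
IO.inspect(calls)
```

The state map is where a language server would accumulate whatever it needs (definitions, references, diagnostics), while the environment carries the compiler-facing context forward.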
