Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[merlin] Incoming changes to dune ocaml-merlin #4217

Closed
voodoos opened this issue Feb 12, 2021 · 7 comments
Closed

[merlin] Incoming changes to dune ocaml-merlin #4217

voodoos opened this issue Feb 12, 2021 · 7 comments
Milestone

Comments

@voodoos
Copy link
Collaborator

voodoos commented Feb 12, 2021

The disposal of .merlin based Merlin configuration was a major change in Dune 2.8 and issues appeared in some untested setups. This issue summarize these problems with known workarounds and candidate solutions. Most of these should be fixed for Dune 2.9.

✅ 1. Dune and Merlin disagree on paths when workspace is in a symlinked folder

  • Workaround: cd to the "real" directory before starting hacking.
  • Solution: exchange paths relative to the CWD of dune ocaml-merlin
    instead of absolute paths.

✅ 2. On Windows Dune and Merlin disagree with how absolute path should look like

Which case ? forward or backward slashes ? etc. That means upgrading to Dune 2.8
breaks Windows setup using both Merlin and Ocaml-LSP

  • Workaround: No easy workaround
  • Solution: on Windows (and Cygwin) dune ocaml-merlin should lowercase
    the paths before comparing them. Also using relatives paths as for issue 1
    should make the process more robust.

☑️ 4. Now that each module has a precise configuration, files must bear a module name

[ocaml/merlin#1262]

Files with an ill-formed module name like mymodule.foo.ml do not have any
configuration loaded. Previously Merlin would always load the generic
configuration for the whole directory, including for source files which does not
reflect a valid module name.

  • Workaround: write a .merlin file by hand. This can be a daunghting
    task for massive libraries/executables.
  • Possible solutions:
    1. Keep track of Dune copy rules to know the module name (I don't know if that is possible) ?
    2. ✅ Decide of a file naming convention for that case: module_name.xxxxx.ml ?
    3. Add a command line parameter for dune ocaml-merlin to dump a generic
      .merlin file (with commented Preprocessing instructions).
    4. Always generate an additional generic configuration that is given to
      Merlin if the filename does not match any known module.

✅ 5. Merlin files

[#4209; #4169]

I feel like in any case we should have a possibility to have .merlin back
for users with non-standard setups by either:

  1. Giving the ability to extract .merlin files once by using a special dune ocaml-merlin --dump-merlin-files parameter.
  2. Or adding an option to the dune-project file to enable approximate .merlin
    files promotion as before.

CC @rgrinberg @jeremiedimino @emillon @trefis

For item 4 (not-a-module-name) I am in favour of solution iv and for item 5 (merlin-files) I am in favor of solution 1, dumping a file with merged configurations (the same as the generic configuration that would be generated for 4.iv) and commented preprocessing directives.

@ghost
Copy link

ghost commented Feb 15, 2021

Nice summary!

For 1 and 2, using relative paths seems like the way to go.

For the other problems, do we know why the current system is not enough for cases reported by users? Falling back to the old style merlin files is one option, but maybe there are a couple of simple general features that could be added to Dune, would benefit everyone and would avoid the need for approximate merlin files.

@trefis
Copy link
Collaborator

trefis commented Feb 15, 2021

Regarding section 1 & 2: 👍

Regarding section 4: I don't think I understand what's going on here. Do you know how this happens / can we get a reproduction case?

Regarding section 5:

Regarding section 3: where is it?

@voodoos
Copy link
Collaborator Author

voodoos commented Feb 15, 2021

About 4:
Dune generates one configuration per module. The problem is that the detection of "what is the correct configuration for this file" is based on the filename which is assumed to be a valid module name. This can break if:

  • the filename does not reflect the module name like it was the case for example with merlin's destruct.old.ml and destruct.new.ml that were selected (and copied via a rule) at build time.
  • it may also break for top-level modules which are not required to be in a file having a valid module name but I did not test it yet.

About 5:
Yes, #4169 is not related to the recent changes. It is an issue that appeared because of the now "exact" preprocessing specification in the merlin configurations. And it should probably be solved in Dune or Merlin (or maybe simply the user's project). However one workaround for that user could be to write custom .merlin files. And I believe many edge cases can benefits from custom-written merlin file (which are not deleted by Dune >=2.8).

However writing a custom Merlin file from scratch in big projects can be a tedious task. My idea was to provide a simple command that would dump a generic merlin file with all B and S directives but with pp-related directive commented. This file could then be tweaked by users to fit their specific needs.

@trefis
Copy link
Collaborator

trefis commented Feb 15, 2021

Understood, thanks for the explanation.

@voodoos
Copy link
Collaborator Author

voodoos commented Feb 18, 2021

I investigated item 4 and I would like some input before committing to a solution:

☑️ 4. Now that each module has a precise configuration, files must bear a module name

[ocaml/merlin#1262]

Files with an ill-formed module name like mymodule.foo.ml do not have any
configuration loaded. Previously Merlin would always load the generic
configuration for the whole directory, including for source files which does not
reflect a valid module name.

Workaround: write a .merlin file by hand. This can be a daunghting
task for massive libraries/executables.

First, a concrete example can be found here.
In this project an executable depends on a file namelib.ml which is a target of a rule that depends on another sourcefile named namelib.c.ml.

The issue is that Dune generates a configuration for the module Namelib but ignores its relation with namelib.c.ml.
When the user edits the file namelib.c.ml Merlin ask for the corresponding configuration and none is found.

Before, in the .merlin times, the configuration was folder-based and thus always loaded.

I gave more thoughts to the possible solutions:

Possible solutions:

  1. Keep track of Dune rules to know the module name (I don't know if that is possible) ?

This is the most accurate of the solutions, but also the more complex one: we could scan for Rules stanzas and look at their targets and dependencies to mark them as all being related to the same Merlin configuration. However targets and dependencies are not always ml or mli files and can make use of variables so it is probably a bit tricky to get right.

  1. Decide of a file naming convention for that case: module_name.xxxxx.ml ?

This is the simplest of the solutions, it would work in a lot of cases, and we can document that choice. When asked for the configuration for file module_name.xxxxx.ml dune ocaml-merlin would answer with the configuration for Module_name.
In my opinion this is the most pragmatic solution, I am in favor of it.

  1. Add a command line parameter for dune ocaml-merlin to dump a generic
    .merlin file (with commented Preprocessing instructions).

A bad way to solve this issue, but an useful escape hatch for non-standard situations, already in PR#4250.

  1. Always generate an additional generic configuration that is given to
    Merlin if the filename does not match any known module.

We could reuse the merging mechanism in the dump-dot-merlin PR#4250 and serve this (often approximated) configuration when nothing else is available. This does not feel right after all the effort we made to have non-approximate configurations.

CC @jeremiedimino @emillon @rgrinberg @trefis

@ghost
Copy link

ghost commented Feb 22, 2021

First, a concrete example can be found here.

I was looking at this example. Here we have a ppx rewriter (ppx_cstubs) that generate both a .ml and .c file, which is not supported by Dune. So to handle this, the project calls the ppx rewriter manually, but also uses (preprocess (ppx ppx_cstubs.merlin)), which I assume is a dummy ppx rewriter that rewrites some extension points just enough to get the file to type.

It feels like this could be done differently. For instance, instead of having one rule to produce both the .ml and .c file and one ppx to get the file to type without doing the actual rewriting, one could have:

  • one normal ppx that produces the .ml file but not the .c file
  • one rule that produces just the .c file

That would fit better in the system with just as much boilerplate in the dune file. We could talk about such cases in the doc.

This is the most accurate of the solutions, but also the more complex one: we could scan for Rules stanzas and look at their targets and dependencies to mark them as all being related to the same Merlin configuration. However targets and dependencies are not always ml or mli files and can make use of variables so it is probably a bit tricky to get right.

What configuration would Dune give to merlin in such a cases? It seems that Dune could pass on the compilation flags and include directories, but what about the preprocessing directive? If the user is using a hand-written rule, it's quite possible that the original source file has syntaxes that merlin won't understand. Can merlin do something in this case? The file might even not parse with the regular OCaml parser.

This is the simplest of the solutions, it would work in a lot of cases, and we can document that choice. When asked for the configuration for file module_name.xxxxx.ml dune ocaml-merlin would answer with the configuration for Module_name.
In my opinion this is the most pragmatic solution, I am in favor of it.

Agreed. But same remark as above.

A bad way to solve this issue, but an useful escape hatch for non-standard situations, already in PR#4250.

Seems reasonable. Commenting preprocessing instructions feels a bit ad-hoc though. If the user resorts to manually editing the merlin file, I would let them comment out the preprocessing instructions.

We could reuse the merging mechanism in the dump-dot-merlin PR#4250 and serve this (often approximated) configuration when nothing else is available. This does not feel right after all the effort we made to have non-approximate configurations.

Agreed.

@voodoos
Copy link
Collaborator Author

voodoos commented Feb 22, 2021

For the record, solution 2 has been implemented. Of course it does not gives strong guarantees that the configuration is correct especially regarding PP flags. This is something that could be designed in a future issue.

However in simpler case where the users just need to selected a specific source file at build time depending on the context this solution fits perfectly.

I am closing this issue since most of the described problems have been solved.

@voodoos voodoos closed this as completed Feb 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants