Skip to content

Conversation

@Durbatuluk1701
Copy link

This is meant to address #12611.

In particular, the following things are added:

  • adding testcase for merlin config with multiple executables.
  • change merlin to load files searching for exact match first, then any possible match, then recurse

The added testcase test/blackbox-tests/test-cases/merlin/merlin-multi-exes.t/run.t should fail on main and after this fix be working.

I do not really think this is the optimal solution, but I think the root of the issue has been identified (in particular the use of List.find_map is too aggressive when searching for a possible merlin-conf) and a possible fix is here that someone with more experience could improve upon

@Alizter Alizter requested a review from voodoos October 24, 2025 17:50
@Durbatuluk1701
Copy link
Author

Oops looks like I definitely over-tuned that testcase to my system, I'll fix that

change merlin to load files searching for exact match first

Signed-off-by: Will Thomas <[email protected]>
Copy link
Collaborator

@voodoos voodoos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, your heuristic seems like a good improvement on the current version!

Comment on lines 128 to 132
(* FIXME we are racing against the build system writing these
files here *)
match Merlin.Processed.load_file file_path with
| Error msg -> Some (Merlin_conf.make_error msg)
| Ok config -> Merlin.Processed.get config ~file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's find a way not to repeat that code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I think I got it unless you want it even further abstracted

@Durbatuluk1701
Copy link
Author

Some other things I'd thought about while working on this:

  1. Basename with no extension: The functionality of Path.Build.set_extension file ~ext:"" |> Path.Build.basename seems like it might be more generally useful. Should this be hoisted to Path itself for something like Path.Build.basename : ?strip_ext:bool -> Path.Build.t -> string? It does seem like optional arguments are not suggested by the style guide, so maybe this is overkill and keeping the changes in this PR minimal would probably be better.
  2. Valid Merlin Idents: Should maybe instead of:
  let good_names =
    List.map
      ~f:(fun pref -> Printf.sprintf "%s-%s" pref basename)
      [ "lib"; "exe"; "melange" ]

Merlin_ident could manage this and expose something like Merlin_ident.possible_idents : string -> list string? It might be a bit easier to maintain in case the patterns in Merlin_ident.to_string change?

@voodoos
Copy link
Collaborator

voodoos commented Oct 29, 2025

I thought more about this, and I think there is a much better thing to do. The current proposal does improve the situation a bit, but only when the module name is the same as the library or executable. We can do better: each of the lib-*, exe-* merlin file actually contains a map with module names as keys. What we want to do is open these and find-out which one contains the configuration for the requested module.

Ideally we could know to which library or executable a module is associated in advance, but I am not sure that's possible without loading the project which is costly. Is that right @Alizter ? (In the future we probably want to connect to any running Dune via RPC.)

Reading Dune_rules.Merlin.Processed.get, the function that is called on each merlin-config archive to find the values for the requested module, it is not immediately obvious anymore why the current implementation fails. It does have a few fallbacks, and ultimately returns None if it didn't find a candidate, it's not immediately clear why it returns a false positive with the first archive in the alphabetically sorted list.

I think we should fix it so that we do return the config for the actually requested module, always :-)

@Durbatuluk1701 Does this makes sense to you ? Are you interested in having a closer look at it ?

@Durbatuluk1701
Copy link
Author

Reading Dune_rules.Merlin.Processed.get, the function that is called on each merlin-config archive to find the values for the requested module, it is not immediately obvious anymore why the current implementation fails. It does have a few fallbacks, and ultimately returns None if it didn't find a candidate, it's not immediately clear why it returns a false positive with the first archive in the alphabetically sorted list.

This is a great insight, so it seems that Dune_rules.Merlin.Processed.get is working as we would expect and only returns if a match for ~file is found in the per_file_config. The issue seems to be that the obj_dirs (from Merlin.Processed.get.config) are a critical component and ultimately what gets included in the return sexp. You can have a per_file_config match for c.ml but utilize the b.ml config.

This maybe leads me to the question: should .../bin/c.ml be showing up in .merlin-conf/exe-b's per_file_config?

  • If yes, then we probably need to find a way to check that both a per_file_config includes the intended file and config is relevant to the intended file. Or should the necessary info be moved?
  • If no, this may simplify things as if exe-b no longer includes c in its per_file_config then we should likely just be able to move past it when we do not find a match?

I think we should fix it so that we do return the config for the actually requested module, always :-)

Definitely seems optimal

@Durbatuluk1701 Does this makes sense to you ? Are you interested in having a closer look at it ?

Happy to keep looking, although guidance on questions such as the one above will be greatly appreciated

Some debugging prints I did to discover this

I've annotated this with (* <My Commentary> *) to draw attention to important parts

! (* We start just debug printing both the found Merlin Confs *)
Found merlin file: _build/default/multi-exes/bin/.merlin-conf/exe-b
Successfully loaded merlin file: _build/default/multi-exes/bin/.merlin-conf/exe-b
with content: { config =
    { stdlib_dir =
        Some (External "/OCAMLC_WHERE")
    ; source_root = In_source_tree "."
    ; obj_dirs = set { In_build_dir "default/multi-exes/bin/.b.eobjs/byte" }
    ; src_dirs = set { In_source_tree "multi-exes/bin" }
    ; hidden_obj_dirs = set {}
    ; hidden_src_dirs = set {}
    ; flags =
        [ "-w"
        ; "@[email protected]@30..39@[email protected]@[email protected]"
        ; "-strict-sequence"
        ; "-strict-formats"
        ; "-short-paths"
        ; "-keep-locs"
        ; "-g"
        ]
    ; extensions = []
    ; indexes =
        [ In_build_dir "default/multi-exes/bin/.c.eobjs/cctx.ocaml-index"
        ; In_build_dir "default/multi-exes/bin/.b.eobjs/cctx.ocaml-index"
        ; In_build_dir "default/multi-exes/lib/.lib.objs/cctx.ocaml-index"
        ]
    ; parameters = []
    }
; per_file_config =
    map
      { "default/multi-exes/bin/b" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "B" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "b"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      ; "default/multi-exes/bin/b.ml" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "B" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "b"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
! (* Note: This is .merlin-conf/exe-b, and "c" is being managed within its `per_file_config` *)
      ; "default/multi-exes/bin/c" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "C" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "c"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      ; "default/multi-exes/bin/c.ml" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "C" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "c"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      }
; pp_config = { map = map {}; values = [| None |] }
}
Found merlin file: _build/default/multi-exes/bin/.merlin-conf/exe-c
Successfully loaded merlin file: _build/default/multi-exes/bin/.merlin-conf/exe-c
with content: { config =
    { stdlib_dir =
        Some (External "/OCAMLC_WHERE")
    ; source_root = In_source_tree "."
    ; obj_dirs =
! (* These are the objs we actually need for "c". They only appear in `.merlin-conf/exe-c` *)
        set
          { In_build_dir "default/multi-exes/bin/.c.eobjs/byte"
          ; In_build_dir "default/multi-exes/lib/.lib.objs/byte"
          }
    ; src_dirs =
        set
          { In_source_tree "multi-exes/bin"
          ; In_source_tree "multi-exes/lib"
          }
    ; hidden_obj_dirs = set {}
    ; hidden_src_dirs = set {}
    ; flags =
        [ "-w"
        ; "@[email protected]@30..39@[email protected]@[email protected]"
        ; "-strict-sequence"
        ; "-strict-formats"
        ; "-short-paths"
        ; "-keep-locs"
        ; "-g"
        ]
    ; extensions = []
    ; indexes =
        [ In_build_dir "default/multi-exes/bin/.c.eobjs/cctx.ocaml-index"
        ; In_build_dir "default/multi-exes/bin/.b.eobjs/cctx.ocaml-index"
        ; In_build_dir "default/multi-exes/lib/.lib.objs/cctx.ocaml-index"
        ]
    ; parameters = []
    }
; per_file_config =
    map
      { "default/multi-exes/bin/b" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "B" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "b"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      ; "default/multi-exes/bin/b.ml" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "B" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/b.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "b"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      ; "default/multi-exes/bin/c" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "C" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "c"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      ; "default/multi-exes/bin/c.ml" :
          { opens = []
          ; module_ =
              { source =
                  { path = [ "C" ]
                  ; files =
                      { impl =
                          Some
                            { path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; original_path =
                                In_build_dir "default/multi-exes/bin/c.ml"
                            ; dialect = "ocaml"
                            }
                      ; intf = None
                      }
                  }
              ; obj_name = "c"
              ; pp = None
              ; visibility = "public"
              ; kind = "impl"
              ; install_as = None
              }
          ; reader = None
          }
      }
; pp_config = { map = map {}; values = [| None |] }
}
! (* We are now in Merlin.Processed.get *)
Trying to load merlin file: _build/default/multi-exes/bin/.merlin-conf/exe-b
File Config:
map
  { "default/multi-exes/bin/b" :
      { opens = []
      ; module_ =
          { source =
              { path = [ "B" ]
              ; files =
                  { impl =
                      Some
                        { path = In_build_dir "default/multi-exes/bin/b.ml"
                        ; original_path =
                            In_build_dir "default/multi-exes/bin/b.ml"
                        ; dialect = "ocaml"
                        }
                  ; intf = None
                  }
              }
          ; obj_name = "b"
          ; pp = None
          ; visibility = "public"
          ; kind = "impl"
          ; install_as = None
          }
      ; reader = None
      }
  ; "default/multi-exes/bin/b.ml" :
      { opens = []
      ; module_ =
          { source =
              { path = [ "B" ]
              ; files =
                  { impl =
                      Some
                        { path = In_build_dir "default/multi-exes/bin/b.ml"
                        ; original_path =
                            In_build_dir "default/multi-exes/bin/b.ml"
                        ; dialect = "ocaml"
                        }
                  ; intf = None
                  }
              }
          ; obj_name = "b"
          ; pp = None
          ; visibility = "public"
          ; kind = "impl"
          ; install_as = None
          }
      ; reader = None
      }
  ; "default/multi-exes/bin/c" :
      { opens = []
      ; module_ =
          { source =
              { path = [ "C" ]
              ; files =
                  { impl =
                      Some
                        { path = In_build_dir "default/multi-exes/bin/c.ml"
                        ; original_path =
                            In_build_dir "default/multi-exes/bin/c.ml"
                        ; dialect = "ocaml"
                        }
                  ; intf = None
                  }
              }
          ; obj_name = "c"
          ; pp = None
          ; visibility = "public"
          ; kind = "impl"
          ; install_as = None
          }
      ; reader = None
      }
  ; "default/multi-exes/bin/c.ml" :
      { opens = []
      ; module_ =
          { source =
              { path = [ "C" ]
              ; files =
                  { impl =
                      Some
                        { path = In_build_dir "default/multi-exes/bin/c.ml"
                        ; original_path =
                            In_build_dir "default/multi-exes/bin/c.ml"
                        ; dialect = "ocaml"
                        }
                  ; intf = None
                  }
              }
          ; obj_name = "c"
          ; pp = None
          ; visibility = "public"
          ; kind = "impl"
          ; install_as = None
          }
      ; reader = None
      }
  }

  Raw Config:
{ stdlib_dir = Some (External "/OCAMLC_WHERE")
; source_root = In_source_tree "."
! (* The wrong obj_dirs. We ultimately accept them because we find "c" in `per_file_config` *)
; obj_dirs = set { In_build_dir "default/multi-exes/bin/.b.eobjs/byte" }
; src_dirs = set { In_source_tree "multi-exes/bin" }
; hidden_obj_dirs = set {}
; hidden_src_dirs = set {}
; flags =
    [ "-w"
    ; "@[email protected]@30..39@[email protected]@[email protected]"
    ; "-strict-sequence"
    ; "-strict-formats"
    ; "-short-paths"
    ; "-keep-locs"
    ; "-g"
    ]
; extensions = []
; indexes =
    [ In_build_dir "default/multi-exes/bin/.c.eobjs/cctx.ocaml-index"
    ; In_build_dir "default/multi-exes/bin/.b.eobjs/cctx.ocaml-index"
    ; In_build_dir "default/multi-exes/lib/.lib.objs/cctx.ocaml-index"
    ]
; parameters = []
}

  Found exact match for file: _build/default/multi-exes/bin/c.ml
Returning: { opens = []
; module_ =
    { source =
        { path = [ "C" ]
        ; files =
            { impl =
                Some
                  { path = In_build_dir "default/multi-exes/bin/c.ml"
                  ; original_path =
                      In_build_dir "default/multi-exes/bin/c.ml"
                  ; dialect = "ocaml"
                  }
            ; intf = None
            }
        }
    ; obj_name = "c"
    ; pp = None
    ; visibility = "public"
    ; kind = "impl"
    ; install_as = None
    }
; reader = None
}
pp flag: None
unit_name: c
Found merlin config in: _build/default/multi-exes/bin
with resulting csexp ((?:INDEX?:$TESTCASE_ROOT/_build/default/multi-exes/bin/.c.eobjs/cctx.ocaml-index)(?:INDEX?:$TESTCASE_ROOT/_build/default/multi-exes/bin/.b.eobjs/cctx.ocaml-index)(?:INDEX?:$TESTCASE_ROOT/_build/default/multi-exes/lib/.lib.objs/cctx.ocaml-index)(?:STDLIB?:/OCAMLC_WHERE)(?:SOURCE_ROOT?:$TESTCASE_ROOT)(?:EXCLUDE_QUERY_DIR)(?:B?:$TESTCASE_ROOT/_build/default/multi-exes/bin/.b.eobjs/byte)(?:S?:$TESTCASE_ROOT/multi-exes/bin)(?:FLG(?:-w?:@[email protected]@30..39@[email protected]@[email protected]?:-strict-sequence?:-strict-formats?:-short-paths?:-keep-locs?:-g))(?:UNIT_NAME?:c))

@Alizter
Copy link
Collaborator

Alizter commented Oct 29, 2025

@voodoos We have Dune_rules.Ml_sources.Origin.t and related functions for tying a source to a stanza. This will require loading the build system unfortunately.

@rgrinberg rgrinberg self-requested a review October 29, 2025 23:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants