Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Dune_engine.Dep_path by a more general mechanism in Memo #4605

Merged
1 commit merged into from May 13, 2021
Merged

Replace Dune_engine.Dep_path by a more general mechanism in Memo #4605

1 commit merged into from May 13, 2021

Conversation

ghost
Copy link

@ghost ghost commented May 13, 2021

The goal of Dep_path was to accumulate the path from the main goal to an error. It was working by wrapping exceptions and adding a Dep_path.Entry.t list to them. At certain strategic points, we were catching and re-raising exceptions with an augmented dependency path.

This feature is now redundant with the memo stack. Given that, this PR moves the wrapping to Memo and removes the forward references Memo.unwrap_exn. It also allows to attach "human readable description" to memo nodes. To print the dependency path, we now simply need to extract human readable description of stack frames from the memo stack. These descriptions are also used when displaying dependency cycles. A few error messages changed as a result of these changes. The new messages seem fine to me but they could be tweaked if needed.

Since exceptions raised by the Memo module now systematically have a stack attached to them, I removed the stack from Memo.Cycle_error.t.

Recovering locations

Another feature of Dep_path was to "augment" error messages with a location recoved from the dependency path for messages that didn't have a location already.

To preserve this feature, it is also possible to attach a location to stack frames. In the end, both the human readable description and the location are captured in a new Memo.Stack_frame.Info.t record:

I replaced this feature by a simpler one where we insert the location at the point where we execute a rule. This seems to produce better error message, as the previously recovered location where pointing to the use site rather than the definition site.

Less exceptions related functions in Memo.Build

After these changes, the only place in the code where we observe exceptions in the memo monad is in merlin.ml where we ignore failures related to computing the compilation flags. So I removed the various with_error_handler, collect_error, ... functions from Memo.Build and replaced them by a single swallow_errors one:

  val swallow_errors : (unit -> 'a t) -> ('a, unit) Result.t t

This should give us more freedom to deal with exceptions inside Memo. And if we drop this last use case in merlin.ml, we could remove this last function, which would give us even more freedom inside Memo.

@ghost ghost requested review from snowleopard and aalekseyev May 13, 2021 00:18
@@ -11,7 +11,7 @@ module Error : sig
(** Errors when building a target *)
type t

val info : t -> User_message.t * Dep_path.Entry.t list option * Path.t option
val info : t -> User_message.t * Path.t option
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Continuing the conversation from slack: (cc @cwong-ocaml)

if instead of a structured value, you get a Pp.t list (= List.map ~f:Dep_path.Entry.pp ), would that be sufficient for the RPC?

it wouldn't be sufficient lsp. In lsp, every single error needs to belong to a source file (the path that we currently provide is the directory where the rule is executed - but it is irrelevant). I intended to extract this source file from the dep_path stack, but now I need to use this new mechanism and I'm not quite sure it's enough. The information I need is basically already pushed in build_file_impl, but I don't know how to extract it. Is there a way to access the memo stack frame for the error?

Copy link
Author

@ghost ghost May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do it the same way we recover locations for "no rule found for ..." errors. For these, we walk up the memo stack and look for stack frames that correspond to a call to "execute_rule ":

        List.find_map stack ~f:(fun frame ->
            match
              Memo.Stack_frame.as_instance_of frame ~of_:execute_rule_memo
            with
            | Some r -> Some (Rule.loc r)
            | None -> None)

For your purpose, you could look for build_file_memo, the argument would be the file that was requested to build.

However, I'm not sure that's going to be much help because you are not going to find a source file this way. What you'll get is the cmx or cmo. That is true with the new code, but was also true with the old Dep_path module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, I'm not sure that's going to be much help because you are not going to find a source file this way. What you'll get is the cmx or cmo. That is true with the new code, but was also true with the old Dep_path module.

I think this can be addressed using our standard trick for testing if a path is in the source. We trim the build context, and check if the result exists in Source_tree.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I remember having a similar discussion with @cwong-ocaml. One idea would be look up the stack for the first rule that has dependencies in the source tree. Such source files are likely to be the one you care about.

get_rule_or_source t path >>= function
| Source digest -> Memo.Build.return digest
| Rule (path, rule) ->
let+ { deps = _; targets } = execute_rule rule in
Path.Build.Map.find_exn targets path)
~info:(fun () ->
{ desc = Some (Pp.text (Dpath.describe_path path)); loc = None })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reference to previous comment: I'd like this information to be available in Build_system.Error.t. Preferably, as a raw path that I can serialize rather than a pretty printed string.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, but first let's make sure that's actually what we want. If you want a source file, passing this along is not going to help.

val human_readable_description : t -> User_message.Style.t Pp.t option
end

module Error : sig
Copy link
Collaborator

@snowleopard snowleopard May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name Error seems too generic. My understanding is that this Error corresponds to errors raised by user-supplied functions that have been augmented with Memo call stack information. If that's the case, perhaps we should call this module User_error, and also add the following as a comment:

Suggested change
module Error : sig
(** Errors raised by user-supplied memoized functions that have been augmented with
Memo call stack information. *)
module Error : sig

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment seems good. I'm not too sure about the name User_error because it is already used in Dune to mean "error that are reported in a way that the user can understand and act on".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "user" considered too generic :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking a bit more, User_error is also too generic, since Cycle_error is a user error too, so I retract my suggestion. I'm still not quite happy about Error (because it doesn't include Cycle_error despite its generic name), but I can live with it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about representing Cycle_error as a constructor of Error?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes, that sounds good actually!

Copy link
Collaborator

@snowleopard snowleopard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's great to get rid of some special purpose error-handling mechanisms and rely on Memo instead. I left a few comments but overall this PR seems fine. I mostly focused on changes around Memo and on new error messages. Some of the new error messages seem worth improving.

@aalekseyev
Copy link
Collaborator

In the PR description you imply that memo stack is sufficient to compute the same information as the dep path, but I think that's not true.
If the same error is reachable via different paths through the memo graph, then the old dep path mechanism produces the correct path, while the memo stack produces the wrong path.
One can argue (and I tend to agree) that it's fine to only show one of the paths the error is reachable by.

There is a problem, though, if watching mode is used and errors are cached.

Consider the following scenario:

  • A and B both refer to X, and X has an error
  • dune reports a stack ["A"; "X"]
  • the user goes "oh, I never meant to use X in A", and removes that reference
  • dune still complains with ["A"; "X"] because the X node is still broken and is still reachable (via B)

Now, I did not carefully read the changes so perhaps my mental model of the new world is wrong, but going from the description alone I think this is a problem. What do others think?

@ghost
Copy link
Author

ghost commented May 13, 2021

Alright, I implemented what we discussed offline: augmenting the stack by one in Value.get_exn where we do the Fiber.reraise_all. It seems to have cleaned up some of the error messages.

Copy link
Collaborator

@snowleopard snowleopard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most recent changes look good too.

@ghost ghost merged commit 6e25cdd into ocaml:main May 13, 2021
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants