GHC is not currently “fault-tolerant”: if it encounters a problem it simply stops. This means that it does not report as many diagnostics as it could do, and also that tools which rely on compiler output from a later stage cannot work if there is an earlier problem. We propose to make GHC’s compiler pipeline more fault-tolerant in phases.
It is a well-known observation that during interactive development the user’s program is mostly in an invalid state. It is therefore important to make sure that compiler diagnostics are as helpful (and plentiful) as possible on various kinds of broken programs.
At the same time, compilers are often structured as a pipeline of stages. In the case of GHC, the main ones are:
- Parsing
- This produces parse errors
- Renaming
- This produces all kinds of diagnostics relating to name resolution, and some others
- Typechecking
- This produces type errors
- Desugaring to Core
- This produces some diagnostics such as from pattern-match coverage checking
- Interface file creation
- This step is necessary for typechecking downstream modules
- Code generation
- This step is necessary for doing code-generation for downstream modules
At each stage problems may be discovered and diagnostics emitted. The simplest thing to do if a problem is discovered is to stop, and not progress any further down the pipeline.
However, this means that any diagnostics that might be generated by later stages will effectively be suppressed if there is a problem in an earlier stage. This means we give the user significantly fewer diagnostics than we could do.
Consider the following Haskell program:
foo :: Int -> Int
foo =
bar :: Int -> Int
bar = 1 + foo
Reading this as a human, the situation is clear: there is a parse error in the RHS of “foo”, and it’s unreasonable to expect the compiler to say anything much about it (but we do have a stated type for the binding!); and there is a type error in the RHS of “bar”.
However, today GHC will only report the parse error in “foo” and will not report the type error in “bar”.
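The difference between stopping at the first problem and carrying on can be sketched with a toy model of a two-stage pipeline. Everything here is hypothetical and heavily simplified (`Stage`, `Result`, `failFast`, `tolerant` and the `Decl` flags are illustrative names, not GHC's API): each declaration is reduced to two flags saying which stages it would pass, and a stage returns its diagnostics together with whatever output it managed to produce.

```haskell
-- Toy model only: these types are hypothetical and are not GHC's API.
data Diagnostic = Diagnostic { diagStage :: String, diagMessage :: String }
  deriving (Eq, Show)

-- A stage returns its diagnostics plus an optional (possibly partial) output.
data Result a = Result [Diagnostic] (Maybe a)

type Stage a b = a -> Result b

-- Fail-fast composition: any diagnostic from the first stage stops the pipeline.
failFast :: Stage a b -> Stage b c -> Stage a c
failFast f g x = case f x of
  Result [] (Just y) -> g y
  Result ds _        -> Result ds Nothing

-- Fault-tolerant composition: pass on whatever output exists, keep all diagnostics.
tolerant :: Stage a b -> Stage b c -> Stage a c
tolerant f g x = case f x of
  Result ds (Just y) -> let Result ds' out = g y in Result (ds ++ ds') out
  Result ds Nothing  -> Result ds Nothing

-- A declaration, reduced to two flags saying which stages it would pass.
data Decl = Decl { declName :: String, parsesOk :: Bool, typechecksOk :: Bool }

-- Reports declarations that fail to parse and passes on the ones that do.
parseStage :: Stage [Decl] [Decl]
parseStage ds = Result
  [ Diagnostic "parser" ("parse error in " ++ declName d) | d <- ds, not (parsesOk d) ]
  (Just (filter parsesOk ds))

-- Reports ill-typed declarations and passes on the well-typed ones.
typecheckStage :: Stage [Decl] [Decl]
typecheckStage ds = Result
  [ Diagnostic "typechecker" ("type error in " ++ declName d) | d <- ds, not (typechecksOk d) ]
  (Just (filter typechecksOk ds))

-- The foo/bar program: foo fails to parse, bar parses but is ill-typed.
example :: [Decl]
example = [Decl "foo" False True, Decl "bar" True False]

diagnosticsOf :: Result a -> [Diagnostic]
diagnosticsOf (Result ds _) = ds
```

In this model `diagnosticsOf (failFast parseStage typecheckStage example)` contains only the parse error in `foo`, while the `tolerant` composition also yields the type error in `bar`. The model glosses over the fact that the real type error in `bar` depends on the signature of `foo` surviving the parse failure.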
A common workflow is:
- Make a series of changes to your program, perhaps across multiple locations
- Fix problems reported by the compiler
For example, suppose that I start rewriting declaration “a”, but then realise that declarations “b” and “c” are also wrong, and so I start to alter them. In doing so I make some mistakes: the code I wrote for “a” has a parse error; “b” has a name resolution error; “c” has a type error.
Unfortunately, despite the fact that I am currently focussed on “c” in my editor (and my mind), what happens is:
- Run GHC/check error list
- See only the parse error for “a”
- Navigate back to “a”
- Fix the error
- Run GHC/check error list
- See only the name resolution error for “b”
- Navigate back to “b”
- Fix the error
- Run GHC/check error list
- See only the type error for “c”
- Navigate back to “c”
- Fix the error
While I have now fixed all the problems, I have been forced to move my focus around inefficiently. I have to look at code other than the code I am currently focussed on, and I have to re-run GHC repeatedly.
Furthermore, sometimes this can force you to do work that you would otherwise never do at all. It may be that I will shortly rewrite the code for “a” anyway - forcing me to fix the parse errors in code I am about to delete is quite annoying!
This workflow is particularly relevant for usage in an IDE, as that provides a very direct way to see errors “by location”. Users who work on the command-line suffer less, since even if they got all the errors at once they might not be able to easily find the ones for “c”.
Refusing to progress down the compiler pipeline not only means that we do not get diagnostics from later stages, but also that any tools which want to use the output from later stages can’t do so.
Two examples are code formatters and HLS.
A code formatter can in principle work on a parse tree with parse errors in it, so long as it’s clear how to exact-print the broken bits as they were in the input. But it can’t do this if GHC won’t produce a parse tree at all.
Similarly, HLS can compute information about identifiers (and many other things) from the renamed or typechecked source. A partial result that lacks some information is more helpful than no result at all.
A naive implementation of HLS would lead to users losing many IDE features if their module had a parse error in it anywhere, since we would not be able to get a typechecked module from GHC.
In fact, today HLS goes to quite some lengths to store stale versions of computed information, and carefully adjust positions so that they map onto the user’s current buffer. This allows it to e.g. provide type information for identifiers defined in a part of the file that has not changed, even if the user has introduced a parse error elsewhere (and hence GHC won’t typecheck the file). This is essential in any case, since it allows us to respond quickly to queries on modified buffers, but the fact that we cannot get partial information from GHC means we are often working with much older information than is ideal.
For example, suppose a user types a new declaration "a" (including a parse error), and then starts a second new declaration "b" and tries to autocomplete "a". Today they will not see it, because HLS's stale information is still from the module before "a" was written. If GHC could cope with parse errors, then we might be able to get an updated partial module that includes information about "a", and so be able to answer the user's query.
This proposal aims to solve the two problems described above:
- GHC does not produce output from a pipeline stage if there is an error in that stage
- GHC does not emit diagnostics from later pipeline stages if there is an error at an earlier stage
These problems are clearly closely related and generally relate to GHC’s fault-tolerance.
We propose to solve them by making GHC’s compiler stages fault-tolerant, by which we mean:
- If there is a problem during the stage, the compiler will try to recover or produce partial output (“error-recovery”)
- The stage will accept partial input, and try to recover or (again) produce partial output (MPJ: not sure what to call this)
This is discussed in more detail in the “Technical Content” section.
In the Haskell community:
- https://gitlab.haskell.org/ghc/ghc/-/issues/16955
- This issue discusses fault-tolerance in the parser, but also discusses it more widely
- The issue also gives an example of where “gcc” is able to do better and report both parse errors and type errors
- https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4711
- This is a WIP MR that implements part of Phase 1 of this proposal
- ghc-proposals/ghc-proposals#333
- This proposal asks for a flag to allow deferring a particular kind of failure.
- The current proposal would subsume the linked one, especially if we do Phase 4, which might include a generalisation of the -fdefer- flags.
- GHC’s -fdefer- flags
- These allow deferring errors until runtime, which is very similar to the kind of fault tolerance that we want, and we may well be able to reuse some of the groundwork
- However, these allow compilation to succeed in the end, which is not what we want in this proposal (except maybe in Phase 4). We usually do want an eventual failure, and not a broken binary. We just want to get as much out as we can before we fail.
- (MPJ: I believe -fdefer mostly just inserts calls to “error”. I think we may want to do something different that involves explicit “missing” nodes in the tree, since I think the current approach is vulnerable to false positives. See “Misleading diagnostics” below.)
Outside the Haskell community this problem is widely addressed:
- “gcc” as previously mentioned handles this well
- The authors have worked on proprietary compilers that handled this
- Error-recovery for parsers is widely discussed, less so for the rest of the compilation pipeline
- LALRPOP section on error-recovery: https://lalrpop.github.io/lalrpop/tutorial/008_error_recovery.html
We propose to take a phased approach whereby we work through the compiler pipeline, integrating fault-tolerance for each stage of the pipeline. Strict ordering is not necessary, and most of Phases 1-3 could be done in parallel.
All of parsing, renaming, and typechecking produce very significant numbers of diagnostics, and all of the output from these stages is used by HLS. So we strongly suggest that all phases up to and including Phase 3 be completed. Phase 4 and beyond is extra work and seems likely to be harder.
The phases are not fully-specified:
- There is a wide spectrum of how fault-tolerant a stage can be. For example, the typechecker could be somewhat fault-tolerant by only typechecking complete declarations; or it could try to additionally do something for partially incomplete declarations. For this proposal the main aim should be to pick the low-hanging fruit and get the infrastructure in place. At that point it should hopefully be possible for the community to improve specific aspects over time.
- In particular we propose as a "Success" criterion that GHC generally be able to ignore errors that are in separate top-level declarations. This seems like a good first step.
- How, specifically, to handle partial input and produce partial output is stage-dependent and is a topic that will likely be of great interest to the GHC team. We do not propose to specify that here - rather, coming up with a good design in collaboration with the GHC team is part of the work. Each phase has an implicit “Stage 0: write some tickets and do some discussion” step.
A “fully-recoverable” parse error is one where we can know (or make a very good guess) what the user intended. This usually means guessing or removing a few tokens, such as a trailing bracket.
This step does not require producing partial output: we can produce a full parse tree, just with some tokens that are guessed in it. Hence it is an easy first step.
Example:
tup :: (Int, Int)
tup = (1, 2
This should parse by guessing the missing closing paren (and then pass renaming, typechecking etc.), but produce a parse error.
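As a rough illustration of this kind of recovery, here is a minimal sketch that repairs missing closing brackets at the end of the input and reports each guess as a diagnostic. It is not how GHC's parser works: `repairBrackets` and `closerFor` are hypothetical names, the function operates on raw characters rather than tokens, and it ignores string literals, comments, layout and stray closing brackets.

```haskell
-- The closing bracket expected for each opening bracket.
closerFor :: Char -> Maybe Char
closerFor '(' = Just ')'
closerFor '[' = Just ']'
closerFor '{' = Just '}'
closerFor _   = Nothing

-- Append any closing brackets still expected at the end of the input.
-- Returns the repaired text plus one diagnostic per guessed bracket.
repairBrackets :: String -> (String, [String])
repairBrackets src = (src ++ missing, map report missing)
  where
    -- Stack of closers still expected, innermost first.
    missing = foldl step [] src
    step stack c
      | Just close <- closerFor c       = close : stack
      | (top : rest) <- stack, c == top = rest
      | otherwise                       = stack
    report c = "parse error: missing '" ++ [c] ++ "' (guessed at end of input)"
```

For the example above, `repairBrackets "tup = (1, 2"` returns the repaired text `"tup = (1, 2)"` together with a single diagnostic, so the parse error is still reported while later stages receive a complete program.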
For parse errors which we cannot recover from we need to produce a partial parse tree with “error” nodes or similar.
Example:
tup :: (Int, Int)
tup = (1, ##)
This should do something like: produce a partial parse tree with an “error” node inside the second component of the tuple.
This should allow downstream stages to accept a partial program produced from the Phase 1 parser and still process it. The result will necessarily also be partial.
After this, it should be possible to see a parse error and also an error from a later stage.
This will allow formatting and refactoring of programs with parse errors.
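One possible shape for such a partial parse tree is sketched below. The `Expr` type and its `ParseError` constructor are assumptions made for illustration and do not correspond to GHC's actual AST; the toy parser only handles a flat tuple of integer literals. The error node keeps the original source text, so that a tool such as a formatter can print the broken part back exactly as it was written.

```haskell
import Data.Char (isDigit, isSpace)
import Data.List (intercalate)

-- Hypothetical expression tree with an explicit error node.
data Expr
  = IntLit Integer
  | Tuple [Expr]
  | ParseError String  -- the source text that could not be parsed
  deriving (Eq, Show)

trim :: String -> String
trim = dropWhile isSpace . reverse . dropWhile isSpace . reverse

-- Split on commas; nesting is not handled in this sketch.
splitCommas :: String -> [String]
splitCommas s = case break (== ',') s of
  (chunk, [])       -> [chunk]
  (chunk, _ : rest) -> chunk : splitCommas rest

-- Parse one component, falling back to an error node plus a diagnostic.
parseComponent :: String -> (Expr, [String])
parseComponent raw
  | not (null t) && all isDigit t = (IntLit (read t), [])
  | otherwise = (ParseError t, ["parse error: cannot parse `" ++ t ++ "`"])
  where t = trim raw

-- Parse a flat tuple, keeping the good components and marking the bad ones.
parseTuple :: String -> (Expr, [String])
parseTuple src = case trim src of
  '(' : rest | not (null rest), last rest == ')' ->
    let parts = map parseComponent (splitCommas (init rest))
    in (Tuple (map fst parts), concatMap snd parts)
  other -> parseComponent other

-- Print the tree back out; error nodes are emitted verbatim.
render :: Expr -> String
render (IntLit n)       = show n
render (Tuple es)       = "(" ++ intercalate ", " (map render es) ++ ")"
render (ParseError raw) = raw
```

Here `parseTuple "(1, ##)"` yields `Tuple [IntLit 1, ParseError "##"]` along with one diagnostic, and `render` turns that tree back into `"(1, ##)"`.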
The renamer should handle errors by producing partial output, omitting the information that it would otherwise resolve.
Example:
tup :: (Int, Int)
tup = nonExistentTup
This should do something like: produce a partial renamed tree with an “unresolved name” node for the RHS of “tup”.
This should allow downstream stages to accept a partial program produced from the Phase 2 renamer and still process it. The result will necessarily also be partial.
After this, it should be possible to see a renamer error and also an error from a later stage.
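A minimal sketch of this idea follows, assuming a toy representation in which a declaration is just a name plus the list of names its body mentions, and the scope is an association list from occurrence names to defining modules. `Name`, `Unresolved`, `Decl` and `renameModule` are hypothetical and are not GHC's renamer types.

```haskell
-- Hypothetical renamed-name type with an explicit "unresolved" case.
data Name
  = Resolved String String  -- defining module, occurrence name
  | Unresolved String       -- occurrence name that could not be resolved
  deriving (Eq, Show)

-- A declaration, reduced to its name and the names its body mentions.
data Decl name = Decl { declName :: String, declUses :: [name] }
  deriving (Eq, Show)

-- Names in scope, mapped to the module that defines them.
type Scope = [(String, String)]

renameName :: Scope -> String -> (Name, [String])
renameName scope occ = case lookup occ scope of
  Just modName -> (Resolved modName occ, [])
  Nothing      -> (Unresolved occ, ["Variable not in scope: " ++ occ])

-- Rename one declaration, keeping it even if some names are unresolved.
renameDecl :: Scope -> Decl String -> (Decl Name, [String])
renameDecl scope (Decl n uses) = (Decl n (map fst rs), concatMap snd rs)
  where rs = map (renameName scope) uses

-- Rename every declaration; an error in one does not stop the others.
renameModule :: Scope -> [Decl String] -> ([Decl Name], [String])
renameModule scope decls = (map fst rs, concatMap snd rs)
  where rs = map (renameDecl scope) decls
```

With this shape, a module containing `tup = nonExistentTup` next to a well-formed declaration still produces a renamed tree for both: the first carries an `Unresolved` node and a diagnostic, and the second is fully resolved and available to later stages.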
The typechecker should handle errors by producing partial output, omitting the information that it would otherwise resolve.
Example:
bar = 1 + "hello"
This should do something like: produce a binding whose type is “unknown” or “erroneous”.
This should allow downstream stages to accept a partial program produced from the Phase 3 typechecker and still process it. The result will necessarily also be partial.
After this, it should be possible to see a type error and also an error from a later stage (e.g. a pattern match warning from desugaring).
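The following sketch shows one way an "unknown" type could behave, using a tiny monomorphic expression language rather than anything resembling GHC's typechecker. `TUnknown`, `infer` and `checkBindings` are hypothetical names. A binding whose body is ill-typed is recorded with type `TUnknown`, and later uses of that binding do not trigger further diagnostics.

```haskell
data Expr = IntLit Integer | StrLit String | Var String | Add Expr Expr

-- TUnknown marks a type that could not be determined because of an earlier error.
data Type = TInt | TString | TUnknown
  deriving (Eq, Show)

type Env = [(String, Type)]

-- Infer a type, returning TUnknown instead of giving up on errors.
infer :: Env -> Expr -> (Type, [String])
infer _ (IntLit _) = (TInt, [])
infer _ (StrLit _) = (TString, [])
infer env (Var x) = case lookup x env of
  Just t  -> (t, [])
  Nothing -> (TUnknown, [])  -- assumed to have been reported by the renamer
infer env (Add a b) =
  let (ta, da) = infer env a
      (tb, db) = infer env b
      -- Operands of unknown type are skipped, so errors do not cascade.
      mismatches = [ "expected Int but got " ++ show t
                   | t <- [ta, tb], t /= TInt, t /= TUnknown ]
      resultType = if null mismatches then TInt else TUnknown
  in (resultType, da ++ db ++ mismatches)

-- Check bindings in order; a failed binding is kept with type TUnknown.
checkBindings :: [(String, Expr)] -> (Env, [String])
checkBindings = foldl step ([], [])
  where
    step (env, diags) (name, body) =
      let (t, ds) = infer env body
      in (env ++ [(name, t)], diags ++ map ((name ++ ": ") ++) ds)
```

Checking `bar = 1 + "hello"` followed by a hypothetical `baz = bar + 2` gives `bar` the type `TUnknown` with one diagnostic, while `baz` is still checked and receives type `TInt` without any extra error being blamed on it.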
In principle we can even produce an interface file for a partially broken module. If we know all the bindings and their types, then we know what we need in order to at least typecheck downstream modules. This means that we can get diagnostics from downstream modules also.
We will need to produce a partial interface file, since we may have to omit some of the things that are normally put in interface files (e.g. unfoldings, anything we get from analysis on Core).
After this, it should be possible to see an error from a module and also an error from a downstream module.
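To make the idea of a partial interface concrete, here is a small sketch under strong simplifying assumptions: types and unfoldings are plain strings, and `Binding`, `IfaceEntry`, `Iface` and `mkIface` are hypothetical names unrelated to GHC's real interface file format. Bindings with a known type get an entry, unfoldings are included only where they exist, and the interface records whether anything had to be left out.

```haskell
import Data.Maybe (isJust)

-- Hypothetical summary of one top-level binding after typechecking.
data Binding = Binding
  { bindName      :: String
  , bindType      :: Maybe String  -- Nothing if the type could not be determined
  , bindUnfolding :: Maybe String  -- Nothing if no Core was produced
  }

data IfaceEntry = IfaceEntry
  { entryName      :: String
  , entryType      :: String
  , entryUnfolding :: Maybe String
  } deriving (Eq, Show)

data Iface = Iface
  { ifaceComplete :: Bool          -- False if any binding had to be omitted
  , ifaceEntries  :: [IfaceEntry]
  } deriving (Eq, Show)

-- Build an interface from whatever is known, omitting untyped bindings.
mkIface :: [Binding] -> Iface
mkIface bs = Iface
  { ifaceComplete = all (isJust . bindType) bs
  , ifaceEntries  =
      [ IfaceEntry (bindName b) t (bindUnfolding b)
      | b <- bs, Just t <- [bindType b] ]
  }
```

A downstream module that only uses the bindings present in `ifaceEntries` could then be typechecked, while the `ifaceComplete` flag leaves room for the build to fail eventually rather than produce a broken binary.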
MPJ: very speculative, basically trying to generalize the “-fdefer-” approach, maybe do it by default? I’m not sure how useful this is, the main slightly crazy use-case I can think of is running TH using the partial code, which might actually just be fine?
MPJ: very speculative, but it seems that you could use partial interface files instead of hs-boot files. You would still need to know which module to compile first in order to break the cycle, or else you would need to run the process to fixpoint.
This proposal will make GHC more complicated. In particular, many compiler stages will now have to consider questions like:
- How do I handle partial input? What kinds of partial input even make sense for me?
- How do I produce partial output if there is a problem? When do I just give up?
This means more work for GHC devs and increases the size of the state space that they need to consider.
The main mitigation that we propose is that we should try to enumerate some principles for partial output. For example, most passes should be able to entirely ignore missing parts of the tree: there should already be a diagnostic explaining what went wrong there, and we may just produce confusing errors if we try to process it.
If we try to process partial programs we may find ourselves in a situation where we have a choice between
- Risking false positives by guessing what the user might have meant, or by inserting code into the program that the user did not write
- Risking false negatives by ignoring a partial program about which we might in fact be able to say something correct
Generally false positives are very bad for user trust in a system, and so avoiding false positives should override the risk of false negatives. That means that:
- We should generally not guess what the user meant: if we don’t know, say nothing
- If we have to insert things into the program in order to continue (e.g. because there is a parse error), then we should insert something that we can clearly identify as made-up (e.g. an explicit “Error” node rather than a synthetic call to “error”)
For example, suppose we parsed this example by inserting a call to “error” in the second component:
(1, ##)
When we get to the typechecker, the typechecker will not know that this is a synthetic node, and so will process it like a user-written call to “error”. That means creating a new unification variable for the type parameter to “error”, and potentially giving the user confusing errors that mention that variable. In contrast, if we explicitly know that the node is missing, we could either a) behave similarly, but mark the new type variable specially to avoid giving diagnostics about it, or b) simply ignore this entire expression as it is broken.
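The contrast can be sketched with a toy inference pass. All names here are hypothetical (`CallError` stands for a synthetic call to "error", `Missing` for an explicit error node), and the "ambiguous type variable" message only imitates the kind of diagnostic a real typechecker might emit. The sketch follows option (b): the missing node gets a dedicated type that diagnostics never mention.

```haskell
data Expr
  = IntLit Integer
  | Tuple [Expr]
  | CallError  -- a synthetic call to "error", indistinguishable from user code
  | Missing    -- an explicit node recording that the source was broken here

data Type
  = TInt
  | TTuple [Type]
  | TVar Int   -- a fresh unification variable
  | TMissing   -- the type of a Missing node; never reported on
  deriving (Eq, Show)

-- Infer a type, threading a counter for fresh unification variables.
infer :: Int -> Expr -> (Type, Int)
infer n (IntLit _) = (TInt, n)
infer n CallError  = (TVar n, n + 1)  -- treated like any call to "error"
infer n Missing    = (TMissing, n)    -- no new variable is created
infer n (Tuple es) = (TTuple ts, n')
  where
    (ts, n') = foldl step ([], n) es
    step (acc, k) e = let (t, k') = infer k e in (acc ++ [t], k')

-- Report unification variables that were never solved.
unsolved :: Type -> [String]
unsolved (TVar k)    = ["ambiguous type variable t" ++ show k]
unsolved (TTuple ts) = concatMap unsolved ts
unsolved _           = []
```

For the tuple `(1, ##)`, representing the broken component as `CallError` leads `unsolved` to complain about a type variable the user never wrote, whereas representing it as `Missing` produces no additional diagnostic beyond the original parse error.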
(MPJ: Unclear, probably mostly the GHC team would review the design and work, I’d be happy to supervise from a keep-stuff-moving perspective?)
There are no particular time constraints with this project.
Each of the phases provides concrete benefits on its own, so there are clear milestones to complete with visible output.
(MPJ: I have no idea. This is actually quite a lot of work! Probably around 6 person-months for someone familiar with GHC?)
- The HLS team
- The GHC team
- GHC can report diagnostics from all of the parsing, renaming, and typechecking phases in one compilation
- The level of fault-tolerance is at least high enough that an error in one top-level binding does not prevent processing of another top-level binding.
- The GHC compiler pipeline produces partial results at all stages, and a proof-of-concept exists for use by one of:
- HLS
- A code formatter