Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Default and expanded errors for rustc #1644

Merged
merged 14 commits into from
Jul 14, 2016
Merged
290 changes: 290 additions & 0 deletions text/0000-default-and-expanded-rustc-errors.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,290 @@
- Feature Name: default_and_expanded_errors_for_rustc
- Start Date: 2016-06-07
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
This RFC proposes an update to error reporting in rustc. Its focus is to change the format of Rust error messages and improve --explain capabilities to focus on the user's code. The end goal is for errors and explain text to be more readable, more friendly to new users, while still helping Rust coders fix bugs as quickly as possible. We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could you wrap at 80 chars please (I'm sure this was specified somewhere, but maybe it is just a de facto thing that people do?)


# Motivation

## Default error format

Rust offers a unique value proposition in the landscape of languages in part by codifying concepts like ownership and borrowing. Because these concepts are unique to Rust, it's critical that the learning curve be as smooth as possible. And one of the most important tools for lowering the learning curve is providing excellent errors that serve to make the concepts less intimidating, and to help 'tell the story' about what those concepts mean in the context of the programmer's code.

[as text]
```
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22: 29:30 error: cannot borrow `foo.bar1` as mutable more than once at a time [E0499]
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1;
^~~~~~~~
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22: 29:30 help: run `rustc --explain E0499` to see a detailed explanation
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28:21: 28:29 note: previous borrow of `foo.bar1` occurs here; the mutable borrow prevents subsequent moves, borrows, or modification of `foo.bar1` until the borrow ends
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28 let bar1 = &mut foo.bar1;
^~~~~~~~
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31:2: 31:2 note: previous borrow ends here
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:26 fn borrow_same_field_twice_mut_mut() {
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:27 let mut foo = make_foo();
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:28 let bar1 = &mut foo.bar1;
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29 let _bar2 = &mut foo.bar1;
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:30 *bar1;
src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:31 }
^
```

[as image]
![Image of new error flow](http://www.jonathanturner.org/images/old_errors_3.png)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: alt text is incorrect (this is the old errors, not the new ones, right?). Also I don't think you need the [as text]/[as image] tags - GH renders them as text.


*Example of a borrow check error in the current compiler*

Though a lot of time has been spent on the current error messages, they have a couple flaws which make them difficult to use. Specifically, the current error format:

* Repeats the file position on the left-hand side. This offers no additional information, but instead makes the error harder to read.
* Prints messages about lines often out of order. This makes it difficult for the developer to glance at the error and recognize why the error is occuring
* Lacks a clear visual break between errors. As more errors occur it becomes more difficult to tell them apart.
* Uses technical terminology that is difficult for new users who may be unfamiliar with compiler terminology or terminology specific to Rust.

This RFC details a redesign of errors to focus more on the source the programmer wrote. This format addresses the above concerns by eliminating clutter, following a more natural order for help messages, and pointing the user to both "what" the error is and "why" the error is occurring by using color-coded labels. Below you can see the same error again, this time using the proposed format:

[as text]
```
error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time
--> src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22
|
28 | let bar1 = &mut foo.bar1;
| -------- first mutable borrow occurs here
29 | let _bar2 = &mut foo.bar1;
| ^^^^^^^^ second mutable borrow occurs here
30 | *bar1;
31 | }
| - first borrow ends here
```

[as image]

<a href="http://www.jonathanturner.org/images/new_errors_3.png"><img src="http://www.jonathanturner.org/images/new_errors_3.png" height="200" width="750" ></a>

*Example of the same borrow check error in the proposed format*

## Expanded error format (revised --explain)

Languages like Elm have shown how effective an educational tool error messages can be if the explanations like our --explain text are mixed with the user's code. As mentioned earlier, it's crucial for Rust to be easy-to-use, especially since it introduces a fair number of concepts that may be unfamiliar to the user. Even experienced users may need to use --explain text from time to time when they encounter unfamiliar messages.

While we have --explain text today, it uses generic examples that require the user to mentally translate the given example into what works for their specific situation.

```
You tried to move out of a value which was borrowed. Erroneous code example:

use std::cell::RefCell;

struct TheDarkKnight;

impl TheDarkKnight {
fn nothing_is_true(self) {}
}
...
```

*Example of the current --explain (showing E0507)*

To help users, this RFC proposes a new `--explain errors`. This new mode is more textual error reporting mode that gives additional explanation to help better understand compiler messages. The end result is a richer, on-demand error reporting style.

```
error: cannot move out of borrowed content
--> /Users/jturner/Source/errors/borrowck-move-out-of-vec-tail.rs:30:17

I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This first-person, conversational style drives me nuts. I find it patronising and hard to comprehend. AFAIK there is only one other language that does this and there is no evidence that it is in any way better than traditional style.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback. I should have probably reiterated again in this section that this RFC's focus is the error format, not the error writing style. The way these errors are written will be a follow-up RFC as we get more experience with what works best.

I list some evidence, by way of mentioning it being praised by noted programmers, but anecdotally, I've been watching the feedback from other programmers around the web. They praise the elm style for being a) one of the best error message styles, b) so thorough they use it as a way to pick back up an editing session if it's cut short on a train ride, c) more new user-friendly than other error styles.

Note, too, that we're not saying that the expanded style is better for all users, but rather it's one we should support for users who wish to have more information. Some users may enable --explain errors and leave it on for all projects, and that would be fine, while others may prefer more compact formats.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify, I didn't mean the style of integrating user code and error mesage, I meant the style of using first and second person pronouns and anthropomorphisation of the compiler, rather than a more academic tone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that's why I say "We expect to follow this RFC with a supplemental RFC that provides a writing style guide for error messages and explain text with a focus on readability and education." in the RFC. The writing style will need to be a separate RFC.


29 | match tail {

In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail`
value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents.

30 | [Foo { string: aa },
| ^^ cannot move out of borrowed content

You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix
might look this:

30 | [Foo { string: ref aa },

```

# Detailed design

The RFC is separated into two parts: the format of error messages and the format of expanded error messages (using `--explain errors`).

## Format of error messages

The proposal is a lighter error format focused on the code the user wrote. Messages that help understand why an error occurred appear as labels on the source. The goals of this new format are to:

* Create something that's visually easy to parse
* Remove noise/unnecessary information
* Present information in a way that works well for new developers, post-onboarding, and experienced developers without special configuration
* Draw inspiration from Elm as well as Dybuk and other systems that have already improved on the kind of errors that Rust has.

In order to accomplish this, the proposed design needs to satisfy a number of constraints to make the result maximally flexible across various terminals:

* Multiple errors beside each other should be clearly separate and not muddled together.
* Each error message should draw the eye to where the error occurs with sufficient context to understand why the error occurs.
* Each error should have a "header" section that is visually distinct from the code section.
* Code should visually stand out from text and other error messages. This allows the developer to immediately recognize their code.
* Error messages should be just as readable when not using colors (eg for users of black-and-white terminals, color-impaired readers, weird color schemes that we can't predict, or just people that turn colors off)
* Be careful using “ascii art” and avoid unicode. Instead look for ways to show the information concisely that will work across the broadest number of terminals. We expect IDEs to possibly allow for a more graphical error in the future.
* Where possible, use labels on the source itself rather than sentence "notes" at the end.
* Keep filename:line easy to spot for people who use editors that let them click on errors

### Header

```
error[E0499]: cannot borrow `foo.bar1` as mutable more than once at a time
--> src/test/compile-fail/borrowck/borrowck-borrow-from-owned-ptr.rs:29:22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent was introduced to align 'error:' and '-->', but we don't need it now.

error: msg
   --> file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally, the indent aligns with the column below it. For example:

 --> file
  |
2 |

or

    --> file
     |
9999 |

IMHO, having some indent there still looks better and makes it easier to find the filename/line info, though there are other alternatives we could explore, like:

   --> file
     |
9999 |

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are right.

```

The header still serves the original purpose of knowing: a) if it's a warning or error, b) the text of the warning/error, and c) the location of this warning/error. We keep the error code, now a part of the error indicator, as a way to help improve search results.

### Line number column

```
|
28 |
|
29 |
|
30 |
31 |
|
```

The line number column lets you know where the error is occurring in the file. Because we only show lines that are of interest for the given error/warning, we elide lines if they are not annotated as part of the message (we currently use the heuristic to elide after one un-annotated line).

Inspired by Dybuk and Elm, the line numbers are separated with a 'wall', a separator formed from pipe('|') characters, to clearly distinguish what is a line number from what is source at a glance.

As the wall also forms a way to visually separate distinct errors, we propose extending this concept to also support span-less notes and hints. For example:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand this. AIUI, the point of the code-oriented layout is that multiple notes ('sub-errors') on the same block of code are displayed together and inline. So the wall ties the errors together rather than separates them out.

In the example below, I'm not sure what is being communicated by the = (which, bikeshed, I'm not a fan of because it is so commonly used in code) - are the notes meant to be tied to the walled source or distinct from it? If the former then this seems to imply that the notes come logically at the end of the code, but in fact they might in general be related to an error in the middle of the displayed lines of codes.

On a purely aesthetic level, I'm not sure if <indent>= note... either makes the notes more distinct or more tied to the walled code (c.f., <no indent> note..., but (to me) it does read as more 'line-noisey'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple kinds of notes, the knowledge of which I unfortunately assume in a few places. Hints, suggestions, notes, etc can all be associated with an error message. Notes (and the like) which do not have span information need to still be associated with the error. They're less common now that many notes are becoming labels, but they still exist in a few cases. Where un-spanned notes are associated with an error, I suggest printing them at the bottom of the error block.

A bit of backstory here. In a previous iteration of the design (currently in nightly), they are left-aligned and colored. @alexcrichton showed us some examples at one point where he was getting a lot of error messages in a row, all with unspanned notes. The result had a negative impact on readability because it was harder to discern each separate error. In this version of the format, we group unspanned notes and indent them so your eye can see a whole error message as a single group, by skipping the indented line and looking for where the next message starts.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense for motivation. I wonder if we can do better than = though.


```
92 | config.target_dir(&pkg)
| ^^^^ expected `core::workspace::Workspace`, found `core::package::Package`
= note: expected type `&core::workspace::Workspace<'_>`
= note: found type `&core::package::Package`
```
### Source area

```
let bar1 = &mut foo.bar1;
-------- first mutable borrow occurs here
let _bar2 = &mut foo.bar1;
^^^^^^^^ second mutable borrow occurs here
*bar1;
}
- first borrow ends here
```

The source area shows the related source code for the error/warning. The source is laid out in the order it appears in the source file, giving the user a way to map the message against the source they wrote.

Key parts of the code are labeled with messages to help the user understand the message.

The primary label is the label associated with the main warning/error. It explains the **what** of the compiler message. By reading it, the user can begin to understand what the root cause of the error or warning is. This label is colored to match the level of the message (yellow for warning, red for error) and uses the ^^^ underline.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could the error label be standard weight text rather than bold? One problem I have with the new messages is all the bold text - it makes the really important stuff (which should be bold) harder to pick out and (in my console at least) the bold text is quite a bit harder to read than standard weight.


Secondary labels help to understand the error and use blue text and --- underline. These labels explain the **why** of the compiler message. You can see one such example in the above message where the secondary labels explain that there is already another borrow going on. In another example, we see another way that primary and secondary work together to tell the whole story for why the error occurred.

Taken together, primary and secondary labels create a 'flow' to the message. Flow in the message lets the user glance at the colored labels and quickly form an educated guess as to how to correctly update their code.

Note: We'll talk more about additional style guidance for wording to help create flow in the subsequent style RFC.

## Expanded error messages

Currently, --explain text focuses on the error code. You invoke the compiler with --explain <error code> and receive a verbose description of what causes errors of that number. The resulting message can be helpful, but it uses generic sample code which makes it feel less connected to the user's code.

We propose adding a new `--explain errors`. By passing this to the compiler (or to cargo), the compiler will switch to an expanded error form which incorporates the same source and label information the user saw in the default message with more explanation text.

```
error: cannot move out of borrowed content
--> /Users/jturner/Source/errors/borrowck-move-out-of-vec-tail.rs:30:17

I’m trying to track the ownership of the contents of `tail`, which is borrowed, through this match statement:

29 | match tail {

In this match, you use an expression of the form [...]. When you do this, it’s like you are opening up the `tail`
value and taking out its contents. Because `tail` is borrowed, you can’t safely move the contents.

30 | [Foo { string: aa },
| ^^ cannot move out of borrowed content

You can avoid moving the contents out by working with each part using a reference rather than a move. A naive fix
might look this:

30 | [Foo { string: ref aa },
```

*Example of an expanded error message*

The expanded error message effectively becomes a template. The text of the template is the educational text that is explaining the message more more detail. The template is then populated using the source lines, labels, and spans from the same compiler message that's printed in the default mode. This lets the message writer call out each label or span as appropriate in the expanded text.

It's possible to also add additional labels that aren't necessarily shown in the default error mode but would be available in the expanded error format. This gives the explain text writer maximal flexibility without impacting the readability of the default message. I'm currently prototyping an implementation of how this templating could work in practice.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that there are two orthogonal features here that users might want either or both of: template explanations and inline explanations. There seems to be use cases for both (for both beginning and advanced users) so it might be nice to have access to both separately. It might be that we want to only expose one option to rustc users, but that IDEs can configure things in more detail (this might just come down to ignoring or rendering parts of the JSON errors, in practice).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned earlier, I think the --explain errors should be a separate RFC, but in particular it seems premature to have an RFC without the templating language or other details specified.

Copy link
Contributor Author

@sophiajt sophiajt Jul 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow this point. The templating language piece seems more engineer-related rather than design related. We could iterate the templating piece without impacting the user's experience.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I think of the interface we present to Rustdoc authors as a user-facing API, not just the output of Rustdoc. Therefore, there is a design question there that should be answered in an RFC.

Copy link
Contributor Author

@sophiajt sophiajt Jul 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I'm picturing this, only compiler authors who are writing error messages will see the templates. It's not intended to be a user-facing API for users to write their own. Similar to the existing error reporting mechanism the compiler currently uses, it would just have an additional "backend" of outputting to a template.

It sounds like you might have bigger ambitions for it? If so, we should chat.

## Tying it together

Lastly, we propose that the final error message:

```
error: aborting due to 2 previous errors
```

Be changed to notify users of this ability:

```
note: compile failed due to 2 errors. You can compile again with `--explain errors` for more information
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to clarify, this should only happen if in fact there is more information to show?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, with the assumption that in the fullness of time, we have --explain errors cover most, if not all, errors.

```

# Drawbacks

Changes in the error format can impact integration with other tools. For example, IDEs that use a simple regex to detect the error would need to be updated to support the new format. This takes time and community coordination.

While the new error format has a lot of benefits, it's possible that some errors will feel "shoehorned" into it and, even after careful selection of secondary labels, may still not read as well as the original format.

There is a fair amount of work involved to update the errors and explain text to the proposed format.

# Alternatives

Rather than using the proposed error format format, we could only provide the verbose --explain style that is proposed in this RFC. Respected programmers like [John Carmack](https://twitter.com/ID_AA_Carmack/status/735197548034412546) have praised the Elm error format.

```
Detected errors in 1 module.

-- TYPE MISMATCH ---------------------------------------------------------------
The right argument of (+) is causing a type mismatch.

25| model + "1"
^^^
(+) is expecting the right argument to be a:

number

But the right argument is:

String

Hint: To append strings in Elm, you need to use the (++) operator, not (+). <http://package.elm-lang.org/packages/elm-lang/core/latest/Basics#++>

Hint: I always figure out the type of the left argument first and if it is acceptable on its own, I assume it
is "correct" in subsequent checks. So the problem may actually be in how the left and right arguments interact.
```

*Example of an Elm error*

In developing this RFC, we experimented with both styles. The Elm error format is great as an educational tool, and we wanted to leverage its style in Rust. For day-to-day work, though, we favor an error format that puts heavy emphasis on quickly guiding the user to what the error is and why it occurred, with an easy way to get the richer explanations (using --explain) when user wants them.

# Stabilization

Currently, these new rust error format is available on nightly using the ```export RUST_NEW_ERROR_FORMAT=true``` environment variable. Ultimately, this should become the default. In order to get there, we need to ensure that the new error format is indeed an improvement over the existing format in practice.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit grammar: "these ... is"


How do we measure the readability of error messages? This RFC details an educated guess as to what would improve the current state but shows no ways to measure success.

Likewise, While some of us have been dogfooding these errors, we don't know what long-term use feels like. For example, after a time does the use of color feel excessive? We can always update the errors as we go, but it'd be helpful to catch it early if possible.

# Unresolved questions

There are a few unresolved questions:
* Editors that rely on pattern-matching the compiler output will need to be updated. It's an open question how best to transition to using the new errors. There is on-going discussion of standardizing the JSON output, which could also be used.
* Can additional error notes be shown without the "rainbow problem" where too many colors and too much boldness cause errors to become less readable?