Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: fmt::Show and fmt::String guidelines #565

Merged
merged 3 commits into from
Jan 20, 2015

Conversation

aturon
Copy link
Member

@aturon aturon commented Jan 9, 2015

A recent RFC split what was previously fmt::Show into two traits, fmt::Show and fmt::String, with format specifiers {:?} and {} respectively.

That RFC did not, however, establish complete conventions for when to implement which of the traits, nor what is expected from the output. That's what this RFC seeks to do.

Rendered

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

I think it's worth elaborating on the problem of nesting one type's "for human consumption" into a generic type's "for human consumption".

Imagine that cargo::Package implements a String representation for human consumption on the command-line (complete with colors, tabs, newlines, list formatting, etc.). Separately, Vec has created a String representation for human consumption. Without coordination, embedding one inside the other simply doesn't make sense.

The person who implemented String on the inner type didn't implement it for embedding in generic contexts, just top-level contexts. It would be an inappropriate constraint on authors of String that their implementation should be appropriate for use in generic embedded contexts.

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

@aturon I'm struggling with how to decide when type information can be left off of a Show implementation.

Can you suggest some straw-man guidelines that differentiate between 1u32 and Person { name: "Yehuda" }? What is the right way to print a String? Simple wrapper containers (Rc, Option)?

I'd like to flesh that out and get it added to the RFC 😄

@Diggsey
Copy link
Contributor

Diggsey commented Jan 9, 2015

Is it possible to only show type information for the top level?
[1,2,3] : Vec
Is much more useful than
[1u32,2u32,3u32]

The only exception would be when you have dynamic polymorphism, in which case you'd need to output the specific type, eg.
[1 : u32, 2 : u8, 3 : u16] : Vec< Box< Int > >

@dgrunwald
Copy link
Contributor

It is not a goal for the debugging output to be valid Rust source.

Why?
I think where practicable, Show should produce valid Rust code. Python's repr() works this way; and I've found that to be highly useful for copy-pasting debugging output into unit tests. (python has the same str()/repr() split for user-output and debugging-output)

In that light, I'd say integer suffixes can be left off because the compiler can almost certainly infer them when building the unit test.

Also, how do we Show cyclic data structures? In Python, the container types like list detect recursion in repr(). For example, a list containing itself has a repr of [[...]]]. Not sure whether we'd want to handle this case, or just blow the stack.

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

@Diggsey Not without multiple traits (or an additional level parameter).

But perhaps more importantly, I think it's clear that Person { name: "Yehuda" } should include the type even at the top level.

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

Also, how do we Show cyclic data structures? In Python, the container types like list detect recursion in repr(). For example, a list containing itself has a repr of [[...]]]. Not sure whether we'd want to handle this case, or just blow the stack.

In Ruby:

$ irb                                                                                                                                   ◼
2.1.5 :001 > x = {}
 => {}
2.1.5 :002 > x[:y] = x
 => {:y=>{...}}

In Node:

$ node                                                                                                                                  ◼
> x = {}
{}
> x.y = x
{ y: [Circular] }

@aturon I don't see a clear way to implement this with the existing Show trait. Is there a clever solution?

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

@wycats Yeah, I'm also uncomfortable with how vague the initial proposal is here, and hoping that we can iterate toward something crisper.

One problem is to balance uniformity against readability. It's hard to see how to avoid special cases. For example, the outcome I'd like for fmt::Show is something like:

  • Scalars print as unsuffixed literals.
  • Strings print as normal quoted notation, with escapes.
  • Smart pointers print as whatever they point to.
  • Structs print as you'd normally construct them: MyStruct { f1: ..., f2: ... }
  • Enums print as you'd construct their variants (possibly with special cases for things like Option and single-variant enums?).
  • Containers print using some notation that makes their type and contents clear. (Since we lack literals for all container types, this will be ad hoc).

Those are some ad hoc rules! But I think there is a clear intent behind them: to make it easy to make sense of compound data of various kinds without overwhelming debugging output with every last bit of type information. Essentially, types at the leaves are left off, as are types of "trivial" wrappers (like smart pointers), but composed things like structs and enums show type information.

Compare:

Vec { 
  MyStruct { f1: 0u32, f2: "some\nstring", f3: MyNullaryVariant }, 
  MyStruct { f1: 1u32, f2: "", f3: MyUnaryVariant(0.3f32) }
}

to

[MyStruct { f1: 0, f2: "some\nstring", f3: MyNullaryVariant },
 MyStruct { f1: 1, f2: "", f3: MyUnaryVariant(0.3) }] 

I feel like the second version is much less noisy, but hasn't lost any interesting information.

I would love some help making this more crisp, if others share similar preferences.

@Diggsey
Copy link
Contributor

Diggsey commented Jan 9, 2015

@wycats Well one way would be to have the Show trait always omit the type if known statically, and then have a shortcut way to "show" both a value and it's compile-time known type.

Also, I don't think your example is necessarily true, I'd prefer this:

[{name: "Adam"}, {name: "Bob"}, {name: "Claire"}] : Vec<Person>

Than this (especially for longer lists, and doubly for when the type is more complex, eg. Person< Some type params... > or something):

[Person {name: "Adam"}, Person {name: "Bob"}, Person {name: "Claire"}]

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

@aturon quick clarification: When you say "String" there, you mean every kind of String (String and &str), right?

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

@Diggsey can you show a sample implementation of a sample Show trait that does what you're thinking?

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

Structs print as you'd normally construct them

@aturon you mean at the top-level only, right?

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

I feel like the second version is much less noisy, but hasn't lost any interesting information.

Unless you're debugging an overflow or precision issue ;)

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

@wycats

@aturon quick clarification: When you say "String" there, you mean every kind of String (String and &str), right?

Yes.

Structs print as you'd normally construct them

@aturon you mean at the top-level only, right?

No, I mean at any level. See the example -- MyStruct is printed even though it's not at the top-level.

Given that we want #[derive(Show)] to work, unless we passed along some extra information about "being at the top level" I don't see how we'd make this distinction.

And anyway, for nested structures I think it's still useful to see the struct name:

Foo { f1: Bar { b1: 0, b2: "hello" }, f2: Baz(3) }

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

I feel like the second version is much less noisy, but hasn't lost any interesting information.

Unless you're debugging an overflow or precision issue ;)

My point is just that this information is readily available elsewhere, especially if we print the type name for any enclosing structs.

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

No, I mean at any level.

I mean that the CONTENTS of a struct are not printed as you would normally construct them. Only the structure.

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

My point is just that this information is readily available elsewhere, especially if we print the type name for any enclosing structs

And my point is that sometimes you're poring through tons of trace logs and going back to the original definition is very time consuming if you actually care.

@dgrunwald
Copy link
Contributor

@wycats: I just looked at the python implementation, it simply maintains a list of containers that currently have an active repr() invocation on stack. The list is stored in TLS.
We could do the same without having to change the Show trait.

@wycats
Copy link
Contributor

wycats commented Jan 9, 2015

@dgrunwald that seems... not ideal for a Rust Show trait. @aturon am I crazy?

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

@wycats

No, I mean at any level.

I mean that the CONTENTS of a struct are not printed as you would normally construct them. Only the structure.

I'm afraid I still don't understand. The contents of the structure are printed according to whatever their fmt::Show implementation does -- compositionally. In most cases this would be done via #[derive(Show)].

My point is just that this information is readily available elsewhere, especially if we print the type name for any enclosing structs

And my point is that sometimes you're poring through tons of trace logs and going back to the original definition is very time consuming if you actually care.

Maybe I'm way off base, but it seems like if we're printing the surrounding type information -- which struct and field is this going into -- it should be very very easy to discover the type information. That's the sense in which it seems highly redundant to me.

I suppose in some cases being reminded of the type information in debug output could help you make a connection you wouldn't otherwise? Is that what you have in mind?

@dgrunwald
Copy link
Contributor

@wycats: The other option would be to change the trait to pass around a &mut Vec<usize> as a parameter. All container types stuff &self as *const _ as uint in there, and remove it on the way out. (the Vec<usize> should probably be encapsulated behind a &mut RecursionGuard API)

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

@dgrunwald

It is not a goal for the debugging output to be valid Rust source.

Why?
I think where practicable, Show should produce valid Rust code. Python's repr() works this way; and I've found that to be highly useful for copy-pasting debugging output into unit tests. (python has the same str()/repr() split for user-output and debugging-output)

I should have laid this out more explicitly, but while I can see the benefit of what you're suggesting, I worry that it would lead to much more verbose debugging output in general. Further, in a lot of cases in practice it's not really possible to do this, because the internals of a data type are private; at the very least, the debugging output might involve a call to a constructor, etc.

That said, I think in practice the conventions I'm suggesting will put you in the ballpark; I mostly wanted to clarify that it's not a hard-and-fast rule.

In that light, I'd say integer suffixes can be left off because the compiler can almost certainly infer them when building the unit test.

Agreed.

@Diggsey
Copy link
Contributor

Diggsey commented Jan 9, 2015

@wycats Types would implement Show such that they'd never output type information, and to actually display a value when debugging it would do this:

fn type_of<T>(_: &T) -> &'static str {
    unsafe { (*std::intrinsics::get_tydesc::<T>()).name }
}

fn main() {
    let a = Person { name: "Bob" };
    println!("{?:} : {}", a, type_of(&a));
}

Although obviously it would be nicer if there was a better syntax (maybe there could be a format specifier which automatically appended the type information or something)

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

cc @seanmonstar

@aturon
Copy link
Member Author

aturon commented Jan 9, 2015

cc @nick29581

@seanmonstar
Copy link
Contributor

Regarding not printing integer suffixes: I see the pain. But I also have cases where I want to see them. Take this simple example:

fn do_things<T: Show>(t: T) {
    debug!("doing things with {:?}", t);
}
do_things(1is);
do_things(1u64);
do_things((1u8, 2u8));

In this case, I'd need to be able to see what I got, since I don't have the actual type definition right there. But I definitely see the grossness in printing it for a big vector. Could we make use of the alternate # syntax to turn it on/off here?

If I had my way, I'd go down the route of making everything much noisier in debug outputs.

debug!("{:?}", "foo"); // &"foo"
debug!("{:?}", "foo".to_string()); // String { "foo" }
// etc

@Screwtapello
Copy link

It is not a goal for the debugging output to be valid Rust source.

However, Rust source is a notation that's good for concisely and accurately describing Rust values, and one that Rust programmers are likely to be familiar with. Even if a Show implementation doesn't produce 100% guaranteed copy-and-pastable Rust source as output, using Rust source conventions and idioms is a good idea.

@jpernst
Copy link

jpernst commented Jan 22, 2015

Other than unwrap() being frowned upon for user feedback, there are cases where the "error" value of a result isn't just a value representing the error, but also a "recovery" value of sorts. For example, I have an API like this:

struct ContextInactive {
    fn make_active (self) -> Result<Context, (GraphicsError, ContextInactive)> { ... }
}

Since this takes self by value, the original struct would be lost if the function fails, hence it is returned in the tuple of the "error" value so you can recover it. Another possibility is to add the original object as a parameter of the error object, but this isn't ideal since moving the original object out of the error would render the error unusable after the move, preventing you from retaining it for logging purposes or propagating it up the stack. Further, the original object has lifetime constraints as well as a destructor, so it would restrict the life of the error object unnecessarily.

In short, I agree that the error value of Result should not require Display (Debug is fine though, of course).

@nikomatsakis
Copy link
Contributor

(Note though that this requirement is only imposed for unwrap, so it's not that big a deal whichever way it goes.)

@seanmonstar
Copy link
Contributor

I find I only use unwrap to ease writing tests. Having to impl Display as
well would cause more friction when writing tests, and we never want that.

On Thu, Jan 22, 2015, 1:21 PM Niko Matsakis [email protected]
wrote:

(Note though that this requirement is only imposed for unwrap, so it's
not that big a deal whichever way it goes.)


Reply to this email directly or view it on GitHub
#565 (comment).

@aturon
Copy link
Member Author

aturon commented Jan 22, 2015

FWIW, I'm finding the arguments in favor of Debug for unwrap to be quite persuasive. The main downside I see is that for types that do implement Display, you'd generally prefer the Display output to Debug. But as many have said, using unwrap on a Result in production code is fairly rare and is not intended to provide great output (since it indicates an assertion failure). I'm happy to revert that part of the change here.

However, I think @alexcrichton may have some further thoughts.

@jsanders
Copy link

I think it would be nice to have all three of unwrap that uses Debug output, expect that uses the given message, and another method (perhaps unwrap_display) that uses Display output. I could see using unwrap for testing and debugging, expect for user-presentable messages for more one-off errors, and unwrap_display for errors that are presented to a user from a number of places.

@carllerche
Copy link
Member

@aturon I think that w/ specialization, there can be a default Debug impl for Display types, but I believe that if there is a Debug impl, you want that one on panic given that it is an unexpected event, you want as much info as possible.

@alexcrichton
Copy link
Member

After reading over these comments I reached the same conclusion as @carllerche, which I believe for now sounds like we should to back to Debug and we can use specialization (if we get it) to use Display instead.

@aturon
Copy link
Member Author

aturon commented Jan 23, 2015

@carllerche

@aturon I think that w/ specialization, there can be a default Debug impl for Display types, but I believe that if there is a Debug impl, you want that one on panic given that it is an unexpected event, you want as much info as possible.

So earlier I misread this and said I agreed. But actually my view is that of @alexcrichton, namely that with specialization we'd prefer Display over Debug on error types, since the whole point of opting into Display is saying that you have a human-readable form to present. It'd be easy enough to opt out of this with an adapter, in any case.

@alexcrichton
Copy link
Member

Hm, one downside of this that I forgot is that the quality of many error messages will go down significantly in some cases. I think that "avoiding panicking as much as possible" is certainly an important method of using the standard library, but it's by no means the only way of using it!

An example is that RecvError has a Debug implementation that just prints RecvError (it's derived) and a Display implementation that prints receiving on a closed channel. One of these is definitely much more easily debuggable than the other, and it would be unfortunate to lose these nice error messages.

@wycats
Copy link
Contributor

wycats commented Jan 23, 2015

I think this is unavoidable given the tools we have available today, but I consider this to be the correct behavior in Ruby:

~/C/htmlbars ❯❯❯ irb
2.2.0 :001 > raise "foo\nbar"
RuntimeError: foo
bar
    from (irb):1
    from /usr/bin/irb:12:in `<main>'
2.2.0 :002 > "foo\nbar".to_s
 => "foo\nbar"
2.2.0 :003 > "foo\nbar".inspect
 => "\"foo\\nbar\""

Printing out escape sequences in Result<T, String>::unwrap() seems wrong to me.

@carllerche
Copy link
Member

I don't believe that the ruby exception throwing is the correct analogy to Result::unwrap(). IMO, correct usage of Result error handling in "production" code would be something like:

if let Err(e) = risky_business() {
    panic!("attempted to perform risky business -- failed with {}; context={}; more-context={}",
           e, foo, bar);
}

In which case, a good Display impl for errors is important. However, using Result::unwrap() is IMO "hackish" (I use it extensively in tests) and I personally would prefer the panic message to default to Debug over Display. In my tests, when something fails, i would rather get as much info as possible (I am assuming that Debug will provide more info about Display)

Imagine the case where the Result error slot is being used to return a value vs. an error type, and this type implements Display as something simple to show the console but Debug is implemented with a much richer set of info. In this case, since my test did unwrap and failed, I would want the richer set of info displayed in the console.

The argument that is being used here is that error types will implement Display as something useful to the user but Debug as a bunch of "gibberish" it seems. I cannot honestly think of an error type case where Display would be implemented as helpful for console output but Debug would explicitly be implemented as hard to read output.

Simply implement Display on error types and have a blanket impl<T: Display> Debug for T where Debug is not explicitly implemented that is useful and it should be enough.

Defaulting to Display when Debug was explicitly defined is basically saying that the Debug output is not helpful for debugging, and when a Result::unwrap actually fails, as per what @nikomatsakis said, a bug has been written and needs to be debugged.

tl;dr

Unwrapping a Result that is Err is a bug which should be debugged. An explicit Debug impl should be better for debugging than an explicit Display impl. Debug impls don't always have to be formatted like a struct dump

@wycats
Copy link
Contributor

wycats commented Jan 23, 2015

@carllerche For tests, couldn't you implement an Assert trait (with an assert method) on various kinds of Results that has exactly the behavior you want?

@carllerche
Copy link
Member

My usage of unwrap in tests is not for assertions, it's to destructure data such that if it is not in the expected structure, it will fail hard. In the cases where the unwrap fails, I simply want debug info so that I can figure out what went wrong.

Again, one should not be using unwrap() in production code unless 100% sure the error case will not happen, in which case the error message doesn't matter anyway :)

@wycats
Copy link
Contributor

wycats commented Jan 23, 2015

@carllerche I am talking about some trait that you could implement that expresses precisely what you're trying to do: "I want to to destructure a Result such that if it is not in the expected structure, it will fail hard and use the Debug trait to display the Error variant"

@carllerche
Copy link
Member

When outputting errors there really are 2 cases.

  1. Handled errors. The error case is specifically handled with code, and when the error happens, it is gracefully handled (logged or something). For example:
if let Err(e) = risky_business() {
    panic!("attempted to perform risky business -- failed with {}; context={}; more-context={}",
           e, foo, bar);
}

In this case, you definitely want the Display impl as you are logging to a file, console, whatever.

  1. Unhandled errors. You were not expecting them. They are bugs, they must be debugged. You want as much info as possible to be able to debug them. In this case, the "pretty" console output (as implemented by Display) is probably not going to be enough. A Debug implementation will provide far more info and you will want that one in the event of a horrible bug as @nikomatsakis put it :) The log friendly version of the error output is probably not going to be enough.

@wycats
Copy link
Contributor

wycats commented Jan 23, 2015

@carllerche I think the correct solution given specialization is probably something like an UnwrapAs trait that is implemented as Debug normally, but as Display for certain targeted types like String.

@SimonSapin
Copy link
Contributor

To add to this, I’m also very concerned about Result::unwrap requiring a trait that is not expected to be implemented on most things. For example, I use Result<_, ()> heavily, and () does not implement Display, making unwrap impossible to use.

I don’t care about the error message when using unwrap, I only use it when an Err value should never happen, and indicates a bug in my code. Regardless of the flavor of formatting being used, called Result::unwrap()on anErr value is not something users know about.

If a nicer error message is desired, expect or panic! with explicit branching can be used, as described by others above.

@alexcrichton
Copy link
Member

@SimonSapin arguably a type like Result<T, ()> is the same as Option<T>, so I'm curious what using Result buys you over using Option? I would expect public-facing APIs to have a custom Error type implementing both the Error trait as well as a custom Display implementation to give a nicer error message.

@tbu-
Copy link
Contributor

tbu- commented Jan 23, 2015

@alexcrichton IIRC the documentation suggests to always use Result in a operation that might fail and Option where a value is optional. Additionally, using Result gives you ergonomics, such as try!.

mbrubeck referenced this pull request in mbrubeck/image Jan 23, 2015
This is currently required by `Result<T, ImageError>::unwrap`.  Fixes build
errors in test files.
@SimonSapin
Copy link
Contributor

@SimonSapin arguably a type like Result<T, ()> is the same as Option, so I'm curious what using Result buys you over using Option?

As @tbu- said, it’s so that I can use try!. (A lot.)

I’ve experimented to extend the try! macro to work with any types that implements a new Triable trait, but I’m not super satisfied with the design. In particular, it does not (yet?) convert between types, so to use try! e.g. with an Option<T> expression you have to be in a function that returns Option<U>.

@SimonSapin
Copy link
Contributor

I’ve also used Result<(), ()> instead of bool for the same reason.

@aturon
Copy link
Member Author

aturon commented Jan 23, 2015

As far as I can tell, there's widespread agreement that we should revert to bounding by Debug for now, and only some question about whether we should prefer to use Display if/when specialization becomes available.

@alexcrichton would you like to make that change, or shall I?

@alexcrichton
Copy link
Member

@tbu- yes and I believe the general practice is to use custom error types instead of () to leverage better error messages, custom Error implementations, as well as future FromError implementations.

@SimonSapin Yes currently try! only goes between instances of Result, but I would expect that the error type should be something more descriptive than () as it forbids the features I listed above.

I've created a PR for this change: rust-lang/rust#21558

@SimonSapin
Copy link
Contributor

I would expect public-facing APIs to have a custom Error type implementing both the Error trait as well as a custom Display implementation to give a nicer error message.

I would expect that the error type should be something more descriptive than () as it forbids the features I listed above.

Result<_, ()> works great for Servo’s CSS parsing, whether or not that was expected. This is not for a public API, this is for having many functions that call each other and use try!, or_else, unwrap, and other Result methods internally.

By the way, (though this might be getting off-topic, sorry) I don’t really see the point of the Error trait as long as it doesn’t support chaining errors and printing "tracebacks" as in some of @mitsuhiko’s experiments.

@aturon
Copy link
Member Author

aturon commented Jan 23, 2015

@SimonSapin

By the way, (though this might be getting off-topic, sorry) I don’t really see the point of the Error trait as long as it doesn’t support chaining errors and printing "tracebacks" as in some of @mitsuhiko’s experiments.

The intent was to standardize on functionality that production error types would implement; it also provides the cause chaining hooks that @mitsuhiko's ideas work from. FWIW, I would like to move forward with some of his ideas, but they basically are back-compat since they just use the Error infrastructure.

@Centril Centril added the A-fmt Proposals relating to std::fmt and formatting macros. label Nov 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-fmt Proposals relating to std::fmt and formatting macros.
Projects
None yet
Development

Successfully merging this pull request may close these issues.