New lint: large Err(E) variant, as compared with Ok(T), for Result<T, E> #3884

dekellum · 2019-03-14T23:52:03Z

I've found the large_enum_variant lint from clippy quite educational and was hoping it would be able to lint on what I'll call unbalanced Result<T, E> specified types, in particular when the E (Error) type is really large but the T type is quite small. From my reading, this is often a performance concern, e.g. wasting lots of stack space for an infrequently used error path. On the other hand if the T (Ok, happy path) is larger, then the cost of E should be negligible. Make sense?

After not being able to trigger clippy with Result<(), BigError> (where BigError is 3,000 bytes) I tried using my own MyResult, along the lines of:

pub enum MyResult<T, E> {
    Ok(T),
    Err(E),
}

/// MyResult concrete type with zero-bytes `Ok` and 3K bytes `Err` variant,
/// for a fixed stack size of 3001 bytes (asserted).
type Unbalanced = MyResult<(), BigError>;

fn f() -> Unbalanced {
    MyResult::Err(BigError::Large(error::Biggie { blob: [0u8; 3000] }))
}

But no clippy lint at present on either the fully specified type Unbalanced alias or the f() function definition.

Have/would you consider extending this lint to all generic enums, where they are fully specified, or perhaps more practically, adding a new lint specific to core::result::Result? The suggestion to use Box<E> would be applicable in the latter case.
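As a rough illustration of the Box<E> suggestion, the sketch below compares the size of a Result with the error stored inline against one with the error boxed. `BigError` here is a hypothetical stand-in (a single 3,000-byte array), not the type from the report, and the exact sizes printed depend on the target and compiler.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the `BigError` in the report: a 3,000-byte payload.
pub struct BigError {
    pub blob: [u8; 3000],
}

/// Large error stored inline: every value of this type reserves room for it.
pub type Unbalanced = Result<(), BigError>;

/// Same error behind a pointer: only the error path pays for a heap allocation.
pub type Boxed = Result<(), Box<BigError>>;

/// Example of a fallible function using the boxed form.
pub fn fails_boxed() -> Boxed {
    Err(Box::new(BigError { blob: [0u8; 3000] }))
}

fn main() {
    println!("inline Err: {} bytes", size_of::<Unbalanced>());
    println!("boxed Err:  {} bytes", size_of::<Boxed>());

    // The boxed form only has to hold a pointer, so it is much smaller.
    assert!(size_of::<Boxed>() < size_of::<Unbalanced>());
    assert!(fails_boxed().is_err());
}
```

The trade-off is an allocation on the error path in exchange for a small return value on the happy path.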

I don't claim to understand the clippy code base well, but this was the only hint I could find (an @oli-obk comment) that generic enums were considered (and are likely ignored) for the original lint:

rust-clippy/clippy_lints/src/large_enum_variant.rs

Line 71 in bb41b16

// don't count generics by filtering out everything

% cargo clippy -V
clippy 0.0.212 (016d92d 2019-03-10)


oli-obk · 2019-03-16T11:55:50Z

My comment refers to generic enums that are still too generic to compute the size. Your case completely monomorphizes the generic enum to something where we can compute the size, and I agree with you that we should be doing this.

I think we should start small and only look at Result types to begin with, since most of the time one does not have control over arbitrary enums from other crates, so there might be no way to change it.

I think we should start with a MIR lint that goes over all arguments and user defined local variables and the return type. For each of those types, use the walk() method that gives you an iterator over all the types present inside that type. Then, check for occurrences of ty::Adt whose DefId is the same as Result's DefId. Then run the same variant size checking code that we are running for enums in the large_enum_variant lint.
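The distinction between a still-generic enum and a fully monomorphized one can be seen outside the compiler as well. The sketch below is only an analogy using `std::mem::size_of`, not clippy or rustc internals; `my_result_size` is a hypothetical helper. Inside the generic function the size is not a fixed number; it becomes one once the type arguments are chosen.

```rust
use std::mem::size_of;

/// Same shape as the `MyResult` from the report.
pub enum MyResult<T, E> {
    Ok(T),
    Err(E),
}

/// Inside this generic function `T` and `E` are unknown; the size only
/// becomes a concrete number once a caller picks the type arguments.
pub fn my_result_size<T, E>() -> usize {
    size_of::<MyResult<T, E>>()
}

fn main() {
    // Two monomorphizations of the same enum with very different sizes.
    let small = my_result_size::<(), u8>();
    let large = my_result_size::<(), [u8; 3000]>();
    println!("MyResult<(), u8>: {small} bytes");
    println!("MyResult<(), [u8; 3000]>: {large} bytes");

    assert!(large >= 3000);
    assert!(small < large);
}
```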

dekellum · 2019-03-16T16:10:16Z

Good to hear this is possible as a MIR lint, thanks @oli-obk!

Do you think this Result case deserves its own new top level lint? A reason I think it might is that in the opposite case, where the Ok(T) is large and the Err(E) is small, there is likely no reason to raise a lint. Also, in the large Err(E) case it would be helpful to further elaborate (at least on the "ALL the Clippy Lints" page) why this can be bad in particular, and perhaps to independently adjust the threshold size difference at which it is reported?

oli-obk · 2019-03-16T18:09:06Z

Yea, making it its own lint just for Result makes sense.

lukaslueg · 2022-08-21T12:18:31Z

@oli-obk Could you point out, wrt #3884 (comment), why we need to walk locals and arguments at all? I'm missing the point. Isn't it enough to walk the E in a Result<_, E> as far as we can and come up with an "at least" size, e.g. for Result<(), (T, [u8; 512])>, by simply summing up the fields whose sizes we can determine? The size will be wrong due to generics, but it's the best we can do for library code, and the final E will certainly not be smaller.
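The "at least" reasoning can be checked from the outside with `std::mem::size_of`. The sketch below is not the lint's implementation; `err_size` and `KNOWN_PART` are hypothetical names. It counts only the field whose size is known without `T` and confirms that several concrete choices of `T` never fall below that bound.

```rust
use std::mem::size_of;

/// Size of the example error type `(T, [u8; 512])` for a chosen `T`.
pub fn err_size<T>() -> usize {
    size_of::<(T, [u8; 512])>()
}

/// Lower bound obtained by counting only the field whose size is known without `T`.
pub const KNOWN_PART: usize = size_of::<[u8; 512]>();

fn main() {
    let samples = [
        ("()", err_size::<()>()),
        ("u64", err_size::<u64>()),
        ("String", err_size::<String>()),
    ];

    for (name, size) in samples {
        println!("T = {name}: {size} bytes (lower bound {KNOWN_PART})");
        // Whatever `T` turns out to be, the tuple still contains the array.
        assert!(size >= KNOWN_PART);
    }
}
```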

oli-obk · 2022-08-22T10:33:46Z

I don't remember and can't make much sense of my own comment. Your explanation for a plan sounds good.

Initial implementation `result_large_err`

This is a shot at #6560, #4652, and #3884. The lint checks for `Result` being returned from functions/methods where the `Err` variant is larger than a configurable threshold (the default of which is 128 bytes).

There has been some discussion around this, which I'll try to quickly summarize:

* A large `Err`-variant may force an equally large `Result` if `Err` is actually bigger than `Ok`.
* There is a cost involved in large `Result`, as LLVM may choose to `memcpy` them around above a certain size.
* We usually expect the `Err` variant to be seldom used, but pay the cost every time.
* `Result` returned from library code has a high chance of bubbling up the call stack, getting stuffed into `MyLibError { IoError(std::io::Error), ParseError(parselib::Error), ...}`, exacerbating the problem.

This PR deliberately does not take into account comparing the `Ok` to the `Err` variant (e.g. a ratio, or one being larger than the other). Rather we choose an absolute threshold for `Err`'s size, above which we warn. The reason for this is that `Err`s probably get `map_err`'ed further up the call stack, and we can't draw conclusions from the ratio at the point where the `Result` is returned. A relative threshold would also be less predictable, while not accounting for the cost of LLVM being forced to generate less efficient code if the `Err`-variant is _large_ in absolute terms.

We lint private functions as well as public functions, as the perf-cost applies to in-crate code as well.

In order to account for type-parameters, I conjured up `fn approx_ty_size`. The function relies on `LateContext::layout_of` to compute the actual size, and in case of failure (e.g. due to generics) tries to come up with an "at least size". In the latter case, the size is obviously wrong, but the inspected size certainly can't be smaller than that.

Please give the approach a heavy dose of review, as I'm not actually familiar with the type-system at all (read: I have no idea what I'm doing). The approach does, however flimsy it is, allow us to successfully lint situations like

```rust
pub union UnionError<T: Copy> {
    _maybe: T,
    _or_perhaps_even: (T, [u8; 512]),
}

// We know `UnionError<T>` will be at least 512 bytes, no matter what `T` is
pub fn param_large_union<T: Copy>() -> Result<(), UnionError<T>> {
    Ok(())
}
```

I've given some refactoring to `functions/result_unit_err.rs` to re-use some bits. This is also the groundwork for #6409.

The default threshold is 128 because of #4652 (comment). `lintcheck` does not trigger this lint for a threshold of 128. It does warn for 64, though.

The suggestion currently is the following, which is just a placeholder for discussion to be had. I did have the computed size in a `span_label`. However, that might cause both ui-tests here and lints elsewhere to become flaky wrt their output (as the size is platform dependent).

```
error: the `Err`-variant returned via this `Result` is very large
  --> $DIR/result_large_err.rs:36:34
   |
LL | pub fn param_large_error<R>() -> Result<(), (u128, R, FullyDefinedLargeError)> {
   |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The `Err` variant is unusually large, at least 128 bytes
```

changelog: Add [`result_large_err`] lint
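The point about errors bubbling up into a wrapping library error can be sketched as follows. All names here (`ParseError`, `MyLibError`, `run`, `THRESHOLD`) are hypothetical and only mirror the shape mentioned in the summary; the constant repeats the 128-byte default stated above. The wrapper can never be smaller than its largest variant, so one large inner error makes every `Result` that carries the wrapper large as well.

```rust
use std::mem::size_of;

/// The default threshold stated in the PR description, in bytes.
pub const THRESHOLD: usize = 128;

/// Hypothetical error from a parsing dependency, carrying a 512-byte buffer inline.
pub struct ParseError {
    pub context: [u8; 512],
}

/// Hypothetical library error that wraps the errors of its dependencies.
pub enum MyLibError {
    Io(std::io::Error),
    Parse(ParseError),
}

/// A function with this signature has an `Err` variant above the threshold,
/// which is the situation the PR description says the lint looks for.
pub fn run() -> Result<(), MyLibError> {
    Ok(())
}

fn main() {
    println!("ParseError: {} bytes", size_of::<ParseError>());
    println!("MyLibError: {} bytes", size_of::<MyLibError>());
    println!("Result<(), MyLibError>: {} bytes", size_of::<Result<(), MyLibError>>());

    // An enum is at least as large as its largest variant.
    assert!(size_of::<MyLibError>() >= size_of::<ParseError>());
    assert!(size_of::<MyLibError>() > THRESHOLD);
    assert!(run().is_ok());
}
```

Boxing the large variant (for example `Parse(Box<ParseError>)`) is one way to bring the wrapper back under the threshold.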

oli-obk added the C-enhancement (Category: Enhancement of lints, like adding more cases or adding help messages) and T-MIR (Type: This lint will require working with the MIR) labels Mar 16, 2019

dekellum changed the title ~~Extend large_enum_variant lint to all generics, or for Result<T,E>~~ New lint: large Err(E) variant, as compared with Ok(T), for Result<T, E> Mar 18, 2019

flip1995 mentioned this issue Jan 15, 2020

Implement issue finder for lint names #5049

Closed

phansch added the A-lint (Area: New lints) label Dec 21, 2020

Nemo157 mentioned this issue Jan 7, 2021

Lint for large error types #6560

Open

lukaslueg mentioned this issue Aug 24, 2022

Initial implementation result_large_err #9373

Merged

Alexendoo linked a pull request Aug 30, 2022 that will close this issue

Initial implementation result_large_err #9373

Merged

bors closed this as completed in #9373 Aug 30, 2022
