New lint: large Err(E) variant, as compared with Ok(T), for Result<T, E> #3884

Closed
dekellum opened this issue Mar 14, 2019 · 5 comments · Fixed by #9373
Labels
A-lint Area: New lints C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages T-MIR Type: This lint will require working with the MIR

Comments

@dekellum

dekellum commented Mar 14, 2019

I've found the large_enum_variant lint from clippy quite educational and was hoping it would be able to lint on what I'll call unbalanced Result<T, E> specified types, in particular when the E (Error) type is really large but the T type is quite small. From my reading, this is often a performance concern, e.g. wasting lots of stack space for an infrequently used error path. On the other hand if the T (Ok, happy path) is larger, then the cost of E should be negligible. Make sense?

After not being able to trigger clippy with Result<(), BigError> (where BigError is 3,000 bytes) I tried using my own MyResult, along the lines of:

pub enum MyResult<T, E> {
    Ok(T),
    Err(E),
}

/// MyResult concrete type with zero-bytes `Ok` and 3K bytes `Err` variant,
/// for a fixed stack size of 3001 bytes (asserted).
type Unbalanced = MyResult<(), BigError>;

fn f() -> Unbalanced {
    MyResult::Err(BigError::Large(error::Biggie { blob: [0u8; 3000] }))
}

But no clippy lint at present on either the fully specified type Unbalanced alias or the f() function definition.
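For readers who want to reproduce the size imbalance, here is a minimal self-contained sketch using the standard `Result`. The definitions of `Biggie` and `BigError` are hypothetical stand-ins (the report does not show them), and `sizes` is a helper invented for this illustration.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the 3,000-byte payload mentioned in the report.
pub struct Biggie {
    pub blob: [u8; 3000],
}

/// Hypothetical error enum; only a `Large` variant appears in the report.
pub enum BigError {
    Large(Biggie),
    Small(u8),
}

/// Zero-sized `Ok`, roughly 3 KB `Err`.
pub type Unbalanced = Result<(), BigError>;

pub fn f() -> Unbalanced {
    Err(BigError::Large(Biggie { blob: [0u8; 3000] }))
}

/// Returns (size of the error type, size of the whole `Result`) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<BigError>(), size_of::<Unbalanced>())
}
```

Every call to `f()` has to make room for the full `Result`, even on paths where the caller only ever sees `Ok(())`.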

Have/would you consider extending this lint to all generic enums, where they are fully specified, or perhaps more practically, adding a new lint specific to core::result::Result? The suggestion to use Box<E> would be applicable in the latter case.
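As a rough illustration of what a `Box<E>` suggestion would buy, the sketch below compares an unboxed and a boxed error. `BigError` here is a hypothetical 3,000-byte type, and the function names are made up for the example.

```rust
use std::mem::size_of;

/// Hypothetical large error type (3,000 bytes of inline payload).
pub struct BigError {
    pub blob: [u8; 3000],
}

/// The `Result` itself must be large enough to hold a `BigError` inline.
pub fn unboxed() -> Result<(), BigError> {
    Err(BigError { blob: [0; 3000] })
}

/// Only a pointer travels through the `Result`; the payload lives on the heap
/// and is allocated only when an error is actually produced.
pub fn boxed() -> Result<(), Box<BigError>> {
    Err(Box::new(BigError { blob: [0; 3000] }))
}

/// Returns (unboxed `Result` size, boxed `Result` size) in bytes.
pub fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<(), BigError>>(),
        size_of::<Result<(), Box<BigError>>>(),
    )
}
```

The trade-off is one heap allocation on the error path in exchange for a much smaller return value on every path.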

I don't claim to understand the clippy code base well, but this was the only hint I could find (an @oli-obk comment) that generic enums were considered (and are likely ignored) for the original lint:

// don't count generics by filtering out everything

% cargo clippy -V
clippy 0.0.212 (016d92d 2019-03-10)
@oli-obk oli-obk added C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages T-MIR Type: This lint will require working with the MIR labels Mar 16, 2019
@oli-obk
Contributor

oli-obk commented Mar 16, 2019

My comment refers to generic enums that are still too generic to compute the size. Your case completely monomorphizes the generic enum to something where we can compute the size, and I agree with you that we should be doing this.

I think we should start small and only look at Result types to begin with, since most of the time one does not have control over arbitrary enums from other crates, so there might be no way to change it.

I think we should start with a MIR lint that goes over all arguments and user defined local variables and the return type. For each of those types, use the walk() method that gives you an iterator over all the types present inside that type. Then, check for occurrences of ty::Adt whose DefId is the same as Result's DefId. Then run the same variant size checking code that we are running for enums in the large_enum_variant lint.
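The "walk the type and look for `Result`" idea can be sketched on a toy type tree. This is only a model: `ToyTy` and `results` are hypothetical names invented for the illustration and do not correspond to rustc's `Ty`, `walk()`, or `DefId` APIs.

```rust
/// Toy stand-in for a compiler type tree; NOT rustc's `Ty`.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyTy {
    Unit,
    /// Any named, non-`Result` type.
    Named(&'static str),
    /// `Result<ok, err>`.
    Result(Box<ToyTy>, Box<ToyTy>),
    Tuple(Vec<ToyTy>),
}

impl ToyTy {
    /// Collects every `Result` found anywhere inside `self`, outermost first.
    pub fn results(&self) -> Vec<&ToyTy> {
        let mut found = Vec::new();
        let mut stack = vec![self];
        while let Some(ty) = stack.pop() {
            match ty {
                ToyTy::Result(ok, err) => {
                    found.push(ty);
                    // Nested types may contain further `Result`s.
                    stack.push(&**err);
                    stack.push(&**ok);
                }
                ToyTy::Tuple(items) => stack.extend(items.iter().rev()),
                ToyTy::Unit | ToyTy::Named(_) => {}
            }
        }
        found
    }
}
```

Each `Result` returned by `results` would then be handed to the variant size check.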

@dekellum
Author

dekellum commented Mar 16, 2019

Good to hear this is possible as a MIR lint, thanks @oli-obk!

Do you think this Result case deserves its own new top level lint? One reason I think it might is that in the opposite case, where the Ok(T) is large and the Err(E) is small, there is likely no reason to raise a lint. Also, in the large Err(E) case it would be helpful to further elaborate (at least on the "ALL the Clippy Lints" page) on why this can be bad in particular, and perhaps to independently adjust the threshold size difference at which it is reported?

@oli-obk
Contributor

oli-obk commented Mar 16, 2019

Yea, making it its own lint just for Result makes sense.

@dekellum dekellum changed the title Extend large_enum_variant lint to all generics, or for Result<T,E> New lint: large Err(E) variant, as compared with Ok(T), for Result<T, E> Mar 18, 2019
@phansch phansch added the A-lint Area: New lints label Dec 21, 2020
@lukaslueg
Contributor

@oli-obk Could you explain, wrt #3884 (comment), why we need to walk locals and arguments at all? I'm missing the point. Isn't it enough to walk the E in a Result<_, E> as far as we can and come up with an "at least" size, e.g. for Result<(), (T, [u8; 512])>, by simply summing up fields as far as we can determine them? The size will be wrong due to generics, but it's the best we can do for library code, and the final E will certainly not be smaller.
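The "sum up fields as far as we can" estimate can be sketched on a toy layout model. `ToyTy` and `at_least_size` are hypothetical names for this illustration only; the model ignores padding and alignment, which can only make the real size larger, so the result remains a lower bound.

```rust
/// Toy model of a type's layout; NOT rustc's type representation.
pub enum ToyTy {
    /// A type whose size in bytes is fully known.
    Known(u64),
    /// An unresolved generic parameter; it contributes at least 0 bytes.
    Param,
    /// Struct or tuple: fields are laid out side by side.
    Product(Vec<ToyTy>),
    /// Union: fields overlap, so the largest field decides.
    Union(Vec<ToyTy>),
}

/// Lower bound on the size of `ty` in bytes.
pub fn at_least_size(ty: &ToyTy) -> u64 {
    match ty {
        ToyTy::Known(n) => *n,
        ToyTy::Param => 0,
        ToyTy::Product(fields) => fields.iter().map(at_least_size).sum(),
        ToyTy::Union(fields) => fields.iter().map(at_least_size).max().unwrap_or(0),
    }
}
```

For the `(T, [u8; 512])` example, modelling the error type as `Product(vec![Param, Known(512)])` gives a bound of 512 bytes regardless of what `T` turns out to be.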

@oli-obk
Contributor

oli-obk commented Aug 22, 2022

I don't remember and can't make much sense of my own comment. Your explanation for a plan sounds good.

@Alexendoo Alexendoo linked a pull request Aug 30, 2022 that will close this issue
bors added a commit that referenced this issue Aug 30, 2022
Initial implementation `result_large_err`

This is a shot at #6560, #4652, and #3884. The lint checks for `Result` being returned from functions/methods where the `Err` variant is larger than a configurable threshold (the default of which is 128 bytes). There has been some discussion around this, which I'll try to quickly summarize:

* A large `Err`-variant may force an equally large `Result` if `Err` is actually bigger than `Ok`.
* There is a cost involved in large `Result`, as LLVM may choose to `memcpy` them around above a certain size.
* We usually expect the `Err` variant to be seldom used, but pay the cost every time.
* `Result` returned from library code has a high chance of bubbling up the call stack, getting stuffed into `MyLibError { IoError(std::io::Error), ParseError(parselib::Error), ...}`, exacerbating the problem.
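The last point above can be illustrated with a small sketch. All names here (`ParseError`, `MyLibError`, `parse`, `run`) are hypothetical and chosen only to mirror the shape described in the bullet.

```rust
use std::mem::size_of;

/// Hypothetical parser error carrying a large inline buffer.
pub struct ParseError {
    pub context: [u8; 256],
}

/// Hypothetical library-level error that wraps errors from its dependencies.
pub enum MyLibError {
    Io(std::io::Error),
    Parse(ParseError),
}

impl From<ParseError> for MyLibError {
    fn from(e: ParseError) -> Self {
        MyLibError::Parse(e)
    }
}

pub fn parse() -> Result<u8, ParseError> {
    Err(ParseError { context: [0; 256] })
}

/// `?` converts the large `ParseError` into `MyLibError`, so the wrapper
/// (and every `Result` carrying it) is at least as large as `ParseError`.
pub fn run() -> Result<u8, MyLibError> {
    let value = parse()?;
    Ok(value)
}

/// Returns (size of `ParseError`, size of `Result<u8, MyLibError>`) in bytes.
pub fn sizes() -> (usize, usize) {
    (size_of::<ParseError>(), size_of::<Result<u8, MyLibError>>())
}
```

One large error deep in a dependency therefore inflates every `Result` further up that wraps it.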

This PR deliberately does not take into account comparing the `Ok` to the `Err` variant (e.g. a ratio, or one being larger than the other). Rather we choose an absolute threshold for `Err`'s size, above which we warn. The reason for this is that `Err`s probably get `map_err`'ed further up the call stack, and we can't draw conclusions from the ratio at the point where the `Result` is returned. A relative threshold would also be less predictable, while not accounting for the cost of LLVM being forced to generate less efficient code if the `Err`-variant is _large_ in absolute terms.

We lint private functions as well as public functions, as the perf-cost applies to in-crate code as well.

In order to account for type-parameters, I conjured up `fn approx_ty_size`. The function relies on `LateContext::layout_of` to compute the actual size, and in case of failure (e.g. due to generics) tries to come up with an "at least size". In the latter case, the size is obviously wrong, but the inspected size certainly can't be smaller than that. Please give the approach a heavy dose of review, as I'm not actually familiar with the type-system at all (read: I have no idea what I'm doing).

The approach does, however flimsy it is, allow us to successfully lint situations like

```rust
pub union UnionError<T: Copy> {
    _maybe: T,
    _or_perhaps_even: (T, [u8; 512]),
}

// We know `UnionError<T>` will be at least 512 bytes, no matter what `T` is
pub fn param_large_union<T: Copy>() -> Result<(), UnionError<T>> {
    Ok(())
}
```
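The "at least 512 bytes" claim can be spot-checked for concrete choices of `T` with `std::mem::size_of`. This is only a sanity check on a few instantiations, not how the lint computes its estimate; `union_error_size` is a helper made up for the example.

```rust
use std::mem::size_of;

pub union UnionError<T: Copy> {
    _maybe: T,
    _or_perhaps_even: (T, [u8; 512]),
}

/// Size in bytes of `UnionError<T>` for one concrete `T`.
pub fn union_error_size<T: Copy>() -> usize {
    size_of::<UnionError<T>>()
}
```

Calling `union_error_size::<()>()`, `union_error_size::<u8>()`, or `union_error_size::<u64>()` should each report at least 512, since the `[u8; 512]` field is present whatever `T` is.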

I've refactored `functions/result_unit_err.rs` a bit to re-use some parts. This is also the groundwork for #6409

The default threshold is 128 because of #4652 (comment)

`lintcheck` does not trigger this lint for a threshold of 128. It does warn for 64, though.

The suggestion currently is the following, which is just a placeholder for discussion to be had. I did have the computed size in a `span_label`. However, that might cause both ui-tests here and lints elsewhere to become flaky wrt their output (as the size is platform dependent).

```
error: the `Err`-variant returned via this `Result` is very large
  --> $DIR/result_large_err.rs:36:34
   |
LL | pub fn param_large_error<R>() -> Result<(), (u128, R, FullyDefinedLargeError)> {
   |                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The `Err` variant is unusually large, at least 128 bytes
```

changelog: Add [`result_large_err`] lint
4 participants