Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

False positive for inherent_to_string_shadow_display? #4396

Open
huxi opened this issue Aug 16, 2019 · 16 comments
Open

False positive for inherent_to_string_shadow_display? #4396

huxi opened this issue Aug 16, 2019 · 16 comments
Labels
C-bug Category: Clippy is not doing the correct thing C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages good-first-issue These issues are a good way to get started with Clippy I-false-positive Issue: The lint was triggered on code it shouldn't have L-suggestion Lint: Improving, adding or fixing lint suggestions

Comments

@huxi
Copy link

huxi commented Aug 16, 2019

I think I found a false positive or at least questionable case of inherent_to_string_shadow_display rule triggering.

The place this rule is (or rather "was", since I disabled it) triggered:
https://github.com/huxi/rusty_ulid/blob/72ac82736311981d12c339b78756049f48406179/src/lib.rs#L557

The place where the method in question is used to actually implement Display:
https://github.com/huxi/rusty_ulid/blob/72ac82736311981d12c339b78756049f48406179/src/lib.rs#L571

This boils down to a situation where a type has an efficient implementation of fn to_string(&self) -> String that is used to implement fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> of fmt::Display.

I think this case (method is used to implement trait) shouldn't trigger the rule.

@flip1995
Copy link
Member

flip1995 commented Aug 16, 2019

You should just write

impl fmt::Display for Ulid {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        let mut string = String::with_capacity(26);

        crockford::append_crockford_u64_tuple(self.value, &mut string);

        write!(f, "{}", string)
    }
}

and remove the to_string method.

This will give you a to_string method, that calls fmt to produce the string. It's pretty much the same implementation as yours, but the other way around.
https://doc.rust-lang.org/src/alloc/string.rs.html#2132-2142

Or does something prevent an implementation like this?

@huxi
Copy link
Author

huxi commented Aug 16, 2019

I thought the way I'm doing it would be slightly more efficient, i.e. only one String::with_capacity(26) with exact required size already known vs. that same allocation plus everything the default implementation of to_stringdoes.

@flip1995
Copy link
Member

flip1995 commented Aug 16, 2019

I thought the way I'm doing it would be slightly more efficient

I doubt that. One great thing about Rust are Zero Cost Abstractions; the way it's implemented by the compiler should be at least as fast as your implementation.

What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.

I expect that the default implementation of to_string will be optimized out by the compiler anyway.


Just tested it with godbolt: https://rust.godbolt.org/z/0Ynb4g

The version with just the Display implementation produces less Code than the version with the to_string implementation.


So yeah: Just trust the Rust compiler, he/she/it just wants what's best for you ❤️ 😄

@huxi
Copy link
Author

huxi commented Aug 16, 2019

@flip1995 There are some benchmarks in my repository.

With my current manual implementation:

generate_ulid_string    time:   [197.75 ns 199.60 ns 201.66 ns]
Found 8 outliers among 100 measurements (8.00%)
  8 (8.00%) high mild

Using the default implementation of Display instead:

generate_ulid_string    time:   [338.23 ns 340.84 ns 343.38 ns]
                        change: [+68.709% +70.540% +72.419%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

@flip1995
Copy link
Member

flip1995 commented Aug 16, 2019

Do you run these benchmarks in --release mode? Maybe you have multiple to_string calls spread throughout the code, which don't get optimized in debug mode. If you find the same behavior in --release mode feel free to reopen

@huxi
Copy link
Author

huxi commented Aug 16, 2019

I simply executed cargo bench with cargo 1.38.0-nightly (e853aa976 2019-08-09) which executes the criterion benchmarks after building them in optimized release mode.

Reverting the changes causes the performance to increase accordingly:

generate_ulid_string    time:   [193.82 ns 194.85 ns 195.95 ns]
                        change: [-42.510% -41.993% -41.431%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

Taking a look at the inherent_to_string_shadow_display description, it seems that the intent of this rule is mostly about preventing different output from to_string and fmt and I agree that this would indeed be quite bad.

But in my example this problematic behavior is impossible because fmt is actually using the to_string implementation, guaranteeing that output will definitely be identical.

I have no idea how prevalent this other-way-around pattern is but I think that it should not trigger the rule if it is detected.

The description of the rule also says "The less versatile inherent method will then shadow the implementation introduced by Display." but I don't see how the default implementation via Display is more "versatile" than the manual implementation.

@flip1995 I think I can't reopen the issue. I could only create a new one referencing this one.

@flip1995
Copy link
Member

flip1995 commented Aug 16, 2019

Fair enough. Could you test what happens if you implement the ToString trait instead of the to_string method, please. If this also works with the same performance, I would prefer this as a "this-is-how-you-should-do-it-instead"-suggestion. I think we even have a lint for "This method already exists in this trait".

I think I can't reopen the issue

Oh I thought the author can always do this. In that case I do it for you.

@flip1995 flip1995 reopened this Aug 16, 2019
@flip1995 flip1995 added L-suggestion Lint: Improving, adding or fixing lint suggestions good-first-issue These issues are a good way to get started with Clippy C-bug Category: Clippy is not doing the correct thing C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages and removed good-first-issue These issues are a good way to get started with Clippy labels Aug 16, 2019
@huxi
Copy link
Author

huxi commented Aug 16, 2019

Trying to implement ToString directly leads to this error:

error[E0119]: conflicting implementations of trait `std::string::ToString` for type `Ulid`:
   --> src/lib.rs:555:1
    |
555 | impl ToString for Ulid {
    | ^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: conflicting implementation in crate `alloc`:
            - impl<T> std::string::ToString for T
              where T: std::fmt::Display, T: ?Sized;

The documentation of ToString also says "ToString shouldn't be implemented directly".

@flip1995
Copy link
Member

Hm, I kind of expected something like this. So yeah, a check, if the to_string impl is used in the Display impl should solve this problem. But I'm still not conviced 100%, that this is a problem with the lint, since such performance improvements should be really rare and you should probably just allow the lint in these cases.

I'll add the needs-discussion label to this issue, so we can maybe attract more opinions.

@flip1995 flip1995 added the S-needs-discussion Status: Needs further discussion before merging or work can be started label Aug 16, 2019
@ghost
Copy link

ghost commented Aug 18, 2019

The problem is the shadowing. Calling to_string through the ToString trait is still going to call the slow code. You don't want people confused about which version they are calling, specially if it's performance critical.

I can think of two possible solutions.

  • Rename to_string() to something else. Maybe encode().
  • Don't implement Display directly but make a display() method that returns a wrapper that does.

You could also do both.

@huxi
Copy link
Author

huxi commented Aug 20, 2019

Here are some more benchmarks:

generate_ulid_string    time:   [208.62 ns 209.90 ns 211.33 ns]
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

generate_ulid_string_formatter
                        time:   [364.51 ns 369.58 ns 375.72 ns]
Found 13 outliers among 100 measurements (13.00%)
  7 (7.00%) high mild
  6 (6.00%) high severe

generate_ulid_string_tostring_trait
                        time:   [367.32 ns 370.45 ns 373.84 ns]
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) high mild
  10 (10.00%) high severe
  • generate_ulid_string generates a ULID and converts it into a String using my optimized method.
  • generate_ulid_string_formatter generates a ULID and converts it into a String using std::fmt::Write::write_fmt, i.e. the Display trait implementation using my optimized to_string method.
  • generate_ulid_string_tostring_trait generates a ULID and converts it into a String using the shadowed ToString::to_string(&value) default implementation.
  • serde Serialize is also using my optimized to_string implementation.

As you can see, the performance differences are minuscule but exist.

Given that a user would always get optimal to_string performance automatically beside the case where they explicitly request the default implementation by calling ToString::to_string(&value), I'd argue that everything works exactly as expected. The default implementation is also working correctly, producing guaranteed identical output, just a little bit slower, i.e. as fast as the implementation would be in absence of the more optimized handcrafted method.

I don't think that sacrificing usability/expected behavior by not implementing Display or renaming to_string to encode would be reasonable because the latter would likely lead to users being unaware of encode and using the (still slower) default to_string instead.

Because of that, I'll keep the inherent_to_string_shadow_display rule disabled for my specific use-case.

But I'm also still convinced that not warning if the shadowing to_string is used in the implementation of the Display trait would be sensible because then it's guaranteed that the string representations won't differ, which otherwise isn't the case.

@flip1995
Copy link
Member

Now I see, why your to_string implementation is faster: You avoid calling the write function. So it makes a difference when you call to_string directly, but not, when using the Display implementation.

If we allow a shadowing to_string method, I would only allow it in the most trivial case, where the body of fmt is only write!(<formatter>, "{}", self.to_string()), otherwise, the Display and to_string methods still may differ.

@huxi
Copy link
Author

huxi commented Aug 20, 2019

Yep, that's exactly the point I tried to make. :)

My current implementation looks like this, though:

impl fmt::Display for Ulid {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> Result<(), fmt::Error> {
        f.write_str(&self.to_string())
    }
}

It may actually be worthwhile to write another clippy rule detecting a pattern like write!(<formatter>, "{}", self.to_string()), i.e. a pattern string that's just "{}", and suggest to replace it with f.write_str(&self.to_string()). That also brings a small performance boost (I checked) and does not reduce readability.

I wanted to do that from the start but forgot to revisit the code once it worked...

@flip1995
Copy link
Member

The documentation states that these two functions are equivalent. But if the write_str has better performance, this could be a perf lint.

@flip1995 flip1995 removed the S-needs-discussion Status: Needs further discussion before merging or work can be started label Aug 20, 2019
@huxi
Copy link
Author

huxi commented Aug 21, 2019

I wonder if this could/should be fixed at the write! macro level... it could expand differently in case of "{}" pattern, I think.

@phansch phansch added the I-false-positive Issue: The lint was triggered on code it shouldn't have label Dec 18, 2020
@KamilaBorowska
Copy link
Contributor

KamilaBorowska commented Jan 20, 2022

I think this lint is wrong. There is a good reason for shadowing to_string - performance. The problem is that the default implementation of to_string goes through std::fmt::Formatter which is known for being slow (see rust-lang/rust#76490). It's not possible to provide a better implementation without specialization which is not stable, so shadowing is the best replacement.

It's possible to deal with this issue to some extent by not implementing Display or renaming the method, but that's not ideal. Not implementing Display is a breaking change, and renaming the method will lower the performance for previous users of previously written code.

I understand that incompatible implementations are a problem, however I don't think it's easy to determine whether implementations are incompatible.

I think this lint shouldn't be enabled by default until specialization gets introduced on stable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: Clippy is not doing the correct thing C-enhancement Category: Enhancement of lints, like adding more cases or adding help messages good-first-issue These issues are a good way to get started with Clippy I-false-positive Issue: The lint was triggered on code it shouldn't have L-suggestion Lint: Improving, adding or fixing lint suggestions
Projects
None yet
Development

No branches or pull requests

5 participants