Missed optimization because of missing inline attr on OsStr #67150

Closed
tesuji opened this issue Dec 8, 2019 · 5 comments · Fixed by #67169
Labels
C-enhancement - Category: An issue proposing an enhancement or a PR with one.
I-slow - Issue: Problems and improvements with respect to performance of generated code.
T-compiler - Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tesuji
Contributor

tesuji commented Dec 8, 2019

Godbolt link: https://rust.godbolt.org/z/bAU2UA
I expected these two snippets to produce the same optimized asm, but they don't:

use std::ffi::OsStr;

pub fn foo(s: Option<&OsStr>) -> bool {
    s.map_or(false, |x| x == OsStr::new("so"))
}

use std::ffi::OsStr;

pub fn foo(s: Option<&OsStr>) -> bool {
    s == Some(OsStr::new("so"))
}
@jonas-schievink added the C-enhancement, I-slow, and T-compiler labels Dec 8, 2019
@nikic
Contributor

nikic commented Dec 8, 2019

I think the bigger problem here is that the quite trivial AsRef and PartialEq implementations did not get inlined. I would not expect anything to optimize well without those getting inlined.

@michalt
Contributor

michalt commented Dec 8, 2019

I'm not sure we can expect these two to give you the same assembly. They are actually a bit different -- the first one will construct OsStr conditionally (map_or will call the closure only for Some), whereas the second one will create the OsStr unconditionally before even checking if the parameter is Some or None.

As for the differences in the assembly wrt. comparisons themselves, this seems to be due to the difference between the code to check for equality of OsStr vs Option<&OsStr>.

I wonder, do you have a particular benchmark/workload where the first function is slower than expected?
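To make the comparison above concrete, here is a minimal sketch of the two forms side by side. The function names `via_map_or` and `via_option_eq` are hypothetical, chosen only for illustration; the bodies are the two snippets from the report. The evaluation order differs at the source level, but both should return the same result for every input.

```rust
use std::ffi::OsStr;

// Closure form: `OsStr::new("so")` is evaluated only when `s` is `Some`,
// and the comparison is `&OsStr == &OsStr`.
pub fn via_map_or(s: Option<&OsStr>) -> bool {
    s.map_or(false, |x| x == OsStr::new("so"))
}

// Direct form: `Some(OsStr::new("so"))` is built first, and the comparison
// is `Option<&OsStr> == Option<&OsStr>`.
pub fn via_option_eq(s: Option<&OsStr>) -> bool {
    s == Some(OsStr::new("so"))
}
```

Since the two functions agree on `None`, on a matching string, and on a non-matching string, any remaining difference is in the generated code rather than in behavior.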

@tesuji
Contributor Author

tesuji commented Dec 8, 2019

Still, if I change the first snippet like below, the asm is different too:

use std::ffi::OsStr;

pub fn foo(s: Option<&OsStr>) -> bool {
    let so = OsStr::new("so");
    s.map_or(false, |x| x == so)
}

@nagisa
Member

nagisa commented Dec 8, 2019

OsStr::new("so") is effectively a pointer cast, so it is warranted to expect assembly of at least similar quality.

Inlining does not happen because we already have generated machine code for these trivial functions in libstd.rlib. Tacking on some #[inline]s should suffice?
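As a rough illustration of the suggestion above, the sketch below marks trivial methods and trait impls with `#[inline]` so that their bodies are available to downstream crates for inlining. The `Name` type is hypothetical and only stands in for a library string type; this is not the actual `OsStr` source or the change that was eventually made.

```rust
// Hypothetical newtype standing in for a library string type.
pub struct Name {
    inner: String,
}

impl Name {
    // Trivial constructor: a candidate for cross-crate inlining.
    #[inline]
    pub fn new(s: &str) -> Name {
        Name { inner: s.to_owned() }
    }
}

impl AsRef<str> for Name {
    // Without `#[inline]`, a non-generic function like this is normally
    // called as an opaque symbol from other crates (absent LTO).
    #[inline]
    fn as_ref(&self) -> &str {
        &self.inner
    }
}

impl PartialEq for Name {
    // Equality forwards to the inner value, so it is cheap to inline.
    #[inline]
    fn eq(&self, other: &Name) -> bool {
        self.inner == other.inner
    }
}
```

The attribute is a hint rather than a guarantee; it makes inlining possible across crate boundaries, and the optimizer still decides whether to do it.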

@tesuji
Contributor Author

tesuji commented Dec 9, 2019

After inlining some methods in #67169, I got the same asm for both snippets:

check_stage1::foo:
 xor     eax, eax
 test    rdi, rdi
 je      .LBB0_5
 cmp     rsi, 2
 jne     .LBB0_5
 lea     rax, [rip + .Lanon.7bddf4f09674752ba4bdf737f126bcf4.0]
 cmp     rdi, rax
 je      .LBB0_3
 movzx   eax, word ptr [rdi]
 cmp     eax, 28531
 sete    al
.LBB0_5:
 ret
.LBB0_3:
 mov     al, 1
 ret
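One way to read the listing above, assuming `rdi` holds the string's data pointer and `rsi` its length: the code returns false for a null pointer (`None`) or a length other than 2, returns true if the pointer is the constant's own address, and otherwise loads two bytes and compares them with 28531. That constant is just the bytes of "so" read as a little-endian 16-bit integer, which the following sketch reproduces (the helper name `so_as_u16` is hypothetical):

```rust
// Interpret the two bytes of "so" as a little-endian 16-bit integer,
// which is how a 2-byte load sees them on x86-64.
// 's' is 0x73 (low byte) and 'o' is 0x6F (high byte), giving 0x6F73.
pub fn so_as_u16() -> u16 {
    u16::from_le_bytes(*b"so")
}
```

So the whole string comparison collapses into a single 16-bit integer compare once the trivial functions are inlined.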

@tesuji changed the title from "Missed optimization on Option comparison" to "Missed optimization because missing inline attr on OsStr" Dec 9, 2019
@tesuji changed the title from "Missed optimization because missing inline attr on OsStr" to "Missed optimization because of missing inline attr on OsStr" Dec 9, 2019
@bors closed this as completed in c255815 Dec 10, 2019