Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialized slice::fill implementations are not inlined #83235

Closed
steffahn opened this issue Mar 17, 2021 · 4 comments · Fixed by #83245
Closed

Specialized slice::fill implementations are not inlined #83235

steffahn opened this issue Mar 17, 2021 · 4 comments · Fixed by #83245
Assignees
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@steffahn
Copy link
Member

steffahn commented Mar 17, 2021

Specialized slice::fill implementations are not inlined even though they would compile down to pretty much just a single memset-call.

The bar function below represents manual inlining of the implementation of fill.

use core::ptr::write_bytes;

pub fn foo(slice: &mut [u8]) {
    slice.fill(42);
}

pub fn bar(slice: &mut [u8]) {
    unsafe {
        let ptr = slice.as_mut_ptr();
        let len = slice.len();
        write_bytes(ptr, 0_u8, len);
    }
}

(Playground)

    Finished release [optimized] target(s) in 0.83s
   
---------- Result ----------

playground::foo: # @playground::foo
# %bb.0:
	movl	$42, %edx
	jmpq	*<[u8] as core::slice::specialize::SpecFill<u8>>::spec_fill@GOTPCREL(%rip) # TAILCALL
                                        # -- End function

playground::bar: # @playground::bar
# %bb.0:
	movq	%rsi, %rdx
	xorl	%esi, %esi
	jmpq	*memset@GOTPCREL(%rip)          # TAILCALL
                                        # -- End function

I haven’t tested it in any way, but I suspect that this can be fixed by marking the fill specialized implementations for u8, i8, and bool as #[inline].

@steffahn
Copy link
Member Author

@rustbot modify labels: C-enhancement, T-libs-impl, I-slow, E-easy

@rustbot rustbot added C-enhancement Category: An issue proposing an enhancement or a PR with one. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 17, 2021
@the8472
Copy link
Member

the8472 commented Mar 17, 2021

@rustbot claim

@slightlyoutofphase
Copy link
Contributor

slightlyoutofphase commented Mar 18, 2021

Don't you need at the very least #[inline] (but not necessarily #[inline(always)]) to actually guarantee cross-crate-inlining eligibility with the default no-LTO release profile? It'd certainly make sense to add that to all of the impls in this file IMO.

@the8472
Copy link
Member

the8472 commented Mar 18, 2021

It being generic is sufficient for being eligible because the monomorphized function is only created in the consuming crate once all the generics have been filled in. Adding #[inline] on top just encourages it further.

@bors bors closed this as completed in 1010038 Mar 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. I-slow Issue: Problems and improvements with respect to performance of generated code. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants