vtable accesses optimization in a tight loop. #40314

Open
dpc opened this issue Mar 7, 2017 · 2 comments
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one.
C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@dpc
Contributor

dpc commented Mar 7, 2017

This is a follow-up to #39992.

This playground snippet:

#![crate_type="lib"]

use std::sync::Arc;

pub struct Wrapper {
    x: usize,
    y: usize,
    t: Arc<dyn Foo>,
}

pub trait Foo {
    fn foo(&self);
}

pub fn test(foo: Wrapper) {
    for _ in 0..200 {
        foo.t.foo();
    }
}

Generates the tight loop:

.LBB1_1:
	incl	%ebp
	cmpl	$200, %ebp
	jge	.LBB1_2
	movq	16(%rbx), %rdi
	movq	24(%rbx), %rax
	leaq	15(%rdi), %rcx
	negq	%rdi
	andq	%rcx, %rdi
	addq	%r12, %rdi
.Ltmp12:
	callq	*%rax
.Ltmp13:
	jmp	.LBB1_1

while it seems to me that the vtable accesses should be hoisted out of the loop, which shouldn't be hard to do.
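
For readers decoding the loop body: the `leaq`/`negq`/`andq` sequence looks like it rounds 16 up to the alignment loaded from the vtable, which would be the offset of the payload behind the two reference counts in the `Arc` allocation. This reading assumes the vtable holds the alignment at offset 16 and the `foo` pointer at offset 24, and that the `Arc` header is two `usize` counters; none of that is a guaranteed layout. A minimal sketch of the arithmetic, using the hypothetical helper name `payload_offset`:

```rust
/// Hypothetical helper mirroring `leaq 15(%rdi), %rcx; negq %rdi; andq %rcx, %rdi`.
/// Rounds an assumed 16-byte header up to `align`, which must be a power of two.
pub fn payload_offset(align: usize) -> usize {
    // (align + 15) & -align  ==  round_up(16, align) for power-of-two `align`
    (align + 15) & align.wrapping_neg()
}
```

Under these assumptions the result depends only on the alignment read from the vtable, so it is the same on every iteration.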

@hanna-kruppe
Contributor

Strangely, changing the signature to take &Wrapper does hoist the vtable access:

.LBB0_1:
	inc	ebp
	mov	rdi, rbx
	call	r14
	cmp	ebp, 200
	jl	.LBB0_1
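
For reference, a sketch of the by-reference variant described above. The function name `test_ref` and the `Counter` implementor are hypothetical, added only so the snippet can be exercised; `dyn` is spelled out so it builds on current editions. Whether a given compiler version hoists the loads has to be checked in the emitted assembly.

```rust
use std::cell::Cell;
use std::sync::Arc;

pub trait Foo {
    fn foo(&self);
}

pub struct Wrapper {
    pub x: usize,
    pub y: usize,
    pub t: Arc<dyn Foo>,
}

// Same loop as `test`, but borrowing the wrapper instead of taking it by value.
pub fn test_ref(foo: &Wrapper) {
    for _ in 0..200 {
        foo.t.foo();
    }
}

// Hypothetical implementor used only to observe the calls.
pub struct Counter(pub Cell<usize>);

impl Foo for Counter {
    fn foo(&self) {
        self.0.set(self.0.get() + 1);
    }
}
```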

@Mark-Simulacrum Mark-Simulacrum added the I-slow Issue: Problems and improvements with respect to performance of generated code. label May 27, 2017
@Mark-Simulacrum Mark-Simulacrum added C-enhancement Category: An issue proposing an enhancement or a PR with one. and removed C-enhancement Category: An issue proposing an enhancement or a PR with one. labels Jul 26, 2017
@nikic
Contributor

nikic commented Dec 2, 2018

Looks like the load is hoisted on nightly (but not beta):

.LBB2_1:                                # =>This Inner Loop Header: Depth=1
	addl	$1, %ebp
	cmpl	$200, %ebp
	jae	.LBB2_2
# %bb.4:                                #   in Loop: Header=BB2_1 Depth=1
	movq	%rbx, %rdi
	callq	*%r13
	jmp	.LBB2_1

The &Wrapper variant still generates slightly better code:


.LBB0_1:                                # =>This Inner Loop Header: Depth=1
	movq	%rbx, %rdi
	callq	*%r14
	addl	$-1, %ebp
	jne	.LBB0_1

@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023