Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM pessimises vtable loads. #39992

Closed
Aatch opened this issue Feb 21, 2017 · 2 comments
Closed

LLVM pessimises vtable loads. #39992

Aatch opened this issue Feb 21, 2017 · 2 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Aatch
Copy link
Contributor

Aatch commented Feb 21, 2017

This code contains two offset calculations and two function pointer loads:

#![crate_type="lib"]

pub struct Wrapper<T: ?Sized> {
    x: usize,
    y: usize,
    t: T
}

pub trait Foo {
    fn foo(&self);
}

pub fn test(foo: &Wrapper<Foo>) {
    foo.t.foo();
    foo.t.foo();
}

Playpen

LLVM should be able to remove all but the first occurrences via common subexpression elimination, but for some reason it reloads from the vtable pointer each time. The only reasonable explanation I can think of is that LLVM assumes that the vtable pointer may alias some pointer that the called function manipulates.

@Aatch
Copy link
Contributor Author

Aatch commented Feb 21, 2017

/cc @rust-lang/compiler

@Aatch Aatch added I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 21, 2017
Aatch added a commit to Aatch/rust that referenced this issue Feb 21, 2017
Give LLVM much more information about vtable pointers. Without the extra
information, LLVM has to be rather pessimistic about vtables, preventing
a number of obvious optimisations.

* Makes the vtable pointer argument noalias and readonly.
* Marks loads of the vtable pointer as nonnull.
* Marks load from the vtable with `!invariant.load` metadata.

Fixes rust-lang#39992
eddyb added a commit to eddyb/rust that referenced this issue Feb 25, 2017
Set metadata for vtable-related loads

Give LLVM much more information about vtable pointers. Without the extra
information, LLVM has to be rather pessimistic about vtables, preventing
a number of obvious optimisations.

* Makes the vtable pointer argument noalias and readonly.
* Marks loads of the vtable pointer as nonnull.
* Marks load from the vtable with `!invariant.load` metadata.

Fixes rust-lang#39992
eddyb added a commit to eddyb/rust that referenced this issue Feb 25, 2017
Set metadata for vtable-related loads

Give LLVM much more information about vtable pointers. Without the extra
information, LLVM has to be rather pessimistic about vtables, preventing
a number of obvious optimisations.

* Makes the vtable pointer argument noalias and readonly.
* Marks loads of the vtable pointer as nonnull.
* Marks load from the vtable with `!invariant.load` metadata.

Fixes rust-lang#39992
@dpc
Copy link
Contributor

dpc commented Mar 4, 2017

Any way to help LLVM move vtable stuff out of a tight loop?

This playrust:

#![crate_type="lib"]

use std::sync::Arc;

pub struct Wrapper {
    x: usize,
    y: usize,
    t: Arc<Foo>,
}

pub trait Foo {
    fn foo(&self);
}

pub fn test(foo: Wrapper) {
    for _ in 0..200 {
        foo.t.foo();
    }
}

Generates the tight loop:

.LBB1_1:
	incl	%ebp
	cmpl	$200, %ebp
	jge	.LBB1_2
	movq	16(%rbx), %rdi
	movq	24(%rbx), %rax
	leaq	15(%rdi), %rcx
	negq	%rdi
	andq	%rcx, %rdi
	addq	%r12, %rdi
.Ltmp12:
	callq	*%rax
.Ltmp13:
	jmp	.LBB1_1

while it seems to me the vtable access should be out of the loop, and it shouldn't be hard...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

2 participants