-
Notifications
You must be signed in to change notification settings - Fork 13.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmp+Ordering match versus if+else if+then for binary_search_by #86354
Comments
This will call By the way, using the use std::cmp::Ordering::{self, *};
fn binary_search_by<T, F>(this: &[T], mut f: F) -> Result<usize, usize>
where
F: FnMut(&T) -> Ordering
{
let mut base = 0usize;
let mut s = this;
loop {
let (head, mid, tail) = match s.split_at(s.len() >> 1) {
(_, []) => return Err(base),
(head, [mid, tail @ ..]) => (head, mid, tail),
};
let cmp = f(mid);
if cmp == Less {
base += head.len() + 1;
s = tail;
} else if cmp == Greater {
s = head;
} else {
return Ok(base + head.len());
}
}
}
pub fn foo(data: &[u32], x: u32) -> Result<usize, usize> {
binary_search_by(data, |y| x.cmp(y))
} Gives: example::foo:
mov r10, rsi
shr r10
mov eax, 1
sub rsi, r10
je .LBB0_8
mov r8d, edx
lea r9, [rdi + 4*r10]
xor edx, edx
jmp .LBB0_4
.LBB0_2:
add r9, 4
add rsi, -1
add rdx, r10
add rdx, 1
mov r10, rsi
mov rdi, r9
.LBB0_3:
mov rcx, r10
shr rcx
lea r9, [rdi + 4*rcx]
sub r10, rcx
mov rsi, r10
mov r10, rcx
je .LBB0_7
.LBB0_4:
cmp dword ptr [r9], r8d
ja .LBB0_2
jne .LBB0_3
add rdx, r10
xor eax, eax
.LBB0_7:
ret
.LBB0_8:
xor edx, edx
ret |
Compiling your code with other optimizations leads to still long(ish) asm: foo:
mov r11, rsi
shr r11
mov eax, 1
sub rsi, r11
je .LBB0_1
mov r8d, edx
lea r10, [rdi + 4*r11]
xor edx, edx
mov r9d, 255
jmp .LBB0_3
.LBB0_6:
mov rcx, r11
shr rcx
lea r10, [rdi + 4*rcx]
sub r11, rcx
mov rsi, r11
mov r11, rcx
je .LBB0_8
.LBB0_3:
xor ecx, ecx
cmp dword ptr [r10], r8d
setne cl
cmova ecx, r9d
cmp cl, 1
je .LBB0_6
cmp cl, -1
jne .LBB0_7
add r10, 4
dec rsi
add rdx, r11
inc rdx
mov r11, rsi
mov rdi, r10
jmp .LBB0_6
.LBB0_1:
xor edx, edx
ret
.LBB0_7:
add rdx, r11
xor eax, eax
.LBB0_8:
ret |
Can you reproduce it on the playground/godbolt? |
|
I would consider this a regression cause by some optimizations enabled with |
The binary search implementation no longer matches on Ordering, so closing this. Remaining discussion should go to #86511. |
The binary search is implemented as:
rust/src/libcore/slice.rs
Lines 304 to 324 in 27e766d
Here I've added a caller to instantiate it with commonly used types (u32 and its cmp):
rustc 1.54.0-nightly with optimizations gives:
I've noticed that replacing the call to cmp + match with simpler code, the asm becomes shorter (6 asm instructions less):
Gives:
Is it possible to improve rustc so it handles simple cmp+Ordering match cases like this about as efficiently as the if/else if/else case?
By the way, I've also noticed that you can rewrite the binary search function using pattern matching, avoiding explicit indexing and slicing (the asm is the same as the stdlib version):
The text was updated successfully, but these errors were encountered: