-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Little improves in CString new
when creating from slice
#92124
Little improves in CString new
when creating from slice
#92124
Conversation
r? @kennytm (rust-highfive has picked a reviewer for you, use r? to override) |
I checked generated asm for use std::ffi::{CString, NulError};
#[no_mangle]
pub extern "Rust" fn make_c_string()->Result<CString, NulError>{
CString::new("Hello world!")
} Asm generated using command: Old generated code .text
.def @feat.00;
.scl 3;
.type 0;
.endef
.globl @feat.00
.set @feat.00, 0
.file "cstring_new.a85eb7bc-cgu.0"
.def make_c_string;
.scl 2;
.type 32;
.endef
.section .text,"xr",one_only,make_c_string
.globl make_c_string
.p2align 4, 0x90
make_c_string:
.seh_proc make_c_string
pushq %rsi
.seh_pushreg %rsi
pushq %rdi
.seh_pushreg %rdi
subq $56, %rsp
.seh_stackalloc 56
.seh_endprologue
movq %rcx, %rsi
leaq __unnamed_1(%rip), %rdx
leaq 32(%rsp), %rdi
movl $12, %r8d
movq %rdi, %rcx
callq _ZN70_$LT$$RF$str$u20$as$u20$std..ffi..c_str..CString..new..SpecIntoVec$GT$8into_vec17h42e14556a4e783e0E
movq %rsi, %rcx
movq %rdi, %rdx
callq _ZN3std3ffi5c_str7CString4_new17hd06ee8640c50ad96E
movq %rsi, %rax
addq $56, %rsp
popq %rdi
popq %rsi
retq
.seh_endproc
.section .rdata,"dr",one_only,__unnamed_1
__unnamed_1:
.ascii "Hello world!" New generated code .text
.def @feat.00;
.scl 3;
.type 0;
.endef
.globl @feat.00
.set @feat.00, 0
.file "cstring_new.558cd7f7-cgu.0"
.def make_c_string;
.scl 2;
.type 32;
.endef
.section .text,"xr",one_only,make_c_string
.globl make_c_string
.p2align 4, 0x90
make_c_string:
.seh_proc make_c_string
pushq %rsi
.seh_pushreg %rsi
subq $32, %rsp
.seh_stackalloc 32
.seh_endprologue
movq %rcx, %rsi
leaq __unnamed_1(%rip), %rdx
movl $12, %r8d
callq _ZN66_$LT$$RF$str$u20$as$u20$std..ffi..c_str..CString..new..NewImpl$GT$8new_impl17ha5748a2c8ce696e6E
movq %rsi, %rax
addq $32, %rsp
popq %rsi
retq
.seh_endproc
.section .rdata,"dr",one_only,__unnamed_1
__unnamed_1:
.ascii "Hello world!" It can be seen that new caller code is shorter. As for lto performance, I compiled same code used On Unix, there is probably no difference because Rust uses |
I have 2 questions:
Example: ; _$LT$$RF$mut$u20$$u5b$u8$u5d$$u20$as$u20$std..ffi..c_str..CString..new..NewImpl$GT$::new_impl::hcd1310bd9104df0d
_ZN83_$LT$$RF$mut$u20$$u5b$u8$u5d$$u20$as$u20$std__ffi__c_str__CString__new__NewImpl$GT$8new_impl17hcd1310bd9104df0dE proc near
push rsi
sub rsp, 20h
mov rsi, rcx
call _ZN75_$LT$$RF$$u5b$u8$u5d$$u20$as$u20$std__ffi__c_str__CString__new__NewImpl$GT$8new_impl17h8473efdc7dbe7405E ; _$LT$$RF$$u5b$u8$u5d$$u20$as$u20$std..ffi..c_str..CString..new..NewImpl$GT$::new_impl::h8473efdc7dbe7405
mov rax, rsi
add rsp, 20h
pop rsi
retn
_ZN83_$LT$$RF$mut$u20$$u5b$u8$u5d$$u20$as$u20$std__ffi__c_str__CString__new__NewImpl$GT$8new_impl17hcd1310bd9104df0dE endp |
Surprisingly, adding |
It turns out that |
new
when creating from slicenew
when creating from slice
Removed all code related to new allocation mechanism. |
This comment has been minimized.
This comment has been minimized.
JFYI: I wouldn't be able to participate until 15th January. I would return to this PR after that if needed. |
Old code already contain optimization for cases with `&str` and `&[u8]` args. This commit adds a specialization for `&mut[u8]` too. Also, I added usage of old slice in search for zero bytes instead of new buffer because it produce better code for Windows on LTO builds. For other platforms, this wouldn't cause any difference because it calls `libc` anyway. Inlined `_new` method into spec trait to reduce amount of code generated to `CString::new` callers.
@kennytm May you review my PR please? :) |
This seems okay to me, thanks. @bors r+ |
📌 Commit 4b62a77 has been approved by |
…askrgr Rollup of 10 pull requests Successful merges: - rust-lang#88642 (Formally implement let chains) - rust-lang#89621 (doc: guarantee call order for sort_by_cached_key) - rust-lang#91278 (Use iterator instead of recursion in `codegen_place`) - rust-lang#92124 (Little improves in CString `new` when creating from slice) - rust-lang#92783 (Annotate dead code lint with notes about ignored derived impls) - rust-lang#92797 (Remove horizontal lines at top of page) - rust-lang#92920 (Move expr- and item-related pretty printing functions to modules) - rust-lang#93041 (Remove some unused ordering derivations based on `DefId`) - rust-lang#93051 (Add Option::is_some_with and Result::is_{ok,err}_with) - rust-lang#93062 (Update books) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
Old code already contain optimization for cases with
&str
and&[u8]
args. This commit adds a specialization for&mut[u8]
too.Also, I added usage of old slice in search for zero bytes instead of new buffer because it produce better code for constant inputs on Windows LTO builds. For other platforms, this wouldn't cause any difference because it calls
libc
anyway.Inlined
_new
method into spec trait to reduce amount of code generated toCString::new
callers.