implement ptr::write without dedicated intrinsic #80290

Merged
merged 3 commits into from
Jan 16, 2021

RalfJung
Member

@RalfJung RalfJung commented Dec 22, 2020

This makes ptr::write more consistent with ptr::write_unaligned, ptr::read, ptr::read_unaligned, all of which are implemented in terms of copy_nonoverlapping.

This means we can also remove move_val_init implementations in codegen and Miri, and its special handling in the borrow checker.

Also see this Zulip discussion.
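
For readers who want the shape of the change, here is a minimal sketch of a write built on copy_nonoverlapping, modeled on the snippet posted later in this thread. The names `write_via_copy` and `demo` are made up for illustration; this is not the exact code in library/core.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Hypothetical stand-in for `ptr::write`, built on `copy_nonoverlapping`.
///
/// # Safety
/// `dst` must be valid for writes and properly aligned for `T`.
pub unsafe fn write_via_copy<T>(dst: *mut T, src: T) {
    // Copy the bytes of `src` into `*dst`; the old contents of `*dst`
    // are neither read nor dropped.
    unsafe { ptr::copy_nonoverlapping(&src as *const T, dst, 1) };
    // The value now lives in `*dst`, so `src` must not be dropped here.
    mem::forget(src);
}

/// Writes a `String` into uninitialized memory and reads it back.
pub fn demo() -> String {
    let mut slot = MaybeUninit::<String>::uninit();
    unsafe {
        write_via_copy(slot.as_mut_ptr(), String::from("hello"));
        slot.assume_init()
    }
}
```

Calling `demo()` yields the string that was written, without the uninitialized destination ever being dropped.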

@rust-highfive
Collaborator

r? @m-ou-se

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Dec 22, 2020
@RalfJung
Member Author

Let's see if perf says anything.

@bors try

@bors
Contributor

bors commented Dec 22, 2020

⌛ Trying commit e10fd772a10fa762806af973306b1513d40ac1c1 with merge 035e759b99e57ac8055a4d0c71b48e2ceb0beb36...

@RalfJung
Member Author

@rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@rust-log-analyzer
Collaborator

The job x86_64-gnu-llvm-9 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
.................................................................................................... 9000/11196
.................................................................................................... 9100/11196
......................................................................................i.......i..... 9200/11196
.................................................................................................... 9300/11196
.........................iiiiii..iiiiii.i........................................................... 9400/11196
.................................................................................................... 9600/11196
.................................................................................................... 9700/11196
.................................................test [ui] ui/issues/issue-74564-if-expr-stack-overflow.rs has been running for over 60 seconds
................................................... 9800/11196
---
failures:

---- [ui] ui/intrinsics/intrinsic-move-val-cleanups.rs stdout ----

error: test compilation failed although it shouldn't!
status: exit code: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/intrinsics/intrinsic-move-val-cleanups.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zemit-future-incompat-report" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/intrinsics/intrinsic-move-val-cleanups/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/intrinsics/intrinsic-move-val-cleanups/auxiliary"
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ LogOnDrop(&acq, "drop temp LOD", 2); &mut dest_a },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init(&mut dest_b, { LogOnDrop(&acq, "drop temp LOD", 3);
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ panic!("every test ends in a panic") },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ LogOnDrop(&acq, "drop temp LOD", 2); &mut dest_a },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ LogOnDrop(&acq, "drop temp LOD", 2); &mut dest_a },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`
error: aborting due to 5 previous errors

For more information about this error, try `rustc --explain E0425`.


------------------------------------------


---- [ui] ui/unsafe/unsafe-move-val-init.rs stdout ----
diff of stderr:

- error[E0133]: dereference of raw pointer is unsafe and requires unsafe function or block
-   --> $DIR/unsafe-move-val-init.rs:8:5
+ error[E0425]: cannot find function `move_val_init` in module `intrinsics`
3    |
3    |
4 LL |     intrinsics::move_val_init(1 as *mut u32, 1);
-    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dereference of raw pointer
-    |
-    = note: raw pointers may be NULL, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior
+    |                 ^^^^^^^^^^^^^ not found in `intrinsics`
9 error: aborting due to previous error
10 

- For more information about this error, try `rustc --explain E0133`.
+ For more information about this error, try `rustc --explain E0425`.
12 


The actual stderr differed from the expected stderr.
Actual stderr saved to /checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unsafe/unsafe-move-val-init/unsafe-move-val-init.stderr
To update references, rerun the tests and pass the `--bless` flag
To only update this specific test, also pass `--test-args unsafe/unsafe-move-val-init.rs`
error: 1 errors occurred comparing output.
status: exit code: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/unsafe/unsafe-move-val-init.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zemit-future-incompat-report" "--emit" "metadata" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unsafe/unsafe-move-val-init" "-A" "unused" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unsafe/unsafe-move-val-init/auxiliary"
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |     intrinsics::move_val_init(1 as *mut u32, 1);
   |                 ^^^^^^^^^^^^^ not found in `intrinsics`
error: aborting due to previous error

For more information about this error, try `rustc --explain E0425`.

---

Some tests failed in compiletest suite=ui mode=ui host=x86_64-unknown-linux-gnu target=x86_64-unknown-linux-gnu


command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--suite" "ui" "--mode" "ui" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-9/bin/FileCheck" "--nodejs" "/usr/bin/node" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "9.0.0" "--llvm-components" "aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets amdgpu amdgpuasmparser amdgpucodegen amdgpudesc amdgpudisassembler amdgpuinfo amdgpuutils analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter bpf bpfasmparser bpfcodegen bpfdesc bpfdisassembler bpfinfo codegen core coroutines coverage debuginfocodeview debuginfodwarf debuginfogsym debuginfomsf debuginfopdb demangle dlltooldriver engine executionengine fuzzmutate globalisel hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interpreter ipo irreader jitlink lanai lanaiasmparser 
lanaicodegen lanaidesc lanaidisassembler lanaiinfo libdriver lineeditor linker lto mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts object objectyaml option orcjit passes perfjitevents powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo riscvutils runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target textapi transformutils vectorize webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info x86utils xcore xcorecodegen xcoredesc xcoredisassembler xcoreinfo xray" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"


failed to run: /checkout/obj/build/bootstrap/debug/bootstrap --stage 2 test --exclude src/tools/tidy
Build completed unsuccessfully in 0:17:10

@bors
Contributor

bors commented Dec 22, 2020

☀️ Try build successful - checks-actions
Build commit: 035e759b99e57ac8055a4d0c71b48e2ceb0beb36 (035e759b99e57ac8055a4d0c71b48e2ceb0beb36)

@rust-timer
Collaborator

Queued 035e759b99e57ac8055a4d0c71b48e2ceb0beb36 with parent 793931f, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 22, 2020
@rust-timer
Collaborator

Finished benchmarking try commit (035e759b99e57ac8055a4d0c71b48e2ceb0beb36): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 22, 2020
@RalfJung
Member Author

Let's see what happens if we only call intrinsics directly in write.

@bors try @rust-timer queue

@rust-timer
Collaborator

Awaiting bors try build completion.

@bors
Contributor

bors commented Dec 22, 2020

⌛ Trying commit 9b2ca1ac97baf9624f2c5fb8bff9959e6e91d712 with merge 63c71d4cdc96bd631e4a4da9b77587035aac443b...

@rust-log-analyzer
Collaborator

The job x86_64-gnu-llvm-9 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
.................................................................................................... 9000/11196
.................................................................................................... 9100/11196
......................................................................................i......i...... 9200/11196
.................................................................................................... 9300/11196
.........................iiiiii..iiiiii.i........................................................... 9400/11196
.................................................................................................... 9600/11196
.................................................................................................... 9700/11196
.................................................................................................... 9800/11196
.................................................................................................... 9900/11196
---
failures:

---- [ui] ui/intrinsics/intrinsic-move-val-cleanups.rs stdout ----

error: test compilation failed although it shouldn't!
status: exit code: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/intrinsics/intrinsic-move-val-cleanups.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zemit-future-incompat-report" "-C" "prefer-dynamic" "-o" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/intrinsics/intrinsic-move-val-cleanups/a" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/intrinsics/intrinsic-move-val-cleanups/auxiliary"
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ LogOnDrop(&acq, "drop temp LOD", 2); &mut dest_a },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init(&mut dest_b, { LogOnDrop(&acq, "drop temp LOD", 3);
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ panic!("every test ends in a panic") },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ LogOnDrop(&acq, "drop temp LOD", 2); &mut dest_a },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`

error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |         intrinsics::move_val_init({ LogOnDrop(&acq, "drop temp LOD", 2); &mut dest_a },
   |                     ^^^^^^^^^^^^^ not found in `intrinsics`
error: aborting due to 5 previous errors

For more information about this error, try `rustc --explain E0425`.


------------------------------------------


---- [ui] ui/unsafe/unsafe-move-val-init.rs stdout ----
diff of stderr:

- error[E0133]: dereference of raw pointer is unsafe and requires unsafe function or block
-   --> $DIR/unsafe-move-val-init.rs:8:5
+ error[E0425]: cannot find function `move_val_init` in module `intrinsics`
3    |
3    |
4 LL |     intrinsics::move_val_init(1 as *mut u32, 1);
-    |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dereference of raw pointer
-    |
-    = note: raw pointers may be NULL, dangling or unaligned; they can violate aliasing rules and cause data races: all of these are undefined behavior
+    |                 ^^^^^^^^^^^^^ not found in `intrinsics`
9 error: aborting due to previous error
10 

- For more information about this error, try `rustc --explain E0133`.
+ For more information about this error, try `rustc --explain E0425`.
12 


The actual stderr differed from the expected stderr.
Actual stderr saved to /checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unsafe/unsafe-move-val-init/unsafe-move-val-init.stderr
To update references, rerun the tests and pass the `--bless` flag
To only update this specific test, also pass `--test-args unsafe/unsafe-move-val-init.rs`
error: 1 errors occurred comparing output.
status: exit code: 1
command: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "/checkout/src/test/ui/unsafe/unsafe-move-val-init.rs" "-Zthreads=1" "--target=x86_64-unknown-linux-gnu" "--error-format" "json" "-Zui-testing" "-Zdeduplicate-diagnostics=no" "-Zemit-future-incompat-report" "--emit" "metadata" "-C" "prefer-dynamic" "--out-dir" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unsafe/unsafe-move-val-init" "-A" "unused" "-Crpath" "-O" "-Cdebuginfo=0" "-Zunstable-options" "-Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "-L" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui/unsafe/unsafe-move-val-init/auxiliary"
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
error[E0425]: cannot find function `move_val_init` in module `intrinsics`
   |
   |
LL |     intrinsics::move_val_init(1 as *mut u32, 1);
   |                 ^^^^^^^^^^^^^ not found in `intrinsics`
error: aborting due to previous error

For more information about this error, try `rustc --explain E0425`.

---

Some tests failed in compiletest suite=ui mode=ui host=x86_64-unknown-linux-gnu target=x86_64-unknown-linux-gnu


command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/compiletest" "--compile-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" "--run-lib-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" "--rustc-path" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" "--src-base" "/checkout/src/test/ui" "--build-base" "/checkout/obj/build/x86_64-unknown-linux-gnu/test/ui" "--stage-id" "stage2-x86_64-unknown-linux-gnu" "--suite" "ui" "--mode" "ui" "--target" "x86_64-unknown-linux-gnu" "--host" "x86_64-unknown-linux-gnu" "--llvm-filecheck" "/usr/lib/llvm-9/bin/FileCheck" "--nodejs" "/usr/bin/node" "--host-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--target-rustcflags" "-Crpath -O -Cdebuginfo=0 -Zunstable-options  -Lnative=/checkout/obj/build/x86_64-unknown-linux-gnu/native/rust-test-helpers" "--docck-python" "/usr/bin/python2.7" "--lldb-python" "/usr/bin/python2.7" "--gdb" "/usr/bin/gdb" "--quiet" "--llvm-version" "9.0.0" "--llvm-components" "aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets amdgpu amdgpuasmparser amdgpucodegen amdgpudesc amdgpudisassembler amdgpuinfo amdgpuutils analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter bpf bpfasmparser bpfcodegen bpfdesc bpfdisassembler bpfinfo codegen core coroutines coverage debuginfocodeview debuginfodwarf debuginfogsym debuginfomsf debuginfopdb demangle dlltooldriver engine executionengine fuzzmutate globalisel hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interpreter ipo irreader jitlink lanai lanaiasmparser 
lanaicodegen lanaidesc lanaidisassembler lanaiinfo libdriver lineeditor linker lto mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts object objectyaml option orcjit passes perfjitevents powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo riscvutils runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target textapi transformutils vectorize webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info x86utils xcore xcorecodegen xcoredesc xcoredisassembler xcoreinfo xray" "--system-llvm" "--cc" "" "--cxx" "" "--cflags" "" "--adb-path" "adb" "--adb-test-dir" "/data/tmp/work" "--android-cross-path" "" "--color" "always"


failed to run: /checkout/obj/build/bootstrap/debug/bootstrap --stage 2 test --exclude src/tools/tidy
Build completed unsuccessfully in 0:17:08

@bors
Contributor

bors commented Dec 22, 2020

☀️ Try build successful - checks-actions
Build commit: 63c71d4cdc96bd631e4a4da9b77587035aac443b (63c71d4cdc96bd631e4a4da9b77587035aac443b)

@rust-timer
Collaborator

Queued 63c71d4cdc96bd631e4a4da9b77587035aac443b with parent 75e1acb, future comparison URL.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 22, 2020
@rust-timer
Collaborator

Finished benchmarking try commit (63c71d4cdc96bd631e4a4da9b77587035aac443b): comparison url.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying rollup- to bors.

Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 22, 2020
@RalfJung
Member Author

Now perf is looking clean (but @bjorn3 remarked that the impact would likely be largest when running debug builds).

@bjorn3
Member

bjorn3 commented Dec 25, 2020

Perf results for ebobby/simple-raytracer@804a7a2:

Benchmark #1: ./raytracer_75e1acb
  Time (mean ± σ):      8.040 s ±  0.019 s    [User: 8.037 s, System: 0.004 s]
  Range (min … max):    8.013 s …  8.065 s    10 runs
 
Benchmark #2: ./raytracer_63c71d4
  Time (mean ± σ):      8.086 s ±  0.022 s    [User: 8.083 s, System: 0.003 s]
  Range (min … max):    8.061 s …  8.138 s    10 runs
 
Summary
  './raytracer_75e1acb' ran
    1.01 ± 0.00 times faster than './raytracer_63c71d4'

The perf difference is this small because simple-raytracer spends only a tiny fraction of its time in std::ptr::write. The codegen is noticeably worse, though.

Before:

       │    000000000007ab40 <core::ptr::write>:
       │    _ZN4core3ptr5write17h8616c78d24e53297E():
 32,01 │      sub    $0x18,%rsp
 19,97 │      mov    %rdi,0x8(%rsp)
 16,00 │      mov    %rsi,0x10(%rsp)
 32,02 │      mov    %rsi,(%rdi)
       │      add    $0x18,%rsp
       │    ← retq

After:

       │    000000000007ac00 <core::ptr::write>:
       │    _ZN4core3ptr5write17h8616c78d24e53297E():
 18,93 │      sub    $0x28,%rsp
  8,13 │      mov    %rsi,(%rsp)
 18,96 │      mov    %rdi,0x10(%rsp)
       │      movb   $0x0,0xf(%rsp)
  5,42 │      movb   $0x1,0xf(%rsp)
 10,72 │      mov    (%rsp),%rax
 29,71 │      mov    %rax,(%rdi)
  8,13 │      movb   $0x0,0xf(%rsp)
       │      add    $0x28,%rsp
       │    ← retq 

@RalfJung
Member Author

@bjorn3 what kind of build of the raytracer is that (debug/release, which codegen backend)?

copy_nonoverlapping should directly lower to a memcpy. OTOH, looks like move_val_init is already lowered to an assignment during MIR building (but separate from @tmiasko's new pass). Looks like MIR assignments are lowered better than copy_nonoverlapping?

@bjorn3
Member

bjorn3 commented Dec 26, 2020

> @bjorn3 what kind of build of the raytracer is that (debug/release, which codegen backend)?

Debug build using cg_llvm.

> copy_nonoverlapping should directly lower to a memcpy. OTOH, looks like move_val_init is already lowered to an assignment during MIR building (but separate from @tmiasko's new pass). Looks like MIR assignments are lowered better than copy_nonoverlapping?

Much better. copy_nonoverlapping results in several assignments and an extra taken reference in addition to the intrinsic call. It is even considered as capable of unwinding.

#![feature(core_intrinsics)]
pub unsafe fn write<T>(dst: *mut T, src: T) {
    std::intrinsics::move_val_init(&mut *dst, src)
}

fn write(_1: *mut T, _2: T) -> () {
    debug dst => _1;
    debug src => _2;
    let mut _0: ();
    let mut _3: *mut T;
    let mut _4: &mut T;

    bb0: {
        StorageLive(_4);
        _4 = &mut (*_1);
        _3 = &raw mut (*_4);
        (*_3) = move _2;
        StorageDead(_4);
        return;
    }
}

#![feature(core_intrinsics)]
pub unsafe fn write<T>(dst: *mut T, src: T) {
    std::intrinsics::copy_nonoverlapping(&src as *const T, dst, 1);
    std::intrinsics::forget(src);
}

fn write(_1: *mut T, _2: T) -> () {
    debug dst => _1;
    debug src => _2;
    let mut _0: ();
    let _3: ();
    let mut _4: *const T;
    let _5: &T;
    let mut _6: *mut T;
    let mut _7: bool;

    bb0: {
        _7 = const false;
        _7 = const true;
        StorageLive(_3);
        StorageLive(_4);
        StorageLive(_5);
        _5 = &_2;
        _4 = &raw const (*_5);
        StorageLive(_6);
        _6 = _1;
        _3 = copy_nonoverlapping::<T>(move _4, move _6, const 1_usize) -> [return: bb1, unwind: bb4];
    }

    bb1: {
        StorageDead(_6);
        StorageDead(_4);
        StorageDead(_5);
        StorageDead(_3);
        _7 = const false;
        _0 = const ();
        return;
    }

    bb2 (cleanup): {
        resume;
    }

    bb3 (cleanup): {
        drop(_2) -> bb2;
    }

    bb4 (cleanup): {
        switchInt(_7) -> [false: bb2, otherwise: bb3];
    }
}
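
The `_7` local in the second MIR dump appears to be a drop flag for `src`: on the unwind path it decides whether `_2` still needs dropping. On the normal path, `forget(src)` is what keeps the value from being dropped in the callee. A small sketch of that ownership hand-off, using a hypothetical `CountDrops` type and a hypothetical `write_via_copy` helper (neither is from this PR):

```rust
use std::cell::Cell;
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Hypothetical type that counts how many times it is dropped.
pub struct CountDrops<'a>(pub &'a Cell<u32>);

impl Drop for CountDrops<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Same shape as the `copy_nonoverlapping` + `forget` snippet above.
///
/// # Safety
/// `dst` must be valid for writes and properly aligned for `T`.
pub unsafe fn write_via_copy<T>(dst: *mut T, src: T) {
    unsafe { ptr::copy_nonoverlapping(&src as *const T, dst, 1) };
    // Without this, `src` would be dropped here and again via `*dst`.
    mem::forget(src);
}

/// Returns the drop count right after the write, and again after the
/// destination has been dropped.
pub fn drop_counts() -> (u32, u32) {
    let drops = Cell::new(0);
    let mut slot = MaybeUninit::<CountDrops>::uninit();
    unsafe { write_via_copy(slot.as_mut_ptr(), CountDrops(&drops)) };
    let after_write = drops.get();
    // The destination now owns the value; dropping it runs the destructor once.
    unsafe { drop(slot.assume_init()) };
    (after_write, drops.get())
}
```

Here `drop_counts()` returns `(0, 1)`: nothing is dropped by the write itself, and the value is dropped exactly once through the destination.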

@bors
Contributor

bors commented Jan 16, 2021

⌛ Testing commit a5b89a0 with merge 492b83c...

@bors
Contributor

bors commented Jan 16, 2021

☀️ Test successful - checks-actions
Approved by: lcnr
Pushing 492b83c to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jan 16, 2021
@bors bors merged commit 492b83c into rust-lang:master Jan 16, 2021
@rustbot rustbot added this to the 1.51.0 milestone Jan 16, 2021
@therealprof
Contributor

There's a noticeable binary size regression between rustc 1.51.0-nightly (e38fb306b 2021-01-14) and rustc 1.51.0-nightly (4253153db 2021-01-17) on thumbv6m-none-eabi in dev mode. However, looking at the assembly, I have trouble accounting for all of it:

Section size:
   text    data     bss     dec     hex filename
   9820       0       4    9824    2660 blinky

vs

Section size:
   text    data     bss     dec     hex filename
   9884       0       4    9888    26a0 blinky

The only observable difference I can spot is:

 08000936 <core::ptr::write>:
- 8000936:      b083            sub     sp, #12
- 8000938:      9001            str     r0, [sp, #4]
- 800093a:      9102            str     r1, [sp, #8]
- 800093c:      6001            str     r1, [r0, #0]
- 800093e:      b003            add     sp, #12
- 8000940:      4770            bx      lr
+ 8000936:      b082            sub     sp, #8
+ 8000938:      9100            str     r1, [sp, #0]
+ 800093a:      9001            str     r0, [sp, #4]
+ 800093c:      9900            ldr     r1, [sp, #0]
+ 800093e:      6001            str     r1, [r0, #0]
+ 8000940:      b002            add     sp, #8
+ 8000942:      4770            bx      lr

@RalfJung RalfJung deleted the less-intrinsic-write branch January 18, 2021 10:15
@RalfJung
Member Author

@therealprof so is the binary of rustc itself bigger, or is some rustc-generated binary bigger? Is this a debug build or a release build?

https://github.com/kennytm/rustup-toolchain-install-master could be used to confirm that it is this PR vs some other PR that landed that day.

@therealprof
Contributor

> @therealprof so is the binary of rustc itself bigger, or is some rustc-generated binary bigger? Is this a debug build or a release build?

Generated binaries in dev (or debug mode if you prefer) are larger.

> https://github.com/kennytm/rustup-toolchain-install-master could be used to confirm that it is this PR vs some other PR that landed that day.

Sorry, I don't have time to bisect this.

I was just browsing the recently merged PRs and the mention of regressions piqued my interest so I decided to run my tools (https://github.com/stm32-rs/stm32f0xx-hal/blob/master/tools/capture_nightly_example_bloat.sh) on the latest nightly and sure enough I can see regressions happening between rustc 1.51.0-nightly (e38fb306b 2021-01-14) and rustc 1.51.0-nightly (4253153db 2021-01-17).

@RalfJung
Member Author

Some regression for debug builds was expected with this PR. Depending on how much this matters, I sketched an idea for how to mitigate this:

The debug performance regression could likely be fixed by adding a post-drop-elaboration MIR optimization pass which replaces copy_nonoverlapping(src, dst, 1) by *dst = *src.
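
To make the proposed rewrite concrete, here is a rough surface-level illustration, not an actual MIR pass. It is restricted to `T: Copy` because a plain assignment in surface Rust would also drop the old `*dst`, which a MIR-level `*dst = *src` would not. The function names are invented for this sketch.

```rust
use std::ptr;

/// Single-element copy through the intrinsic-backed function.
///
/// # Safety
/// `src` must be valid for reads, `dst` valid for writes, both aligned,
/// and the two must not overlap.
pub unsafe fn via_intrinsic<T: Copy>(src: *const T, dst: *mut T) {
    unsafe { ptr::copy_nonoverlapping(src, dst, 1) }
}

/// The same copy written as a plain assignment through the pointers.
///
/// # Safety
/// Same requirements as `via_intrinsic`.
pub unsafe fn via_assignment<T: Copy>(src: *const T, dst: *mut T) {
    unsafe { *dst = *src }
}

/// Runs both variants on the same input and returns both results.
pub fn demo() -> (u64, u64) {
    let src = 0xDEAD_BEEF_u64;
    let (mut a, mut b) = (0_u64, 0_u64);
    unsafe {
        via_intrinsic(&src, &mut a);
        via_assignment(&src, &mut b);
    }
    (a, b)
}
```

Both variants leave the same value in the destination; the idea sketched above is that the assignment form should lower to simpler code in unoptimized builds.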

@therealprof
Contributor

Well, debug mode in Rust is atrocious in every regard compared to other languages used for embedded development. The binaries are huge (often too big to fit into the flash of smaller microcontrollers) and slow (some peripherals like USB can't even be used since the generated code is some orders of magnitude too slow to react to USB events).

Every little improvement helps! I'd be happy to test and benchmark any changes once they've landed in nightly. I've collected a nice dataset over different versions in the current format (each stable release back to 1.41 and quite a few more nightlies in between).

@RalfJung
Member Author

RalfJung commented Jan 18, 2021

The reason this PR was deemed acceptable is that the issue should only affect debug builds. Anybody who cares about such issues would use release (or size-optimized) builds, we figured, and those should not be affected.

Why does the size of debug builds matter so much to you?

(Btw, discussions in a closed PR are bound to get lost, so if you think something should be done here or if you think there's something worth tracking, please open an issue.)

> Every little improvement helps! I'd be happy to test and benchmark any changes once they've landed in nightly.

I won't have time to work on this, but maybe someone from @rust-lang/wg-mir-opt would be interested. Cc @tmiasko who recently added a closely related intrinsic lowering MIR pass.

@therealprof
Contributor

The reason this PR was deemed acceptable is that the issue should only affect debug builds. Anybody who cares about such issues would use release (or size-optimized) builds, we figured, and those should not be affected.

I can confirm release builds are not affected.

Why does the size of debug builds matter so much to you?

Well, for starters, only debug builds are really debuggable (in an embedded context), and debugging with a debugger has extra relevance there because interaction capabilities are limited compared with a regular application. Release builds are also built with all the optimisation features Rust has to offer, which makes them really slow to compile. And then there's the usual trap for young players: the default build mode is dev mode, so beginners frequently end up in debug mode; best case, they'll get a linker error telling them about their mistake...

(Btw, discussions in a closed PR are bound to get lost, so if you think something should be done here or if you think there's something worth tracking, please open an issue.)

Sure thing.

@usbalbin usbalbin mentioned this pull request Jan 18, 2021
@rylev
Member

rylev commented Jan 20, 2021

@RalfJung @lcnr There seems to be a small compile-time perf regression after all in regex-debug full builds. Looks like LLVM_module_codegen_emit_obj regressed. Thoughts on whether addressing this is worth it?

@oli-obk
Contributor

oli-obk commented Jan 20, 2021

Considering that LLVM_module_optimize_module_passes now takes half the time, I think LLVM now fails to optimize something, which then causes more LLVM object code to be emitted. This could also have a runtime impact due to missed optimizations.

Now... this is only for regex-debug, and we generally don't put too much effort into making debug builds be efficient, so the runtime perf loss is acceptable, but it is still not great that we lost some compile-time. I guess regex uses ptr::write a lot (or at least "for a lot of different T").

I don't think it is possible to regain this perf directly. We may get it back via always-run MIR optimizations that turn this ptr::write into basic MIR statements.

@therealprof
Contributor

cf. #81163 for ongoing discussion

Now... this is only for regex-debug, and we generally don't put too much effort into making debug builds be efficient, so the runtime perf loss is acceptable,

I very much disagree with this statement. The performance and binary size of debug builds are a huge problem for embedded Rust already; regressions are definitely not acceptable for us.

@RalfJung
Member Author

Also see #80290 (comment) for an approach that should regain perf. write wasn't itself an intrinsic before, so I don't think it is necessary to make it an intrinsic now.

@rylev
Member

rylev commented Jan 20, 2021

Should we create an issue for #80290 (comment) to make sure it's not lost?

@RalfJung
Member Author

Yeah, if people care enough, there should be a version of #81163 for ptr::write.

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 13, 2021
directly expose copy and copy_nonoverlapping intrinsics

This effectively un-does rust-lang#57997. That should help with `ptr::read` codegen in debug builds (and any other of these low-level functions that bottom out at `copy`/`copy_nonoverlapping`), where the wrapper function will not get inlined. See the discussion in rust-lang#80290 and rust-lang#81163.

Cc `@bjorn3` `@therealprof`
flip1995 pushed a commit to flip1995/rust-clippy that referenced this pull request Feb 25, 2021
directly expose copy and copy_nonoverlapping intrinsics

This effectively un-does rust-lang/rust#57997. That should help with `ptr::read` codegen in debug builds (and any other of these low-level functions that bottom out at `copy`/`copy_nonoverlapping`), where the wrapper function will not get inlined. See the discussion in rust-lang/rust#80290 and rust-lang/rust#81163.

Cc `@bjorn3` `@therealprof`
Zalathar added a commit to Zalathar/rust that referenced this pull request Jul 25, 2024
The test mentioned by this comment was deleted long ago by
<rust-lang#80290>.
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.