RFC: mem::black_box and mem::clobber
gnzlbg committed Mar 12, 2018
1 parent fd70ea3 commit 8a9ae3f
Showing 1 changed file with 138 additions and 0 deletions.
138 changes: 138 additions & 0 deletions text/0000-bench-utils.md
@@ -0,0 +1,138 @@
- Feature Name: black_box-and-clobber
- Start Date: 2018-03-12
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

This RFC adds two functions to `core::mem`: `black_box` and `clobber`, which are
mainly useful for writing benchmarks.

# Motivation
[motivation]: #motivation

The `black_box` and `clobber` functions are useful for writing synthetic
benchmarks where, due to the constrained nature of the benchmark, the compiler
is able to perform optimizations that wouldn't otherwise trigger in practice.

The implementation of these functions is backend-specific and requires inline
assembly. If the standard library does not provide them, users are forced to
rely on brittle workarounds that only work on nightly.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation


## `mem::black_box`

The function:

```rust
pub fn black_box<T>(x: T) -> T;
```

prevents the value `x` from being optimized away and flushes pending reads/writes
to memory. It does not prevent optimizations on the expression generating the
value `x` nor on the return value of the function. For
example ([`rust.godbolt.org`](https://godbolt.org/g/YP2GCJ)):

```rust
fn foo(x: i32) -> i32 {
    mem::black_box(2 + x);
    3
}
let a = foo(2);
```

Here, the compiler can simplify the expression `2 + x` into `2 + 2` and then
`4`, but it is not allowed to discard `4`. Instead, it must store `4` into a
register even though it is not used by anything afterwards.
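
The behavior described above can be tried out on current stable Rust with `std::hint::black_box`, used here only as a stand-in: it is not the `mem::black_box` proposed in this RFC, and its documentation describes it as a best-effort hint rather than a guarantee. A minimal sketch under that assumption:

```rust
// Assumption: `std::hint::black_box` stands in for the proposed
// `mem::black_box`; it is a best-effort optimization barrier.
use std::hint::black_box;

fn foo(x: i32) -> i32 {
    // The sum may still be constant-folded once `foo` is inlined,
    // but the resulting value should not be discarded.
    black_box(2 + x);
    3
}

fn main() {
    let a = foo(2);
    println!("{a}");
}
```

Comparing the generated assembly with and without the `black_box` call is the simplest way to observe the difference.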

## `mem::clobber`

The function:

```rust
pub fn clobber() -> ();
```

flushes all pending writes to memory. Memory managed by block-scoped objects
must be "escaped" with `black_box`.

Using `mem::{black_box, clobber}` we can benchmark `Vec::push` as follows:

```rust
fn bench_vec_push_back(bench: Bencher) -> BenchResult {
    let n = /* large enough number */;
    let mut v = Vec::with_capacity(n);
    bench.iter(|| {
        // Escape the vector pointer:
        mem::black_box(v.as_ptr());
        // `Vec` has no `push_back` method; the Rust equivalent is `push`:
        v.push(42_u8);
        // Flush the write of 42 to memory:
        mem::clobber();
    })
}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

* `mem::black_box(x)`: flushes all pending reads/writes to memory and prevents
  `x` from being optimized away while still allowing optimizations on the
  expression that generates `x`.
* `mem::clobber`: flushes all pending writes to memory.
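
Since the motivation states that these functions require inline assembly, the two definitions above could be sketched with the stable `asm!` macro as follows. This is an illustration under assumptions, not the implementation proposed by this RFC: the function bodies are hypothetical, `asm!` is only available on some architectures (for example x86_64 and AArch64), and the sketch relies on `asm!` being treated as reading and writing arbitrary memory unless `nomem` is given.

```rust
use std::arch::asm;

/// Hypothetical stand-in for the proposed `mem::black_box`.
#[inline(always)]
pub fn black_box<T>(x: T) -> T {
    // SAFETY: the template is only an assembler comment, so no
    // instruction is emitted. Passing the address of `x` as an input
    // makes the compiler treat `x` as observed by the asm block.
    unsafe {
        asm!(
            "/* {0} */",
            in(reg) &x as *const T as *const u8,
            options(nostack, preserves_flags)
        );
    }
    x
}

/// Hypothetical stand-in for the proposed `mem::clobber`.
#[inline(always)]
pub fn clobber() {
    // SAFETY: empty template. Without `nomem`, the compiler must assume
    // the block reads and writes memory, so pending writes to escaped
    // memory have to be materialized before this point.
    unsafe {
        asm!("", options(nostack, preserves_flags));
    }
}

fn main() {
    let n = 1_000;
    let mut v: Vec<u8> = Vec::with_capacity(n);
    for _ in 0..n {
        // Escape the vector pointer, push, then flush the write.
        black_box(v.as_ptr());
        v.push(42_u8);
        clobber();
    }
    assert_eq!(v.len(), n);
}
```

Neither function emits an instruction of its own, which matches the RFC's claim that they are no-ops at run time; their only effect is on what the optimizer is allowed to assume.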

# Drawbacks
[drawbacks]: #drawbacks

TBD.

# Rationale and alternatives
[alternatives]: #alternatives

An alternative design was proposed during the discussion on
[rust-lang/rfcs/issues/1484](https://github.com/rust-lang/rfcs/issues/1484), in
which the following two functions are provided instead:

```rust
#[inline(always)]
pub fn value_fence<T>(x: T) -> T {
    let y = unsafe { (&x as *const T).read_volatile() };
    std::mem::forget(x);
    y
}

#[inline(always)]
pub fn evaluate_and_drop<T>(x: T) {
    unsafe {
        let mut y = std::mem::uninitialized();
        std::ptr::write_volatile(&mut y as *mut T, x);
        drop(y); // not necessary but for clarity
    }
}
```

This approach is not pursued in this RFC because these two functions:

* add overhead ([`rust.godbolt.org`](https://godbolt.org/g/aCpPfg)): `volatile`
  reads and stores are not no-ops, but the proposed `black_box` and `clobber`
  functions are.
* are implementable on stable Rust: while we could add them to `std`, they do not
  necessarily need to be there.
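
To illustrate the second point, the alternative functions can be written against today's stable standard library. The sketch below is an adaptation, not the code from the linked discussion: it assumes `MaybeUninit` as a replacement for the now-deprecated `std::mem::uninitialized`, and keeps the same volatile read and write.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

#[inline(always)]
pub fn value_fence<T>(x: T) -> T {
    // SAFETY: `&x` is a valid, aligned pointer to an initialized `T`.
    // The original is forgotten below, so the value is not duplicated.
    let y = unsafe { ptr::read_volatile(&x) };
    mem::forget(x);
    y
}

#[inline(always)]
pub fn evaluate_and_drop<T>(x: T) {
    let mut slot = MaybeUninit::<T>::uninit();
    // SAFETY: `slot` is valid for writes of `T`; after the volatile
    // write it holds an initialized value, which is dropped exactly once.
    unsafe {
        ptr::write_volatile(slot.as_mut_ptr(), x);
        slot.assume_init_drop();
    }
}

fn main() {
    let v = value_fence(21_u64 * 2);
    evaluate_and_drop(String::from("dropped after the volatile write"));
    println!("{v}");
}
```

Unlike `black_box` and `clobber`, both functions perform a real volatile memory access, which is the overhead mentioned in the first point.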

# Prior art
[prior-art]: #prior-art

These two exact functions are provided by the [`Google
Benchmark`](https://github.com/google/benchmark) C++ library, where they are called
[`DoNotOptimize`](https://github.com/google/benchmark/blob/61497236ddc0d797a47ef612831fb6ab34dc5c9d/include/benchmark/benchmark.h#L306)
(`black_box`) and
[`ClobberMemory`](https://github.com/google/benchmark/blob/61497236ddc0d797a47ef612831fb6ab34dc5c9d/include/benchmark/benchmark.h#L317)
(`clobber`).
A `black_box` function with slightly different semantics is provided by the `test` crate:
[`test::black_box`](https://github.com/rust-lang/rust/blob/master/src/libtest/lib.rs#L1551).

# Unresolved questions
[unresolved]: #unresolved-questions

TBD.
