Commit ffb8d99

bench: add run outputs
This makes it easy to link to benchmarks when someone asks, but also serves as a good way to archive benchmark data at defined points for comparison later. We also make a (feeble) attempt at putting a "pretty" version of a subset of benchmarks in the README of each run directory.
1 parent 1233467 commit ffb8d99

9 files changed: +77242 -5 lines changed

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ repository = "https://github.com/BurntSushi/rust-memchr"
 readme = "README.md"
 keywords = ["memchr", "char", "scan", "strchr", "string"]
 license = "Unlicense/MIT"
-exclude = ["/ci/*", "/.travis.yml", "/Makefile", "/appveyor.yml"]
+exclude = ["/bench", "/.github", "/fuzz"]
 edition = "2018"

 [workspace]

bench/README.md

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
This directory defines a large suite of benchmarks for both the memchr and
memmem APIs in this crate. A selection of "competitor" implementations is
chosen for comparison. In general, benchmarks are meant to be a tool for
optimization. That's why there are so many: we want to be sure we get enough
coverage such that our benchmarks approximate real world usage. When some
benchmarks look a bit slower than we expect (for one reason or another), we can
use profiling tools to look at codegen and attempt to improve that case.

Because there are so many benchmarks, if you run all of them, you might want to
step away for a cup of coffee (or two). Therefore, the typical way to run them
is to select a subset. For example,

```
$ cargo bench -- 'memmem/krate/.*never.*'
```

runs all benchmarks for the memmem implementation in this crate with searches
that never produce any matches. This will still take a bit, but perhaps only a
few minutes.
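
To see why searches that never match are worth selecting on their own, consider
a minimal sketch of a substring search. This is a naive stand-in written against
the standard library only, not the memmem implementation in this crate, and the
name `naive_find` is hypothetical. When the needle is absent, every candidate
position has to be ruled out, so the search scans the haystack from start to
finish:

```rust
/// Naive substring search (hypothetical helper for illustration only; this is
/// not how the memmem implementation in this crate works).
fn naive_find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    if needle.is_empty() {
        // By convention, the empty needle matches at the start.
        return Some(0);
    }
    // `windows` yields every contiguous candidate of length `needle.len()`.
    // It yields nothing if the haystack is shorter than the needle.
    haystack.windows(needle.len()).position(|w| w == needle)
}

fn main() {
    let haystack = b"the quick brown fox jumps over the lazy dog";
    // A needle that occurs: the search can stop at the first match.
    println!("{:?}", naive_find(haystack, b"quick"));
    // A needle that never occurs: every window is examined before giving up.
    println!("{:?}", naive_find(haystack, b"zebra"));
}
```

The second call is the shape of workload that the `never` benchmarks exercise:
there is no early exit, so throughput over the whole haystack is what gets
measured.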

Running a specific benchmark can be useful for profiling. For example, if you
want to see where `memmem/krate/prebuiltiter/huge-en/common-one-space` is
spending all of its time, you would first want to run it (to make sure the code
is compiled):

```
$ cargo bench -- memmem/krate/prebuiltiter/huge-en/common-one-space
```

And then run it under your profiling tool (I use `perf` on Linux):

```
$ perfr --callgraph cargo bench -- memmem/krate/prebuiltiter/huge-en/common-one-space --profile-time 3
```

Where
[`perfr` is my own wrapper around `perf`](https://github.com/BurntSushi/dotfiles/blob/master/bin/perfr),
and the `--profile-time 3` flag means, "just run the code for 3 seconds, but
don't do anything else." This makes the benchmark harness get out of the way,
which lets the profile focus as much as possible on the code being measured.
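
The idea behind `--profile-time` can be pictured as a loop that calls the
measured routine repeatedly until a time budget runs out, with no statistics
collected along the way. The following is a conceptual sketch only, and it
assumes nothing about how the benchmark harness actually implements the flag.
The names `run_for` and `count_byte` are hypothetical:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Calls `routine` repeatedly until `budget` has elapsed and returns how many
/// times it was called. (Hypothetical helper, for illustration only.)
fn run_for<T>(budget: Duration, mut routine: impl FnMut() -> T) -> u64 {
    let start = Instant::now();
    let mut calls = 0;
    // Always run at least once, then keep going until the budget is spent.
    loop {
        // `black_box` discourages the optimizer from discarding the result.
        black_box(routine());
        calls += 1;
        if start.elapsed() >= budget {
            return calls;
        }
    }
}

/// Stand-in workload: count occurrences of a byte in a haystack.
fn count_byte(haystack: &[u8], byte: u8) -> usize {
    haystack.iter().filter(|&&b| b == byte).count()
}

fn main() {
    let haystack = vec![b'a'; 1 << 20];
    // A profiler attached to this process sees little besides the workload.
    let calls = run_for(Duration::from_secs(3), || {
        count_byte(black_box(haystack.as_slice()), b'z')
    });
    println!("ran the workload {} times", calls);
}
```

Because the loop does no sampling or analysis, nearly all of the samples a
profiler collects land in the workload itself.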

See the README in the `runs` directory for a bit more info on how to use
`critcmp` to look at benchmark data in a way that makes it easy to do
comparisons.

bench/data/README.md

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
This directory contains benchmark corpora. Each sub-directory contains a README
documenting the corpus a bit more.

bench/data/code/README.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
This data contains corpora generated from source code. These sorts of corpora
are important because code is something that is frequently searched.

This corpus was generated by running

```
$ find ./library/alloc -name '*.rs' -print0 \
  | xargs -0 cat > .../memchr/bench/data/code/rust-library.rs
```

in a checkout of the https://github.com/rust-lang/rust repository at commit
78c963945aa35a76703bf62e024af2d85b2796e2.
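
For readers without `find` and `xargs` at hand, roughly the same concatenation
can be sketched with the Rust standard library. This is an approximation under
stated assumptions: the name `concat_rs_files` and the output file name are
hypothetical, and entries are visited in sorted order, which `find` does not
guarantee. The resulting bytes may therefore be ordered differently from the
corpus checked in here:

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Recursively appends the contents of every `*.rs` file under `dir` to `out`.
/// (Hypothetical helper; approximates `find ... -name '*.rs' | xargs cat`.)
fn concat_rs_files(dir: &Path, out: &mut impl Write) -> io::Result<()> {
    let mut entries = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<Vec<_>>>()?;
    // Sort for a deterministic order; `find` itself makes no such promise.
    entries.sort();
    for path in entries {
        if path.is_dir() {
            concat_rs_files(&path, &mut *out)?;
        } else if path.extension().map_or(false, |ext| ext == "rs") {
            out.write_all(&fs::read(&path)?)?;
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Assumed to run from the root of a rust-lang/rust checkout.
    let src = Path::new("./library/alloc");
    if !src.is_dir() {
        eprintln!("run this from the root of a rust-lang/rust checkout");
        return Ok(());
    }
    // Placeholder output name; point it wherever the corpus should live.
    let mut out = File::create("rust-library.rs")?;
    concat_rs_files(src, &mut out)
}
```

Taking any `Write` implementation as the destination keeps the helper easy to
try against an in-memory buffer before writing a real file.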

bench/runs/2021-04-30_initial/README.md

Lines changed: 146 additions & 0 deletions
Large diffs are not rendered by default.
