document how to do good performance work #7541
Merged
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| <!-- spell-checker:ignore taskset --> | ||
|
|
||
| # Performance Profiling Tutorial | ||
|
|
||
| ## Effective Benchmarking with Hyperfine | ||
|
|
||
| [Hyperfine](https://github.com/sharkdp/hyperfine) is a powerful command-line benchmarking tool that allows you to measure and compare execution times of commands with statistical rigor. | ||
|
|
||
| ### Benchmarking Best Practices | ||
|
|
||
| When evaluating performance improvements, always set up your benchmarks to compare: | ||
|
|
||
| 1. The GNU implementation as reference | ||
| 2. The implementation without the change | ||
| 3. The implementation with your change | ||
|
|
||
| This three-way comparison shows: | ||
| - How your implementation compares to the reference (GNU) | ||
| - The actual performance impact of your specific change (before versus after) | ||
|
|
||
| ### Example Benchmark | ||
|
|
||
| First, you will need to build the binary in release mode. Debug builds are significantly slower: | ||
|
|
||
| ```bash | ||
| cargo build --features unix --release | ||
| ``` | ||
|
|
||
| ```bash | ||
| # Three-way comparison benchmark | ||
| hyperfine \ | ||
| --warmup 3 \ | ||
| "/usr/bin/ls -R ." \ | ||
| "./target/release/coreutils.prev ls -R ." \ | ||
| "./target/release/coreutils ls -R ." | ||
|
|
||
| # The same comparison, written more compactly with a parameter list (-L): | ||
| hyperfine \ | ||
| --warmup 3 \ | ||
| -L ls /usr/bin/ls,"./target/release/coreutils.prev ls","./target/release/coreutils ls" \ | ||
| "{ls} -R ." | ||
| ``` | ||
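The commands above assume that a `coreutils.prev` binary already exists. One possible way to produce it is sketched below. The baseline branch name `main`, the clean working tree, and the helper names `run` and `build_prev` (plus the `DRY_RUN` switch) are assumptions made for illustration and are not part of the original tutorial.

```bash
#!/usr/bin/env bash
# Sketch: build the baseline binary and keep a copy as coreutils.prev.
# Assumptions: the baseline lives on a branch called "main" and the
# working tree is clean. Set DRY_RUN=1 to print the commands instead
# of running them.
set -eu

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# build_prev [BASE_BRANCH]
build_prev() {
    base="${1:-main}"
    run git switch --detach "$base"            # check out the baseline
    run cargo build --features unix --release  # build it in release mode
    run cp target/release/coreutils target/release/coreutils.prev
    run git switch -                           # return to your branch
    run cargo build --features unix --release  # build your change
}
```

Calling `DRY_RUN=1 build_prev main` prints the five commands without touching the repository, which is a cheap way to check the sequence before running it for real.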
|
|
||
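| ```bash | ||
| # To improve the reproducibility of the results, pin the benchmark to one CPU core | ||
| # by prefixing the hyperfine command with taskset, for example: | ||
| taskset -c 0 hyperfine --warmup 3 "./target/release/coreutils ls -R ." | ||
| ``` | ||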
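`taskset` is a Linux tool (part of util-linux), so a benchmark script that hard-codes it will fail on other systems. A minimal sketch of a wrapper that pins to core 0 only when that is possible follows. The function name `pinned` is hypothetical.

```bash
# Sketch: run a command pinned to CPU core 0 when taskset is usable,
# and run it unpinned otherwise (for example on macOS, or when core 0
# is not available to the current process).
pinned() {
    if command -v taskset >/dev/null 2>&1 && taskset -c 0 true 2>/dev/null; then
        taskset -c 0 "$@"
    else
        "$@"
    fi
}
```

Usage would look like `pinned hyperfine --warmup 3 "./target/release/coreutils ls -R ."`. The commands that hyperfine spawns inherit the CPU affinity of the hyperfine process, so they stay on the same core.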
|
|
||
| ### Interpreting Results | ||
|
|
||
| Hyperfine provides summary statistics including: | ||
| - Mean execution time | ||
| - Standard deviation | ||
| - Min/max times | ||
| - Relative performance comparison | ||
|
|
||
| Look for consistent patterns rather than focusing on individual runs, and be aware of system noise that might affect results. | ||
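As a rough aid for judging whether a difference is more than noise, the sketch below compares two mean times together with their standard deviations, copied by hand from hyperfine's output. The "difference larger than the sum of the standard deviations" rule is a simple heuristic chosen for this illustration, not a statistical test and not something hyperfine itself reports. The function name `compare_means` is hypothetical.

```bash
# Sketch: compare two mean execution times (in seconds).
# Usage: compare_means OLD_MEAN OLD_STDDEV NEW_MEAN NEW_STDDEV
compare_means() {
    awk -v om="$1" -v os="$2" -v nm="$3" -v ns="$4" 'BEGIN {
        diff = om - nm      # positive when the new version is faster
        noise = os + ns     # crude noise estimate (heuristic)
        verdict = (diff > noise || -diff > noise) ? "outside noise" : "within noise"
        printf "speedup: %.2fx (%s)\n", om / nm, verdict
    }'
}
```

For example, `compare_means 1.20 0.02 1.00 0.03` prints `speedup: 1.20x (outside noise)`, while `compare_means 1.00 0.05 0.98 0.04` prints `speedup: 1.02x (within noise)`. A result in the second category is a hint to rerun the benchmark with more runs or on a quieter machine before drawing conclusions.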
|
|
||
| ## Using Samply for Profiling | ||
|
|
||
| [Samply](https://github.com/mstange/samply) is a sampling profiler that helps you identify performance bottlenecks in your code. | ||
|
|
||
| ### Basic Profiling | ||
|
|
||
| ```bash | ||
| # Record a profile; samply then opens it in the Firefox Profiler UI, which includes a flame graph view | ||
| samply record ./target/debug/coreutils ls -R | ||
|
|
||
| # Set the sampling rate explicitly (in Hz) | ||
| samply record --rate 1000 ./target/debug/coreutils seq 1 1000 | ||
| ``` | ||
|
|
||
| ## Workflow: Measuring Performance Improvements | ||
|
|
||
| 1. **Establish baselines**: | ||
| ```bash | ||
| hyperfine --warmup 3 \ | ||
| "/usr/bin/sort large_file.txt" \ | ||
| "our-sort-v1 large_file.txt" | ||
| ``` | ||
|
|
||
| 2. **Identify bottlenecks**: | ||
| ```bash | ||
| samply record ./our-sort-v1 large_file.txt | ||
| ``` | ||
|
|
||
| 3. **Make targeted improvements** based on profiling data | ||
|
|
||
| 4. **Verify improvements**: | ||
| ```bash | ||
| hyperfine --warmup 3 \ | ||
| "/usr/bin/sort large_file.txt" \ | ||
| "our-sort-v1 large_file.txt" \ | ||
| "our-sort-v2 large_file.txt" | ||
| ``` | ||
|
|
||
| 5. **Document performance changes** with concrete numbers: | ||
| ```bash | ||
| hyperfine --export-markdown file.md [...] | ||
| ``` | ||