-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More copy-able result for @benchmark
#236
Comments
@tecosaur any easy improvements we could do? Maybe add a |
I can see the value in this, it's just a matter of how to best do it. Unfortunately, I've been rather busy the past few weeks, and will be for the immediate future — so you'll see a lot me or me commenting than me doing 😛. |
let me know what the desired syntax / behavior should be and I can give it a try. |
addition: I think |
I suspect we could remove a lot of details without losing much. Has anyone ever got any value out of median GC percentage (whatever that is)? Putting the most important information first, you could do for instance:
And perhaps "0 bytes// GC: min 0.0ns ..." can be omitted entirely when these are all zero, without being a confusing change of layout. Since nobody runs just one benchmark, the case we should design for is having several on screen -- so being compact, and having all the information in one block, seem desirable. The fact that the histogram auto-scales to look nice for one run works against this, e.g. in #237. While it's a bigger change than moving text around, a smart suggestion (Edit -- from #253) was that perhaps the histogram should always have 0 at the left and a power of 10 on the right. Then (very often) they would be comparable between different functions. |
I don't disagree with the concerns, but for the proposed solutions I have a couple of counterpoints:
|
Here's a real-world example of just how bad the standard deviation is for these purposes: julia> show(ioctxfalse, MIME("text/plain"), b)
BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 18.860 μs … 764.893 μs ┊ GC (min … max): 0.00% … 93.99%
Time (median): 19.471 μs ┊ GC (median): 0.00%
Time (mean ± σ): 20.375 μs ± 12.209 μs ┊ GC (mean ± σ): 0.99% ± 1.65%
█▃ ▃▁
▇██▆██▄▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▃▂▂▂▂▂▃▃▃▂▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂ ▃
18.9 μs Histogram: frequency by time 28.2 μs <
Memory estimate: 15.84 KiB, allocs estimate: 4.
julia> meanad(x, med=median(x)) = mean(abs, x .- med)
meanad (generic function with 2 methods)
julia> meanad(b.times) # in nanoseconds
1224.1091
julia> std(b.times)
12209.450625612251 I think anyone looking at that histogram would agree that 1us is much closer to the typical variability than 12us: julia> sum(b.times .< median(b.times) + meanad(b.times)) / length(b.times)
0.8379 since 84% of the samples are less than median + 1 mean absolute deviation (20.7us). |
In defence of brevity, it seems that Unlike the right tail, of course. The |
The mean can be way, way off too. |
I don't mind showing more numbers, it's just trying editing this: BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 18.860 μs … 764.893 μs ┊ GC (min … max): 0.00% … 93.99%
Time (median): 19.471 μs ┊ GC (median): 0.00%
Time (mean ± σ): 20.375 μs ± 12.209 μs ┊ GC (mean ± σ): 0.99% ± 1.65%
█▃ ▃▁
▇██▆██▄▃▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▃▂▂▂▂▂▃▃▃▂▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▂▂▂▂▂▂ ▃
18.9 μs Histogram: frequency by time 28.2 μs <
Memory estimate: 15.84 KiB, allocs estimate: 4. into BenchmarkTools.Trial: 10000 samples with 1 evaluation.
Range (min … max): 18.860 μs … 764.893 μs
Time (median): 19.471 μs
Time (mean ± σ): 20.375 μs ± 12.209 μs
Memory estimate: 15.84 KiB, allocs estimate: 4. takes way too many mouse clicks right now. Try pasting the above example 3 times into some 55 characters wide Slack thread panel. When there's no color encoding, 15 numbers and blocks wrapped around lines is just way too hard to spot key information. |
In a chat now lost, I was persuaded by someone that a fairly common situation is to have (say) 90% of samples almost at the minimum, and 10% a bit higher when GC kicks in (plus perhaps a high maximum). In this kind of distribution the median is less useful than the mean. (Edit: #237 (comment) is a nice example.) The reason to print the histogram is that there's no one or two perfect numbers that are always what you should watch. But 5 (or in fact 7) times seems like a lot, it's hard to scan the page and remember where to look. Something like min/mean/max or min/median/max which has an obvious sequence would I think make it much easier to find the one you want; I guess min/median/mean/max isn't too bad either. 5 different GC numbers (of uncertain meaning) is way too many. Not printing them at all when they are all zero (which is common) would be great. |
Fair enough, if we're showing |
Here's an attempt... with #237 (comment) 's suggestion of
When copied:
|
It's hard to tell because the actual data collected are different, but it seems to reprint the median, not the mean? (Trick: use But in general terms, the new display looks more easily interpretable for most users, and I like the decoration with markers for median and mean in the histogram! |
It was indeed printing things twice, sorry, now fixed. ( |
LGTM! |
I really like the
I haven't worked this into the above, but I think it would be nice to have the stdev of the GC at least. For example, consider these two sets of example GC times
Both have a mean just over 12, but the first has a stdev of 2.3 while the second has a stdev of 24. So, I consider this information helpful in gauging how variable the GC time is. |
I think GC should go below allocation. We always care about time, and we always want to know if the allocation is larger than we expected (sometimes we expect 0, or expect |
Thanks for taking a close look. To explain my thinking a bit:
Another possibility might be
Printing I also wonder if there should be less digits on the histogram ends. Partly in the hope that they'd more often line up between subsequent runs. Presently it rounds to 2 figures. Rounding to 1 means a step from 1.0 to 2.0 which is graphically too big, but maybe 10 steps per decade can be done nicely, not sure.
|
Sometimes users want to copy the result for comparison where
@btime
gives too little information but the current@benchmark
is unfriendly for that.One proposal is to change now:
to
so that people can copy the first 3-4 lines without sending really long, wrapped lines.
Besides I think it's much easier to navigate vertically anyway
The text was updated successfully, but these errors were encountered: