README claims #1

Closed
sharkdp opened this issue Oct 15, 2020 · 7 comments
Labels
question Further information is requested

Comments

@sharkdp

sharkdp commented Oct 15, 2020

Hi!

Author of hyperfine here. Glad to see that you liked our tool and decided to port it to bash. I have a few questions regarding the statements in the README. Not because I want to claim they are wrong, but because I'm genuinely curious:

produces the same output (with some improvements)

What would be some of these improvements? Maybe we could profit from these in hyperfine as well?

Outputs most of the numbers with greater precision

hyperfine reports all times in millisecond resolution because I don't know how to make a measurement that would be more precise. There is no sense in showing more digits if the actual measurement is not that precise. The problem is that we are spawning an intermediate shell that takes roughly 5 milliseconds on its own. We subtract that time again, but we cannot expect to measure microsecond-resolution execution times on top of a 4.8 ms ± 3.7 ms shell spawning time. This is why hyperfine also shows a warning if commands take less than 5 ms to complete.

To get more precise timings, we would need to get rid of the intermediate shell. This is possible in principle, but would keep us from (easily) running benchmarks like seq 100000 | factor.
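The overhead correction described above can be sketched roughly as follows. This is a minimal Python illustration, not hyperfine's actual code: the names `time_command` and `subtract_shell_overhead` are hypothetical, and it assumes the correction is a plain subtraction of the mean shell-spawn time, with the two spreads combined in quadrature.

```python
import statistics
import subprocess
import time


def time_command(command, runs=10):
    """Time `command` through an intermediate shell; returns seconds per run."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(command, shell=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return samples


def subtract_shell_overhead(command_times, shell_times):
    """Subtract the mean shell-spawn time from the mean command time.

    Assumption: the uncertainties are independent, so the spreads are
    combined in quadrature. The corrected mean is clamped at zero.
    """
    mean = statistics.mean(command_times) - statistics.mean(shell_times)
    spread = (statistics.stdev(command_times) ** 2
              + statistics.stdev(shell_times) ** 2) ** 0.5
    return max(mean, 0.0), spread
```

One way to use it would be to pass `time_command("")` (shell startup only) as `shell_times` and `time_command("seq 100000 | factor")` as `command_times`. When the shell-spawn spread is several milliseconds, the combined spread is at least that large, which is why extra digits below a millisecond carry no information in this setup.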

and outputs more information.

What kind of information would that be? CPU usage?

Supports outputting in ASCII only (no Unicode characters) to support older terminals.

Nice!

Slightly faster when interactive output (the progress bar) is disabled.

Slightly faster than what?

@tdulcet tdulcet added the question Further information is requested label Oct 16, 2020
@tdulcet
Owner

tdulcet commented Oct 16, 2020

What would be some of these improvements?

It outputs the median time and CPU usage. In the summary section, it shows the percentage faster. With interactive output, it shows the current run number and progress percentage. With Unicode output, it uses Unicode smart quotes among other symbols like x̅ for mean and x̃ for median. It also spells out "std dev" for people who may not know the σ symbol. The warnings say how many runs specifically had the problem, so for example instead of "Ignoring non-zero exit code" it will say "Ignoring 5 non-zero exit codes". Without the -i ignore-failure option, if a run fails it will show the actual exit code. It immediately exports the results (sharkdp/hyperfine#306). It also includes numerous other improvements.
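As a rough illustration of two of the items above (the median and the warning that counts failed runs), here is a minimal Python sketch. It is not taken from time.sh or hyperfine: the function names are hypothetical, and the singular wording of the warning is an assumption, since only the plural form is quoted above.

```python
import statistics


def summarize(times, exit_codes):
    """Return summary statistics for one benchmarked command."""
    failures = sum(1 for code in exit_codes if code != 0)
    return {
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "stddev": statistics.stdev(times) if len(times) > 1 else 0.0,
        "min": min(times),
        "max": max(times),
        "failures": failures,
    }


def failure_warning(failures):
    """Build a warning that states how many runs failed, or None if none did."""
    if failures == 0:
        return None
    # Assumption: a singular form is used when exactly one run failed.
    noun = "exit code" if failures == 1 else "exit codes"
    return f"Ignoring {failures} non-zero {noun}"
```

The median is useful next to the mean because a single slow outlier run shifts the mean but barely moves the median.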

I should note that it does not include the parameter options, since Bash supports this natively (no need for me to reinvent the wheel). There are some examples of this in the usage information that I adapted from hyperfine's examples.

hyperfine reports all times in millisecond resolution because I don't know how to make a measurement that would be more precise.

Yes, the Bash time builtin also only supports millisecond resolution, which is what this port uses. I was referring to the calculated numbers, like the (elapsed, user and system) means, standard deviation and speed, which it can output with greater precision.

This port does not need to spawn new shells since Bash is obviously already a shell and can run the commands directly.

What kind of information would that be? CPU usage?

See above.

Slightly faster than what?

Slightly faster than hyperfine. I used hyperfine to benchmark itself and my port. My port is as much as 12% faster from my testing, although typically less than that. I assume this is because it does not need to spawn a new shell for each run.

Thank you again for making hyperfine! Feel free to backport any of my improvements. If you do not already know, you may be interested that hyperfine is actively being used to benchmark GNU factor and the new uutils Rust implementation (see here and here). That is what the screenshot on my README is showing.

@sharkdp
Author

sharkdp commented Oct 17, 2020

Thank you for the detailed response!

It outputs the median time and CPU usage.

Displaying median between min and max seems like a great idea!

In the summary section, it shows the percentage faster

what does that mean?

Without the -i ignore-failure option, if a run fails it will show the actual exit code.

👍

like the (elapsed, user and system) means, standard deviation and speed, which it can output with greater precision.

Well, I believe that printing numbers with greater precision is not necessarily a good thing. If the benchmark result is 205.9 ms ± 1.7 ms, there's really no need to show any more digits.

This port does not need to spawn new shells since Bash is obviously already a shell and can run the commands directly.

👍 see sharkdp/hyperfine#336

Slightly faster than hyperfine. I used hyperfine to benchmark itself and my port. My port is as much as 12% faster from my testing, although typically less than that.

Okay.. 12% faster for commands that run extremely fast, I would guess? When the runtime is actually limited by spawning shells 😄

Thank you again for making hyperfine! Feel free to backport any of my improvements. If you do not already know, you may be interested that hyperfine is actively being used to benchmark GNU factor and the new uutils Rust implementation (see here and here). That is what the screenshot on my README is showing.

Nice, thank you for the references!

@sharkdp
Author

sharkdp commented Oct 17, 2020

Ok, and maybe just to be fair:

Does NOT require installing Rust, downloading dependencies or compiling anything.

Using hyperfine does not require any of these things either. Packages for hyperfine are available for a wide range of distributions: https://github.com/sharkdp/hyperfine#installation. Even if users are on an unsupported OS, we provide pre-compiled binaries for many different architectures (https://github.com/sharkdp/hyperfine/releases). There are statically compiled versions of hyperfine (*-musl-*) that do not have ANY dependencies (in contrast to time.sh, which requires some external tools like awk or sed to be available).

Outputs most of the numbers with greater precision

It somehow indicates that hyperfine would not be as precise, which is not true. We show all significant digits, but not too many.

supports most of the same command line options.

I guess this would be easy to add here, but I really think that --show-output is a very useful option. Both for debugging and for benchmarking commands that are limited by terminal I/O.

Parameter scans are somewhat supported by using bash features, but part of the strength of hyperfine's --parameter-scan benchmarks comes from the fact that the parameter values are stored in the JSON output, which can be used to create plots that show the param values on the x-axis.

@tdulcet
Owner

tdulcet commented Oct 18, 2020

Thank you for the detailed response!

No problem!

what does that mean?

It is just a different representation of how much faster the fastest command is that may be easier for users to comprehend, particularly when all the commands are about the same speed.
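To make the two representations concrete, here is a minimal Python sketch. The exact formula used by the port is not stated in this thread, so the definition below (percentage = (ratio - 1) × 100) is an assumption, and `relative_speed` is a hypothetical name.

```python
def relative_speed(fastest_mean, other_mean):
    """Return (ratio, percent) describing how much faster the fastest command is.

    ratio   -- "N times faster", the form used in a summary line
    percent -- the same comparison as a percentage (assumed definition)
    """
    if fastest_mean <= 0 or other_mean < fastest_mean:
        raise ValueError("expected 0 < fastest_mean <= other_mean")
    ratio = other_mean / fastest_mean
    percent = (ratio - 1.0) * 100.0
    return ratio, percent
```

For mean times of 2.0 s and 2.1 s this gives a ratio of 1.05, or about 5% faster. When commands run at nearly the same speed, the percentage form may be easier to read than a ratio close to 1.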

Well, I believe that printing numbers with greater precision is not necessarily a good thing. If the benchmark result is 205.9 ms ± 1.7 ms, there's really no need to show any more digits.

Sure, when the times are less than one second and the numbers are in milliseconds by default, but that same line in seconds would be 0.206 s ± 0.002 s, which is obviously losing precision. My port would output 0.2059s ± 0.0017s (see the screenshot on my README).
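The rounding effect in this example can be reproduced with a few lines of Python. This is only an illustration of the arithmetic; `format_seconds` is a hypothetical helper, and the spacing of its output is not meant to match either tool exactly.

```python
def format_seconds(mean_ms, stddev_ms, decimals):
    """Format a millisecond measurement in seconds with a fixed number of decimals."""
    mean_s = mean_ms / 1000.0
    stddev_s = stddev_ms / 1000.0
    return f"{mean_s:.{decimals}f} s ± {stddev_s:.{decimals}f} s"


print(format_seconds(205.9, 1.7, 3))  # 0.206 s ± 0.002 s
print(format_seconds(205.9, 1.7, 4))  # 0.2059 s ± 0.0017 s
```

With three decimals, both the mean and the standard deviation lose a significant digit that the millisecond form (205.9 ms ± 1.7 ms) still shows.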

Packages for hyperfine are available for a wide range of distributions

Sure, but none of the major distributions that most people use: Ubuntu, Debian, openSUSE, etc.

Even if users are on non-supported OS, we provide pre-compiled binaries for many different architectures

Yes, but I doubt most people want to use an unsigned binary from an untrusted source. I think most users are probably installing hyperfine with cargo install hyperfine, which obviously requires installing Rust, downloading the dependencies and compiling it.

BTW, I noticed that you just added ARM binaries. I am not sure if you know that Travis CI provides ARM jobs, so you should not have to cross-compile it like you did. They also provide Windows jobs, if you wanted to automate that.

in contrast to time.sh, which requires some external tools like awk or sed to be available

My port is only targeting Linux, where awk and sed are always available. I would love to remove these dependencies, but unfortunately Bash does not support floating-point math (awk) or regular expression replace (sed).

I am open to suggestions for changing the wording to be fairer.

It somehow indicates that hyperfine would not be as precise

See above. Currently with hyperfine you would have to use one of the export options to get better precision.

I guess this would be easy to add here, but I really think that --show-output is a very useful option.

Yes, this is on my list of features to add.

Parameter scans are somewhat supported by using bash features

I believe everything that is supported by hyperfine (--parameter-scan, --parameter-step-size and --parameter-list) is also supported by Bash's brace expansion feature and more.
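As a rough picture of what a parameter scan amounts to, here is a minimal Python sketch. It is not hyperfine's or time.sh's implementation: `expand_parameter_scan` is a hypothetical name, and the `{name}` placeholder handling is a simplified assumption that covers only integer ranges.

```python
def expand_parameter_scan(template, name, start, stop, step=1):
    """Yield (value, command) pairs for an inclusive integer scan.

    Each occurrence of "{name}" in `template` is replaced by the value.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    placeholder = "{" + name + "}"
    for value in range(start, stop + 1, step):
        yield value, template.replace(placeholder, str(value))


for value, command in expand_parameter_scan("sleep {n}", "n", 1, 3):
    print(value, command)
```

Bash brace expansion such as `"sleep "{1..3}` produces the same three command strings. The difference in this sketch is that each command stays paired with the parameter value that produced it, which is the kind of information needed to plot results against the parameter later.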

but part of the strength of hyperfine's --parameter-scan benchmarks comes from the fact that the parameter values are stored in the JSON output

Interesting, I was not aware of this.

@sharkdp
Author

sharkdp commented Oct 24, 2020

Sure, when the times are less than one second and the numbers are in milliseconds by default, but that same line in seconds would be 0.206 s ± 0.002 s, which is obviously losing precision. My port would output 0.2059s ± 0.0017s (see the screenshot on my README).

ok, fair point.

Sure, but none of the major distributions that most people use: Ubuntu, Debian, openSUSE, etc.

We do provide .deb packages for Debian/Ubuntu on the release page. There is also an ITP for Debian here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=908776

openSUSE is arguably declining in popularity, but that's just my personal view (which seems to align with https://distrowatch.com/ page views, if they count as a metric).

Yes, but I doubt most people want to use an unsigned binary from an untrusted source. I think most users are probably installing hyperfine with cargo install hyperfine, which obviously requires installing Rust, downloading the dependencies and compiling it.

We have around 17k downloads from https://crates.io/crates/hyperfine and around 38k downloads from the GitHub release page (according to https://somsubhra.com/github-release-stats/?username=sharkdp&repository=hyperfine), so it seems like the majority of users does not have a problem with downloading an "unsigned" binary (they are downloading it from the official project page, so...).

BTW, I noticed that you just added ARM binaries. I am not sure if you know that Travis CI provides ARM jobs, so you should not have to cross-compile it like you did. They also provide Windows jobs, if you wanted to automate that.

Thanks, I did not know that they provide ARM builds. I do Windows builds on some other projects, but so far, Appveyor was good enough. We could also switch to GitHub Actions, like we did for my bat and pastel projects.

@tdulcet
Owner

tdulcet commented Oct 25, 2020

openSUSE is arguably declining in popularity, but that's just my personal view

That was not an exhaustive list, just a few examples. You could also include: CentOS, Mint, Raspbian, etc.

so it seems like the majority of users does not have a problem with downloading an "unsigned" binary

OK, good point. I guess I was only thinking about Linux users. Although I still think you should sign your binaries with PGP for security. One of the advantages of my port is that it is only about 600 LOC, so users can easily audit it to verify that it is not doing anything malicious. While users can audit your repository (as I have done) and its 15 dependencies, they have no way of knowing if your binaries are actually built from the code in it (or even if they were created by you), without installing Rust, downloading the dependencies and compiling it themselves.

As I said before, I am open to suggestions for changing the wording to be fairer. How about: "Does NOT require running an unsigned binary or installing Rust, downloading dependencies and compiling anything".

@sharkdp
Author

sharkdp commented Oct 25, 2020

No worries. I don't have any hard feelings about the wording 😄.
