@drinkcat drinkcat commented Aug 7, 2025

Optimizing for the %g printf format. I don't particularly care about the use case below, but I'm thinking of moving od to the common formatter, where those optimizations would be welcome.

We're still 20% away from coreutils, but this picks some of the low-hanging fruit.

cargo build --release -p uu_seq && hyperfine -L seq ./target/release/seq,seq,./seq.main "{seq} -f "%g" 0 1e-9 1e-3 > /dev/null"
    Finished `release` profile [optimized] target(s) in 0.13s
Benchmark 1: ./target/release/seq -f %g 0 1e-9 1e-3 > /dev/null
  Time (mean ± σ):     169.5 ms ±   3.3 ms    [User: 168.0 ms, System: 1.0 ms]
  Range (min … max):   164.4 ms … 174.7 ms    17 runs
 
Benchmark 2: seq -f %g 0 1e-9 1e-3 > /dev/null
  Time (mean ± σ):     138.8 ms ±   4.9 ms    [User: 136.7 ms, System: 1.6 ms]
  Range (min … max):   133.8 ms … 153.5 ms    22 runs
 
Benchmark 3: ./seq.main -f %g 0 1e-9 1e-3 > /dev/null
  Time (mean ± σ):     237.5 ms ±   4.9 ms    [User: 235.4 ms, System: 1.4 ms]
  Range (min … max):   232.6 ms … 249.8 ms    12 runs
 
Summary
  seq -f %g 0 1e-9 1e-3 > /dev/null ran
    1.22 ± 0.05 times faster than ./target/release/seq -f %g 0 1e-9 1e-3 > /dev/null
    1.71 ± 0.07 times faster than ./seq.main -f %g 0 1e-9 1e-3 > /dev/null

uucore: num_format: Optimize format_float_shortest

We already know the String length ahead of time, and we can
avoid using format.

Improves performance by about 30% on:

{seq} -f "%g" 0 1e-9 1e-3
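
To illustrate the idea in this commit (this is not the actual uucore code): when the digits and the exponent are already known, the output can be assembled into a pre-sized `String` with `push`/`push_str` instead of going through `format!`. `assemble_scientific` and its parameters are hypothetical names, and the sketch assumes `digits` is a non-empty ASCII digit string.

```rust
/// Hypothetical helper (not from uucore): build `d.ddde±XX` from
/// already-computed significant digits and a decimal exponent.
/// Assumes `digits` is non-empty and contains only ASCII digits.
fn assemble_scientific(digits: &str, exponent: i32) -> String {
    // digits + '.' + 'e' + sign + two exponent digits.
    // The String still grows on its own if the exponent is longer.
    let mut out = String::with_capacity(digits.len() + 5);

    let (first, rest) = digits.split_at(1);
    out.push_str(first);
    if !rest.is_empty() {
        out.push('.');
        out.push_str(rest);
    }

    out.push('e');
    out.push(if exponent < 0 { '-' } else { '+' });

    // printf-style exponents are padded to at least two digits.
    let abs = exponent.unsigned_abs();
    if abs < 10 {
        out.push('0');
    }

    // Write the exponent digits by hand to stay clear of `format!`.
    let mut buf = [0u8; 10]; // u32::MAX has 10 decimal digits
    let mut i = buf.len();
    let mut n = abs;
    loop {
        i -= 1;
        buf[i] = b'0' + (n % 10) as u8;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    for &b in &buf[i..] {
        out.push(b as char);
    }
    out
}
```

With this sketch, `assemble_scientific("123", -7)` yields `1.23e-07`. The capacity is only a hint, so an underestimate costs a reallocation rather than correctness.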

uucore: num_format: Reduce calls to with_prec

with_prec is actually really expensive, so it's much better to
just call it once, and handle the rounding corner case manually.

Improves performance by about 20% on:

{seq} -f "%g" 0 1e-9 1e-3
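
The rounding corner case mentioned here can be sketched on a plain digit string, assuming the case in question is the one where rounding carries through a run of nines (e.g. `9996` rounded to 3 digits becomes `1000`), so the exponent has to be bumped by one. This is only an illustration: `round_digits` is a hypothetical name, it uses half-up rounding for simplicity, and the real code works on `BigDecimal` values rather than strings.

```rust
/// Hypothetical helper (not from uucore): round ASCII decimal digits to
/// `prec` significant digits using half-up rounding.
/// Returns the rounded digits and an exponent adjustment (0 or 1).
fn round_digits(digits: &str, prec: usize) -> (String, i32) {
    let bytes = digits.as_bytes();
    // Nothing to round if we already have few enough digits.
    if prec == 0 || bytes.len() <= prec {
        return (digits.to_string(), 0);
    }

    let mut kept: Vec<u8> = bytes[..prec].to_vec();
    if bytes[prec] >= b'5' {
        // Propagate the carry from the last kept digit towards the first.
        let mut i = prec;
        loop {
            if i == 0 {
                // Every kept digit was a 9: 99..9 -> 100..0.
                // Keep `prec` digits and report the exponent bump.
                kept.insert(0, b'1');
                kept.truncate(prec);
                return (String::from_utf8(kept).unwrap(), 1);
            }
            i -= 1;
            if kept[i] == b'9' {
                kept[i] = b'0';
            } else {
                kept[i] += 1;
                break;
            }
        }
    }
    (String::from_utf8(kept).unwrap(), 0)
}
```

Here `round_digits("9996", 3)` returns `("100", 1)`: the caller rounds once and adds the returned adjustment to the exponent, instead of rounding a second time at the new precision.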

uucore: num_format: Move common scientific formatting code to a function

Both format_float_scientific and format_float_shortest carry the
same code; moving it to a single function will make it possible
to optimize both.

github-actions bot commented Aug 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)

@sylvestre sylvestre merged commit ac527f9 into uutils:main Aug 7, 2025
90 checks passed