Use hyperfine and jq to improve evaluate.sh #182

hundredwatt · 2024-01-06T19:40:46Z

For lack of a better name, I called this script evaluate2.sh

$ ./evaluate2.sh
Usage: evaluate2.sh <fork name> (<fork name 2> ...)

hundredwatt · 2024-01-06T19:45:11Z

evaluate2.sh

+
+echo ""
+TEMP_FILE=$(mktemp)
+OPTS="--warmup 1 --runs 5 --export-json $TEMP_FILE"


Note: I configured 1 warmup run. This differs from evaluate.sh, so we could always change this to 0.

hundredwatt · 2024-01-06T19:53:13Z

I added output of the raw times in addition to the trimmed means, e.g.:

fork,trimmed_mean
spullara,0.52814592528
royvanrijn,0.49758428661333337

fork,raw_times
spullara,0.51465599528,0.53491328628,0.52979903628,0.51972545328,0.53921453728
royvanrijn,0.49466057828000004,0.49655945328,0.49698236928,0.4992110372800001,0.50892974428
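For reference, the trimmed means above are consistent with dropping the fastest and slowest of the five runs and averaging the remaining three. A minimal sketch of that calculation, assuming a comma-separated list of times as in the raw_times rows (the helper name trimmed_mean is hypothetical and this uses sort and awk rather than the script's jq code):

```shell
#!/bin/bash
# Hypothetical helper: trimmed mean of a comma-separated list of run times.
# Drops the smallest and the largest value, then averages the rest.
trimmed_mean() {
  echo "$1" | tr ',' '\n' | LC_ALL=C sort -n | LC_ALL=C awk '
    { t[NR] = $1 }                      # values arrive sorted ascending
    END {
      if (NR < 3) { print "need at least 3 values" > "/dev/stderr"; exit 1 }
      for (i = 2; i < NR; i++) sum += t[i]
      printf "%.6f\n", sum / (NR - 2)
    }'
}

# Raw times for spullara from the output above.
trimmed_mean "0.51465599528,0.53491328628,0.52979903628,0.51972545328,0.53921453728"
# prints 0.528146
```

LC_ALL=C is set so that sorting and number formatting do not depend on a locale that uses a decimal comma.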

gunnarmorling · 2024-01-07T10:53:22Z

Very cool! One problem I see is that this moves the time measurement from just the java call to the whole launch script, which penalizes any contenders that call sdk to set up a specific distro (as an example, I see the same time for merykitty as before, but +300 ms for royvanrijn). We'd somehow have to extract this step from the measured part.

hundredwatt · 2024-01-08T23:54:40Z

New look, based on: #105 (comment)

hundredwatt

@gunnarmorling I updated the script, please see my review comments below with some callouts and questions

evaluate2.sh

gunnarmorling

@hundredwatt, thanks, I think this is getting into great shape. A few comments inline. Apart from those, is there a way we can capture the output of the contenders? We'd need that to compare that to the expected output (see process_output.java, which then of course wouldn't have to handle the output of time any more).

hundredwatt · 2024-01-09T15:59:30Z

@gunnarmorling Yes, we can do hyperfine --output <FILE>, I'll take a look at process_output.java shortly

hundredwatt

Addressed all comments from previous reviews

evaluate2.sh

gunnarmorling · 2024-01-09T16:15:12Z

> I'll take a look at process_output.java shortly

Excellent, thx! It's invoked via ./process.sh (and sorry for the code, I know it's a hot mess... ;)

hundredwatt · 2024-01-09T16:15:25Z

Remaining TODO:

- Save output to file
- Incorporate functionality from process_output.java

evaluate2.sh

AlexanderYastrebov · 2024-01-09T16:25:23Z

> I'll take a look at process_output.java shortly

BTW, bash uses its builtin time, while GNU /usr/bin/time supports the TIME environment variable, see https://man7.org/linux/man-pages/man1/time.1.html

bash:

$ time sleep 1

real    0m1,009s
user    0m0,003s
sys     0m0,006s

/usr/bin/time has a quite verbose default format, which could be reduced to just the real (elapsed) time to avoid any kind of parsing.

$ /usr/bin/time sleep 1
0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 2240maxresident)k
0inputs+0outputs (0major+78minor)pagefaults 0swaps

$ export TIME="%e"
$ /usr/bin/time sleep 1
1.00

Note also that the default sh in Ubuntu is dash:

~$ ll /bin/sh
lrwxrwxrwx 1 root root 4 Sep 27  2018 /bin/sh -> dash*

dash does not have a time builtin, so all scripts that use #!/bin/sh and time (not /usr/bin/time) in Ubuntu end up running the external time binary and print:

0.00user 0.00system 0:01.00elapsed 0%CPU (0avgtext+0avgdata 2240maxresident)k
0inputs+0outputs (0major+78minor)pagefaults 0swaps

source sdk also does not work with dash.

I.e., for the next challenge the baseline should stick to #!/bin/bash and use /usr/bin/time; then the output could be configured via TIME :)
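As a side note to the TIME variable above, the bash builtin can be reduced in a similar way through bash's TIMEFORMAT variable, where %R is the elapsed wall-clock time in seconds. A minimal sketch, assuming the script runs under bash (the helper name elapsed_seconds is hypothetical, and this is not what evaluate.sh does):

```shell
#!/bin/bash
# Assumes bash: TIMEFORMAT only affects the bash `time` keyword,
# not /usr/bin/time and not dash.
TIMEFORMAT='%R'   # print only the elapsed (real) time in seconds

# Hypothetical helper: run a command, discard its output,
# and print only the elapsed seconds reported by `time`.
elapsed_seconds() {
  # `time` reports on the shell's stderr, so the report is redirected
  # from the surrounding group rather than from the command itself.
  { time "$@" >/dev/null 2>&1; } 2>&1
}

elapsed_seconds sleep 1
```

The decimal separator in the result follows the current locale, as in the real 0m1,009s output shown earlier.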

evaluate2.sh

hundredwatt · 2024-01-09T16:37:06Z

process_output.java is used as follows:

1. Expected output is created via ./eval.sh baseline; cp baseline.out out_expected.txt
2. Fork output is created via ./eval.sh $fork
3. process.sh is invoked via ./process.sh $fork
4. process_output.java then:
   a. Verifies the printed aggregation matches baseline
   b. Collects the times
   c. Computes the trimmed mean
   d. Prints the leaderboard line in Markdown table format

evaluate2.sh already handles b. and c., so we just need to add a. and d.
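Steps a. and d. could be sketched in bash roughly as follows. This is only an illustration: the function names, the byte-for-byte comparison, and the two-column row layout are assumptions, not the behavior of process_output.java or the final evaluate2.sh.

```shell
#!/bin/bash
# Step a (assumed strategy): succeed only if the fork's output is
# byte-identical to the expected output.
verify_output() {
  local expected="$1" actual="$2"
  cmp -s "$expected" "$actual"
}

# Step d: print one Markdown table row. The column layout is a placeholder.
leaderboard_line() {
  local fork="$1" trimmed_mean="$2"
  printf '| %s | %s |\n' "$trimmed_mean" "$fork"
}

# Demo with throwaway files standing in for out_expected.txt and a fork's output.
expected_file=$(mktemp)
actual_file=$(mktemp)
echo "{Abha=5.0/18.0/27.4}" > "$expected_file"
echo "{Abha=5.0/18.0/27.4}" > "$actual_file"

if verify_output "$expected_file" "$actual_file"; then
  leaderboard_line "spullara" "0.528"
else
  echo "Verification failed for spullara" >&2
fi
rm -f "$expected_file" "$actual_file"
```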

Let me know if I missed anything 😄

gunnarmorling · 2024-01-09T16:50:35Z

Yepp, that sounds exactly right 👍 .

hundredwatt · 2024-01-09T17:16:43Z

Ugh, hyperfine --output <FILE> overwrites FILE on each run... it doesn't append.

For now we'll only check the output of 1 run unless we can find a workaround

gunnarmorling · 2024-01-09T17:38:15Z

> For now we'll only check the output of 1 run unless we can find a workaround

So if we run this script multiple times for multiple contenders, will we check the output of the last run for all contenders (ok)? Or just the last run of the last contender (bad)?

gunnarmorling · 2024-01-09T17:49:02Z

I've created #266 as a follow-up to this one, re-organizing the existing launch scripts to adhere to the structure established here.

hundredwatt · 2024-01-09T17:52:24Z

> So if we run this script multiple times for multiple contenders, will we check the output of the last run for all contenders (ok)? Or just the last run of the last contender (bad)?

The former, we're ok 👍

hundredwatt · 2024-01-09T17:55:45Z

I pushed all the changes from process_output.java!

New look:

Happy Path

Verification failed

Going to do another round of testing on a Fedora box too

hundredwatt · 2024-01-09T17:56:49Z

For fun, the Leaderboard text now scans ./prepare_$fork.sh and extracts the Java version from sdk use if present 😄
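One way such an extraction could look, as a rough sketch: it assumes the prepare script contains a line of the form `sdk use java <version>`, and the function name and the sample version identifier are made up for illustration. The code in evaluate2.sh may differ.

```shell
#!/bin/bash
# Hypothetical helper: print the version from the first `sdk use java <version>`
# line of a prepare script, or nothing if there is no such line.
java_version_from_prepare() {
  sed -n 's/^[[:space:]]*sdk use java[[:space:]]\{1,\}\([^[:space:]]\{1,\}\).*/\1/p' "$1" | head -n 1
}

# Demo with a throwaway file standing in for ./prepare_$fork.sh.
prepare_script=$(mktemp)
cat > "$prepare_script" <<'EOF'
#!/bin/bash
source "$HOME/.sdkman/bin/sdkman-init.sh"
sdk use java 21.0.1-graal 1>&2
EOF

java_version_from_prepare "$prepare_script"
# prints 21.0.1-graal
rm -f "$prepare_script"
```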

hundredwatt · 2024-01-09T18:01:01Z

Oops, forgot the SMT / turbo stuff

hundredwatt

@gunnarmorling I think it's ready 😅

New look:

Happy Path

Verification failed

hundredwatt · 2024-01-09T18:23:17Z

evaluate2.sh

+check_command_installed hyperfine
+check_command_installed jq
+
+# Check if SMT is enabled (we want it disabled)
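The bodies of these checks are not shown in this thread. A minimal sketch of what they might look like, assuming `command -v` for the tool check and the Linux sysfs file /sys/devices/system/cpu/smt/active for the SMT check (the smt_enabled name and the overridable path are hypothetical):

```shell
#!/bin/bash
# Hypothetical reconstruction: report a missing tool on stderr and return
# non-zero so the caller can decide whether to abort.
check_command_installed() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "Error: $1 is not installed" >&2
    return 1
  fi
}

# Assumption: on Linux, this sysfs file contains "1" when SMT is active.
# The path can be overridden, which also makes the check easy to test.
smt_enabled() {
  local smt_file="${1:-/sys/devices/system/cpu/smt/active}"
  [ -r "$smt_file" ] && [ "$(cat "$smt_file")" = "1" ]
}

for cmd in hyperfine jq; do
  check_command_installed "$cmd" || echo "(install $cmd before benchmarking)" >&2
done

if smt_enabled; then
  echo "Warning: SMT is enabled" >&2
fi
```

On systems without that sysfs file (for example macOS), smt_enabled simply reports false, so the warning is skipped rather than failing the script.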


What the warnings look like:

gunnarmorling · 2024-01-09T19:50:14Z

That's awesome, really great stuff, @hundredwatt! I'm gonna squash everything into one commit and merge it. We can do any necessary fine-tuning in follow-up PRs. Thanks a lot for pulling through with this one!

gunnarmorling · 2024-01-09T19:51:44Z

> For fun, the Leaderboard text now scans ./prepare_$fork.sh and extracts the Java version from sdk use if present 😄

Wanted to suggest exactly that, but you beat me to it. Seems we can (re-)generate the leaderboard fully automatically with this change. Thanks again!

hundredwatt · 2024-01-09T20:38:26Z

My pleasure, happy to help out! Thanks for putting together such a fun, challenging and educational contest @gunnarmorling! Hopefully I’ll find some time to submit my own entry once we run out of infra things to improve 😂


AlexanderYastrebov · 2024-01-10T17:47:31Z

evaluate2.sh

+  then
+    echo "Usage: evaluate2.sh <fork name> (<fork name 2> ...)"
+    echo " for each fork, there must be a 'prepare_<fork name>.sh' script and a 'calculate_average_<fork name>.sh' script"
+    echo " there may be an 'additional_build_steps_<fork name>.sh' script too"


Why do we need additional_build_steps_*.sh? I think its contents could live in prepare_*.sh.

The two scripts run one after another, and only a single fork uses the extra script:

https://github.com/gunnarmorling/1brc/blob/main/additional_build_steps_thomaswue.sh

https://github.com/gunnarmorling/1brc/blob/main/prepare_thomaswue.sh

I don't see why we cannot mv additional_build_steps_thomaswue.sh prepare_thomaswue.sh and drop support for additional_build_steps_*.sh.

hundredwatt mentioned this pull request Jan 6, 2024

Suggestion: Use hyperfine instead of time to perform timing measurements #105

Closed

hundredwatt force-pushed the evaluate-hyperfine branch 3 times, most recently from de445e7 to 363803b Compare January 7, 2024 02:51

hundredwatt added 3 commits January 8, 2024 16:48

create new version of evaluate.sh using hyperfine + jq

f2d85cc

output the raw times for each command

8691782

nit: s/command/fork/

facf3c1

hundredwatt force-pushed the evaluate-hyperfine branch from 363803b to 8a636e8 Compare January 8, 2024 23:54

hundredwatt force-pushed the evaluate-hyperfine branch from 8a636e8 to 5264787 Compare January 8, 2024 23:58

update evaluate2.sh for new fork file structure

15acae6

hundredwatt force-pushed the evaluate-hyperfine branch from 5264787 to 15acae6 Compare January 9, 2024 00:05

AlexanderYastrebov reviewed Jan 9, 2024

AlexanderYastrebov reviewed Jan 9, 2024

gunnarmorling reviewed Jan 9, 2024

gunnarmorling reviewed Jan 9, 2024

review changes

5d6d347

gunnarmorling reviewed Jan 9, 2024

hundredwatt added 2 commits January 9, 2024 09:24

use numactl on linux

1582b3e

1 warmup

8543b7a

gunnarmorling reviewed Jan 9, 2024

AlexanderYastrebov mentioned this pull request Jan 9, 2024

Added to stable comparison lehuyduc/1brc-simd#1

Open

gunnarmorling mentioned this pull request Jan 9, 2024

Hyperfine: Script re-org #266

Merged

verify output

d647b41

hundredwatt force-pushed the evaluate-hyperfine branch from ea63161 to a492c37 Compare January 9, 2024 17:52

leaderboard

1170ae2

hundredwatt force-pushed the evaluate-hyperfine branch from a492c37 to 1170ae2 Compare January 9, 2024 17:54

hundredwatt added 3 commits January 9, 2024 11:09

do not early exit on hyperfine error

e06b5a3

check if SMT and turbo boost are disabled

ad43a5d

fix bug

131b4fe

gunnarmorling merged commit 42e5ca1 into gunnarmorling:main Jan 9, 2024

hundredwatt deleted the evaluate-hyperfine branch January 9, 2024 21:12

AlexanderYastrebov reviewed Jan 10, 2024
