Use CSV output format #14

AlexanderYastrebov · 2024-01-02T18:38:25Z

Use proper CSV (or semicolon-separated for that matter) output format

Addis Ababa;33.0;33.0;33.0
Aden;31.1;31.1;31.1
...

instead of weird not-a-json oneliner output.

This would simplify result comparison (diff) between multiple implementations.
This would also enable parsing of results: think of importing them into a database or, e.g., a database-based implementation.


gunnarmorling · 2024-01-02T19:51:46Z

I agree that this would have been the better way. I'm hesitating though to change the format at this point, to keep different implementations comparable.

lluismf · 2024-01-02T21:33:15Z

The weird not-a-json oneliner is just the map serialized. Writing a CSV output would be extra cost.

AlexanderYastrebov · 2024-01-02T21:52:58Z

@lluismf You are right, serializing result in a portable format is a game-changer and will severely affect processing performance of 10^9 input rows 👍

gunnarmorling · 2024-01-03T10:14:40Z

Hey, let's keep it friendly :)

I agree that a different output format wouldn't make any difference perf-wise in the grand scheme of things and as said, it would have been the better choice. But I don't think it's that much of an issue to justify changing this while the challenge is running.

So I'd keep this as-is for the time being, and consider it a lesson learned for whenever another challenge of this kind is happening. Thanks all!

It is useful to debug differences between implementations, e.g.:

```sh
$ while ./create_measurements.sh 1000 && diff <(./calculate_average_royvanrijn.sh 2>/dev/null | ./tocsv.sh) <(./calculate_average.sh 2>/dev/null | ./tocsv.sh) ; do echo OK; done
Created file with 1,000 measurements in 50 ms
60c60
< Bucharest;-2.9;2.9;6.1
---
> Bucharest;-2.9;2.8;6.1
265c265
< Petropavlovsk-Kamchatsky;0.9;9.3;17.7
---
> Petropavlovsk-Kamchatsky;0.9;9.2;17.7
```

For gunnarmorling#14
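For readers who have not seen the script, a conversion of this kind could look roughly like the sketch below. It is not the script merged in #36: the function name `tocsv` is illustrative, and it assumes the challenge output is a single line of the form `{Name=min/mean/max, Name=min/mean/max}` (the serialized map mentioned earlier in the thread). Entry boundaries are taken to be a digit followed by `, `, so that a comma inside a station name is left alone; that is an assumption about the data, not a rule stated in this thread.

```sh
# Hypothetical stand-in for a tocsv-style filter (not the script from #36).
# Assumed input:  {Abha=5.0/18.0/27.4, Abidjan=15.7/26.0/34.1}
# Output rows:    name;min;mean;max
tocsv() {
  awk '{
    gsub(/^[{]|[}]$/, "")          # drop the surrounding braces
    gsub(/[0-9], /, "&\n")         # mark entry boundaries: a digit followed by ", "
    gsub(/, \n/, "\n")             # turn each marked boundary into a newline
    n = split($0, entries, "\n")
    for (i = 1; i <= n; i++) {
      eq = match(entries[i], /=[^=]*$/)   # the last "=" separates name from values
      name = substr(entries[i], 1, eq - 1)
      values = substr(entries[i], eq + 1)
      gsub("/", ";", values)       # min/mean/max -> min;mean;max
      print name ";" values
    }
  }'
}
```

Piping an implementation's output through such a filter gives one station per line, which is what lets plain `diff` point at the exact station that differs.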


As an exercise I've re-implemented the idea from gunnarmorling#375 to compare result numbers with a tolerance:

```
$ ./test.sh baseline 0 src/test/resources/samples/measurements-rounding-precise.txt
Validating calculate_average_baseline.sh -- src/test/resources/samples/measurements-rounding-precise.txt
Rounding=14.6/25.5/33.6 != Rounding=14.6/25.4/33.6 (avg)
```

The hassle of parsing highlights the importance of using a machine-readable format, as suggested in gunnarmorling#14.

**But** I think this approach is wrong: the rules should define the rounding mode, and the baseline should be fixed to produce the correct result. Submissions should be fixed accordingly or qualified as non-passing.

Could have been done earlier, but better late than never.

Updates gunnarmorling#49
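To make the tolerance idea from the note above concrete, here is a minimal sketch of such a check. It is not the code from #375 or #390: the helper name `within_tolerance` is hypothetical, it assumes both results have already been converted to `name;min;mean;max` rows, and it uses a plain absolute tolerance with a small epsilon to absorb floating-point noise.

```sh
# Hypothetical helper: compare two "name;min;mean;max" rows.
# Exit status 0 when the names match and every value differs by at most
# the given absolute tolerance; exit status 1 otherwise.
within_tolerance() {
  awk -v a="$1" -v b="$2" -v tol="$3" 'BEGIN {
    na = split(a, x, ";")
    nb = split(b, y, ";")
    if (na != 4 || nb != 4 || x[1] != y[1]) exit 1
    for (i = 2; i <= 4; i++) {
      d = x[i] - y[i]
      if (d < 0) d = -d
      if (d > tol + 1e-9) exit 1   # epsilon absorbs binary floating-point noise
    }
    exit 0
  }'
}

# Example call, using the rows from the failing check quoted above:
# within_tolerance 'Rounding;14.6;25.5;33.6' 'Rounding;14.6;25.4;33.6' 0.1 && echo OK
```

A tolerance of `0` makes this an exact comparison, so it only hides rounding differences when the caller explicitly asks for that.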

AlexanderYastrebov mentioned this issue Jan 3, 2024

memory mapped files, branchless parsing, bitwiddle magic #5

Merged

gunnarmorling closed this as not planned Jan 3, 2024

AlexanderYastrebov mentioned this issue Jan 3, 2024

Add a script to transform output into CSV format #36

Merged

AlexanderYastrebov mentioned this issue Jan 14, 2024

Compare outputs with tolerance #375

Closed

AlexanderYastrebov mentioned this issue Jan 14, 2024

Compare results with tolerance #390

Closed
