-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use CSV output format #14
Comments
I agree that this would have been the better way. I'm hesitating though to change the format at this point, to keep different implementations comparable. |
The weird not-a-json oneliner is just the map serialized. Writing a CSV output would be extra cost. |
@lluismf You are right, serializing result in a portable format is a game-changer and will severely affect processing performance of 10^9 input rows 👍 |
Hey, let's keep it friendly :) I agree that a different output format wouldn't make any difference perf-wise in the grand scheme of things and as said, it would have been the better choice. But I don't think it's that much of an issue to justify changing this while the challenge is running. So I'd keep this as-is for the time being, and consider it a lesson learned for whenever another challenge of this kind is happening. Thanks all! |
It is useful to debug differences between implemenattions, e.g.: ```sh $ while ./create_measurements.sh 1000 && diff <(./calculate_average_royvanrijn.sh 2>/dev/null | ./tocsv.sh) <(./calculate_average.sh 2>/dev/null | ./tocsv.sh) ; do echo OK; done Created file with 1,000 measurements in 50 ms 60c60 < Bucharest;-2.9;2.9;6.1 --- > Bucharest;-2.9;2.8;6.1 265c265 < Petropavlovsk-Kamchatsky;0.9;9.3;17.7 --- > Petropavlovsk-Kamchatsky;0.9;9.2;17.7 ``` For gunnarmorling#14
It is useful to debug differences between implementations, e.g.: ```sh $ while ./create_measurements.sh 1000 && diff <(./calculate_average_royvanrijn.sh 2>/dev/null | ./tocsv.sh) <(./calculate_average.sh 2>/dev/null | ./tocsv.sh) ; do echo OK; done Created file with 1,000 measurements in 50 ms 60c60 < Bucharest;-2.9;2.9;6.1 --- > Bucharest;-2.9;2.8;6.1 265c265 < Petropavlovsk-Kamchatsky;0.9;9.3;17.7 --- > Petropavlovsk-Kamchatsky;0.9;9.2;17.7 ``` For gunnarmorling#14
It is useful to debug differences between implementations, e.g.: ```sh $ while ./create_measurements.sh 1000 && diff <(./calculate_average_royvanrijn.sh 2>/dev/null | ./tocsv.sh) <(./calculate_average.sh 2>/dev/null | ./tocsv.sh) ; do echo OK; done Created file with 1,000 measurements in 50 ms 60c60 < Bucharest;-2.9;2.9;6.1 --- > Bucharest;-2.9;2.8;6.1 265c265 < Petropavlovsk-Kamchatsky;0.9;9.3;17.7 --- > Petropavlovsk-Kamchatsky;0.9;9.2;17.7 ``` For #14
As an exercise I've re-implemented gunnarmorling#375 idea to compare result numbers with a tolerance: ``` $ ./test.sh baseline 0 src/test/resources/samples/measurements-rounding-precise.txt Validating calculate_average_baseline.sh -- src/test/resources/samples/measurements-rounding-precise.txt Rounding=14.6/25.5/33.6 != Rounding=14.6/25.4/33.6 (avg) ``` The hassle of parsing highlights the importance of using machine-readable format as suggested in gunnarmorling#14 **But** I think this approach is wrong, the rules should define rounding mode and baseline should be fixed to produce the correct result. Submissions should be fixed accordingly or qualified as non-passing. Could have been done earlier but better late than never. Updates gunnarmorling#49
Use proper CSV (or semicolon-separated for that matter) output format
instead of weird not-a-json oneliner output.
This would simplify result comparison (diff) between multiple implementations.
This would also enable parsing of results - think of import into database or e.g. database-based implementation.
The text was updated successfully, but these errors were encountered: