nastra commented Nov 2, 2021

Release Notes: https://arrow.apache.org/release/6.0.0.html
Diff: apache/arrow@apache-arrow-5.0.0...apache-arrow-6.0.0

Benchmark results on bump-arrow branch

Benchmark run: https://github.com/nastra/iceberg/actions/runs/1411344807

Benchmark                                                                 Mode  Cnt   Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   2.487 ± 0.265   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   2.574 ± 0.254   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  15.033 ± 1.561   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5  13.783 ± 1.112   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   4.095 ± 0.364   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   5.119 ± 0.480   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   4.769 ± 0.499   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   4.785 ± 0.707   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   3.925 ± 0.496   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   4.282 ± 0.275   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   3.907 ± 0.371   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   4.449 ± 0.534   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   7.066 ± 0.239   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5   8.048 ± 0.640   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   3.192 ± 0.341   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   3.056 ± 0.370   s/op
Benchmark                                                                                  Mode  Cnt   Score   Error  Units
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   9.094 ± 0.473   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   7.241 ± 0.351   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  28.554 ± 1.062   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5  23.239 ± 0.564   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   7.397 ± 1.015   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   7.755 ± 0.586   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   7.173 ± 0.628   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   6.653 ± 0.340   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   7.023 ± 0.156   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   7.504 ± 0.347   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   7.180 ± 0.862   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   7.435 ± 0.597   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   9.306 ± 0.437   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5  13.432 ± 0.449   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   7.675 ± 0.359   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   6.577 ± 0.846   s/op

Benchmark results on master

Benchmark run: https://github.com/nastra/iceberg/actions/runs/1411692594

Benchmark                                                                 Mode  Cnt   Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   2.112 ± 0.090   s/op
VectorizedReadFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   2.678 ± 0.085   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  10.964 ± 0.829   s/op
VectorizedReadFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5  10.354 ± 0.603   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   4.094 ± 0.342   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   3.429 ± 0.156   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   4.255 ± 0.048   s/op
VectorizedReadFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   4.213 ± 0.244   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   4.296 ± 0.189   s/op
VectorizedReadFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   3.870 ± 0.230   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   3.634 ± 0.189   s/op
VectorizedReadFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   3.974 ± 0.192   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5   7.427 ± 0.142   s/op
VectorizedReadFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5   6.882 ± 0.269   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   2.206 ± 0.050   s/op
VectorizedReadFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   2.617 ± 0.068   s/op
Benchmark                                                                                  Mode  Cnt   Score   Error  Units
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesIcebergVectorized5k         ss    5   9.003 ± 0.272   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDatesSparkVectorized5k           ss    5   7.500 ± 0.348   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  28.049 ± 1.615   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDecimalsSparkVectorized5k        ss    5  24.715 ± 0.405   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   7.244 ± 0.600   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readDoublesSparkVectorized5k         ss    5   6.949 ± 0.670   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsIcebergVectorized5k        ss    5   9.109 ± 0.452   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readFloatsSparkVectorized5k          ss    5   7.872 ± 0.517   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersIcebergVectorized5k      ss    5   8.483 ± 0.293   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readIntegersSparkVectorized5k        ss    5   6.782 ± 0.219   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsIcebergVectorized5k         ss    5   7.572 ± 0.210   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readLongsSparkVectorized5k           ss    5   6.340 ± 0.911   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsIcebergVectorized5k       ss    5  10.578 ± 0.629   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readStringsSparkVectorized5k         ss    5  14.291 ± 0.865   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsIcebergVectorized5k    ss    5   8.953 ± 0.883   s/op
VectorizedReadDictionaryEncodedFlatParquetDataBenchmark.readTimestampsSparkVectorized5k      ss    5   7.689 ± 0.638   s/op
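
For anyone who wants to compare the two runs row by row, here is a minimal sketch that parses JMH text output like the tables above and reports the branch/master score ratio per benchmark. The names `parse_jmh` and `compare` are hypothetical helpers, not part of JMH or Iceberg, and the "intervals overlap" check is only a rough heuristic based on the reported Error column, not a proper significance test. The sample rows are copied from the tables above.

```python
import re

# One JMH result row: name, mode, count, score, "±", error, units.
# Header rows ("Benchmark  Mode  Cnt ...") do not match and are skipped.
ROW = re.compile(r"^(\S+)\s+(\w+)\s+(\d+)\s+([\d.]+)\s+±\s+([\d.]+)\s+(\S+)$")


def parse_jmh(text):
    """Return {benchmark_name: (score, error)} for each result row in text."""
    results = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            results[m.group(1)] = (float(m.group(4)), float(m.group(5)))
    return results


def compare(branch, master):
    """Yield (name, branch/master ratio, intervals_overlap) for shared benchmarks."""
    for name in sorted(branch.keys() & master.keys()):
        b, b_err = branch[name]
        m, m_err = master[name]
        # Heuristic: treat the runs as indistinguishable when score ± error overlap.
        overlap = abs(b - m) <= b_err + m_err
        yield name, b / m, overlap


# Two rows from each run, copied from the results above.
BRANCH = """\
Benchmark                                                                 Mode  Cnt   Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  15.033 ± 1.561   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   4.095 ± 0.364   s/op
"""
MASTER = """\
Benchmark                                                                 Mode  Cnt   Score   Error  Units
VectorizedReadFlatParquetDataBenchmark.readDecimalsIcebergVectorized5k      ss    5  10.964 ± 0.829   s/op
VectorizedReadFlatParquetDataBenchmark.readDoublesIcebergVectorized5k       ss    5   4.094 ± 0.342   s/op
"""

if __name__ == "__main__":
    for name, ratio, overlap in compare(parse_jmh(BRANCH), parse_jmh(MASTER)):
        note = "within error" if overlap else "outside error"
        print(f"{name}: branch/master = {ratio:.2f} ({note})")
```

Since the scores are in s/op, a ratio above 1 means the bump-arrow branch was slower for that benchmark. With only 5 single-shot iterations per benchmark on a shared CI runner, small differences are likely noise.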

rdblue commented Nov 7, 2021

Thanks for the update, @nastra!
