Add block flattening #12792
Putting this up for early review. CC: @yingsu00, who has a dependency on this work.
Later commits add I've opened #12838 which will allow us to remove this check within Presto. |
mbasmanova
left a comment
@tdcmeehan Tim, thanks for taking the time to explore ways to improve performance without sacrificing long term maintainability of the code. I have some questions about BlockLease.
- It appears that flatten is copying the data even for "flat" blocks. Is this intentional? Why not return the internal array as is? It doesn't look like the caller can write into these arrays.
- I'm thinking about a lease as something that expires on its own. However, one needs to explicitly call BlockLease#close() (perhaps via try-with-resources) for the lease to "expire". Furthermore, "ClosingBlockLease#get()" continues to "work" even after the lease is closed. Perhaps add some more comments to flatten or BlockLease to clarify how BlockLease is supposed to be used, e.g. that it must be closed explicitly when not needed anymore.
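The lease lifecycle raised in this comment can be illustrated with a minimal sketch. Everything here is hypothetical: `PoolingIntArrayAllocator` and `IntArrayLease` are illustrative names, not the classes in this PR, and the sketch deliberately makes `get()` fail after `close()`, which is one possible answer to the concern above rather than the behavior of `ClosingBlockLease`.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical pool of int arrays; illustrative only, not Presto's ArrayAllocator.
final class PoolingIntArrayAllocator {
    private final ArrayDeque<int[]> free = new ArrayDeque<>();
    int allocations; // counts how many fresh arrays were created

    int[] borrow(int minLength) {
        int[] candidate = free.poll();
        if (candidate != null && candidate.length >= minLength) {
            return candidate; // reuse a previously returned array
        }
        // A too-small pooled array is simply dropped in this sketch.
        allocations++;
        return new int[minLength];
    }

    void giveBack(int[] array) {
        free.push(array);
    }
}

// Hypothetical lease: get() exposes the array, close() hands it back to the pool.
final class IntArrayLease implements Supplier<int[]>, AutoCloseable {
    private final PoolingIntArrayAllocator allocator;
    private final int[] array;
    private boolean closed;

    IntArrayLease(PoolingIntArrayAllocator allocator, int length) {
        this.allocator = allocator;
        this.array = allocator.borrow(length);
    }

    @Override
    public int[] get() {
        // Assumption of this sketch: reading after close is treated as a bug.
        if (closed) {
            throw new IllegalStateException("lease is closed");
        }
        return array;
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            allocator.giveBack(array);
        }
    }
}
```

With try-with-resources, `try (IntArrayLease lease = new IntArrayLease(allocator, 8)) { int[] ids = lease.get(); }` returns the array to the pool when the block exits, so the next lease of the same size reuses it instead of allocating.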
This has gone through some substantial changes.
@mbasmanova rebased
mbasmanova
left a comment
@tdcmeehan Add UncheckedBlock commit looks good % a question about DictionaryBlock below.
mbasmanova
left a comment
@tdcmeehan Some questions about BlockFlattener.
@mbasmanova thank you for the review, comments addressed
mbasmanova
left a comment
@tdcmeehan Great work!
I think this review should have at least one other approval. Any takers?
wenleix
left a comment
"Add UncheckedBlock"
LGTM % nits. Maybe also copy some of the comments from the UncheckedBlock class into the commit message to explain the motivation? :)
"Add BlockFlattener"
Looks good. Most comments are nits except #12792 (comment). Let's talk quickly in person; I might be missing something :)
nit: The current use cases all need only one Closer. Why is it a ... ?
Also, why not take a List<Closer> as input?
Large gains are shown for deeply nested dictionaries. There is a slight regression for singly or non-nested structures.

Benchmark (blockSize) (nestedLevel) (numberOfIterations) (reuseArrays) Mode Cnt Score Error Units
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 1 false avgt 30 0.403 ± 0.003 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 1 true avgt 30 0.417 ± 0.010 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 10 false avgt 30 4.065 ± 0.055 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 10 true avgt 30 4.015 ± 0.064 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 100 false avgt 30 39.671 ± 0.844 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 100 true avgt 30 39.678 ± 0.651 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 1000 false avgt 30 399.884 ± 5.295 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 1 1000 true avgt 30 399.918 ± 6.009 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 1 false avgt 30 0.836 ± 0.020 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 1 true avgt 30 0.819 ± 0.008 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 10 false avgt 30 8.177 ± 0.223 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 10 true avgt 30 8.214 ± 0.236 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 100 false avgt 30 84.060 ± 2.272 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 100 true avgt 30 80.598 ± 0.503 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 1000 false avgt 30 839.950 ± 23.786 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 2 1000 true avgt 30 810.260 ± 5.665 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 1 false avgt 30 1.852 ± 0.027 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 1 true avgt 30 1.663 ± 0.044 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 10 false avgt 30 18.403 ± 1.197 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 10 true avgt 30 16.235 ± 0.142 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 100 false avgt 30 182.208 ± 6.084 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 100 true avgt 30 163.570 ± 4.110 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 1000 false avgt 30 1814.377 ± 30.751 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 3 1000 true avgt 30 1656.627 ± 59.170 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 1 false avgt 30 2.875 ± 0.102 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 1 true avgt 30 2.881 ± 0.030 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 10 false avgt 30 28.070 ± 0.339 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 10 true avgt 30 29.420 ± 1.174 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 100 false avgt 30 282.978 ± 5.391 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 100 true avgt 30 287.730 ± 5.303 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 1000 false avgt 30 2918.489 ± 105.837 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 4 1000 true avgt 30 2869.990 ± 30.173 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 1 false avgt 30 4.031 ± 0.121 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 1 true avgt 30 4.015 ± 0.102 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 10 false avgt 30 38.307 ± 1.109 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 10 true avgt 30 39.059 ± 0.656 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 100 false avgt 30 387.400 ± 12.490 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 100 true avgt 30 392.756 ± 8.925 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 1000 false avgt 30 3855.306 ± 128.000 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000 5 1000 true avgt 30 3989.636 ± 116.716 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 1 false avgt 30 3.768 ± 0.057 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 1 true avgt 30 3.751 ± 0.049 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 10 false avgt 30 38.366 ± 1.013 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 10 true avgt 30 37.335 ± 0.777 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 100 false avgt 30 276.369 ± 2.593 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 100 true avgt 30 276.398 ± 5.349 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 1000 false avgt 30 2769.720 ± 64.044 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 1 1000 true avgt 30 2767.396 ± 43.910 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 1 false avgt 30 7.729 ± 0.227 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 1 true avgt 30 7.756 ± 0.218 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 10 false avgt 30 78.450 ± 1.952 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 10 true avgt 30 77.823 ± 1.939 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 100 false avgt 30 766.068 ± 17.384 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 100 true avgt 30 768.810 ± 14.559 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 1000 false avgt 30 7540.602 ± 45.159 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 2 1000 true avgt 30 7631.610 ± 165.614 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 1 false avgt 30 17.238 ± 0.540 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 1 true avgt 30 13.192 ± 1.073 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 10 false avgt 30 186.183 ± 6.667 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 10 true avgt 30 124.057 ± 3.779 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 100 false avgt 30 1869.975 ± 76.606 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 100 true avgt 30 1226.599 ± 29.137 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 1000 false avgt 30 18821.154 ± 506.016 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 3 1000 true avgt 30 12317.860 ± 391.818 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 1 false avgt 30 26.196 ± 0.476 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 1 true avgt 30 17.378 ± 0.732 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 10 false avgt 30 271.300 ± 8.528 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 10 true avgt 30 177.687 ± 5.405 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 100 false avgt 30 2636.603 ± 87.603 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 100 true avgt 30 1717.128 ± 17.499 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 1000 false avgt 30 24616.602 ± 1691.644 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 4 1000 true avgt 30 17400.347 ± 447.709 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 1 false avgt 30 35.381 ± 0.595 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 1 true avgt 30 23.430 ± 0.898 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 10 false avgt 30 359.595 ± 15.979 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 10 true avgt 30 297.779 ± 37.398 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 100 false avgt 30 3570.755 ± 56.462 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 100 true avgt 30 2990.641 ± 365.138 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 1000 false avgt 30 33452.285 ± 1609.984 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 10000 5 1000 true avgt 30 33487.113 ± 1950.377 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 1 false avgt 30 37.182 ± 0.546 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 1 true avgt 30 36.911 ± 0.219 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 10 false avgt 30 275.444 ± 5.738 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 10 true avgt 30 276.957 ± 7.651 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 100 false avgt 30 2734.018 ± 22.684 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 100 true avgt 30 2864.532 ± 106.213 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 1000 false avgt 30 29273.519 ± 760.995 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 1 1000 true avgt 30 28482.173 ± 617.647 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 1 false avgt 30 76.496 ± 2.953 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 1 true avgt 30 78.476 ± 2.771 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 10 false avgt 30 755.749 ± 4.794 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 10 true avgt 30 763.532 ± 25.618 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 100 false avgt 30 7626.038 ± 211.631 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 100 true avgt 30 7771.801 ± 187.217 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 1000 false avgt 30 77617.182 ± 2523.580 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 2 1000 true avgt 30 74860.365 ± 1372.668 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 1 false avgt 30 149.574 ± 7.353 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 1 true avgt 30 138.826 ± 5.115 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 10 false avgt 30 1437.860 ± 39.946 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 10 true avgt 30 1366.731 ± 29.757 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 100 false avgt 30 15030.690 ± 633.808 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 100 true avgt 30 13766.177 ± 305.617 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 1000 false avgt 30 151408.815 ± 6173.241 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 3 1000 true avgt 30 135822.840 ± 2826.602 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 1 false avgt 30 226.332 ± 9.970 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 1 true avgt 30 179.636 ± 3.842 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 10 false avgt 30 2259.908 ± 88.110 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 10 true avgt 30 1820.587 ± 55.840 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 100 false avgt 30 22380.453 ± 801.705 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 100 true avgt 30 18153.183 ± 694.295 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 1000 false avgt 30 229260.524 ± 6595.133 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 4 1000 true avgt 30 176891.133 ± 3295.354 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 1 false avgt 30 306.627 ± 13.364 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 1 true avgt 30 235.543 ± 3.115 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 10 false avgt 30 3045.404 ± 159.998 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 10 true avgt 30 2374.803 ± 62.193 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 100 false avgt 30 30360.151 ± 1590.780 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 100 true avgt 30 23584.320 ± 654.287 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 1000 false avgt 30 304459.760 ± 8288.211 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 100000 5 1000 true avgt 30 234184.751 ± 4853.699 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 1 false avgt 30 269.851 ± 2.849 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 1 true avgt 30 277.722 ± 10.083 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 10 false avgt 30 2720.098 ± 59.047 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 10 true avgt 30 2715.279 ± 41.711 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 100 false avgt 30 27702.283 ± 862.337 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 100 true avgt 30 27552.290 ± 767.291 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 1000 false avgt 30 272543.370 ± 6539.963 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 1 1000 true avgt 30 274771.545 ± 7305.585 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 1 false avgt 30 773.679 ± 27.841 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 1 true avgt 30 800.950 ± 36.993 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 10 false avgt 30 7642.878 ± 86.400 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 10 true avgt 30 7749.828 ± 239.371 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 100 false avgt 30 77072.284 ± 2584.736 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 100 true avgt 30 78794.303 ± 2343.048 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 1000 false avgt 30 757348.524 ± 7009.957 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 2 1000 true avgt 30 771420.224 ± 18225.074 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 1 false avgt 30 1825.609 ± 23.650 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 1 true avgt 30 1361.094 ± 49.297 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 10 false avgt 30 18481.728 ± 691.964 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 10 true avgt 30 14063.485 ± 595.612 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 100 false avgt 30 189929.949 ± 6328.067 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 100 true avgt 30 135608.117 ± 3899.327 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 1000 false avgt 30 1882899.844 ± 33034.454 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 3 1000 true avgt 30 1351588.956 ± 34878.905 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 1 false avgt 30 2891.851 ± 93.932 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 1 true avgt 30 2045.241 ± 108.149 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 10 false avgt 30 28336.851 ± 215.016 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 10 true avgt 30 20259.337 ± 659.981 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 100 false avgt 30 288446.453 ± 8957.803 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 100 true avgt 30 197057.674 ± 2924.530 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 1000 false avgt 30 2925820.257 ± 48452.527 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 4 1000 true avgt 30 2014881.700 ± 40542.938 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 1 false avgt 30 3841.607 ± 41.968 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 1 true avgt 30 2544.018 ± 90.077 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 10 false avgt 30 39724.433 ± 1294.749 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 10 true avgt 30 25400.889 ± 880.818 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 100 false avgt 30 399275.509 ± 9798.709 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 100 true avgt 30 264695.903 ± 10576.511 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 1000 false avgt 30 3945066.705 ± 61561.876 us/op
BenchmarkBlockFlattener.benchmarkWithFlatten 1000000 5 1000 true avgt 30 2523678.085 ± 49393.545 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 1 1 N/A avgt 30 0.268 ± 0.003 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 1 10 N/A avgt 30 2.642 ± 0.030 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 1 100 N/A avgt 30 26.310 ± 0.216 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 1 1000 N/A avgt 30 262.681 ± 2.939 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 2 1 N/A avgt 30 0.601 ± 0.015 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 2 10 N/A avgt 30 6.021 ± 0.183 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 2 100 N/A avgt 30 59.380 ± 0.883 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 2 1000 N/A avgt 30 604.837 ± 17.722 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 3 1 N/A avgt 30 3.245 ± 0.069 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 3 10 N/A avgt 30 31.864 ± 0.795 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 3 100 N/A avgt 30 301.078 ± 6.522 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 3 1000 N/A avgt 30 3032.573 ± 87.627 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 4 1 N/A avgt 30 6.538 ± 0.090 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 4 10 N/A avgt 30 65.491 ± 1.767 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 4 100 N/A avgt 30 632.193 ± 14.034 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 4 1000 N/A avgt 30 6466.004 ± 215.846 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 5 1 N/A avgt 30 8.250 ± 0.519 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 5 10 N/A avgt 30 82.602 ± 4.585 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 5 100 N/A avgt 30 788.154 ± 14.086 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000 5 1000 N/A avgt 30 8003.193 ± 158.694 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 1 1 N/A avgt 30 2.661 ± 0.058 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 1 10 N/A avgt 30 26.426 ± 0.323 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 1 100 N/A avgt 30 270.514 ± 6.150 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 1 1000 N/A avgt 30 2678.311 ± 89.922 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 2 1 N/A avgt 30 5.982 ± 0.133 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 2 10 N/A avgt 30 59.149 ± 0.509 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 2 100 N/A avgt 30 605.708 ± 16.240 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 2 1000 N/A avgt 30 6045.093 ± 176.226 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 3 1 N/A avgt 30 31.965 ± 0.441 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 3 10 N/A avgt 30 303.013 ± 9.062 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 3 100 N/A avgt 30 3024.951 ± 141.624 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 3 1000 N/A avgt 30 29896.855 ± 302.799 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 4 1 N/A avgt 30 64.143 ± 1.225 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 4 10 N/A avgt 30 632.337 ± 6.325 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 4 100 N/A avgt 30 6296.469 ± 56.195 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 4 1000 N/A avgt 30 63579.125 ± 1643.570 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 5 1 N/A avgt 30 79.397 ± 3.233 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 5 10 N/A avgt 30 798.502 ± 17.367 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 5 100 N/A avgt 30 7981.766 ± 191.042 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 10000 5 1000 N/A avgt 30 78684.362 ± 1221.119 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 1 1 N/A avgt 30 26.362 ± 0.418 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 1 10 N/A avgt 30 268.844 ± 7.798 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 1 100 N/A avgt 30 2633.897 ± 24.388 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 1 1000 N/A avgt 30 26612.388 ± 571.446 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 2 1 N/A avgt 30 59.603 ± 0.518 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 2 10 N/A avgt 30 598.655 ± 11.785 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 2 100 N/A avgt 30 6143.158 ± 271.417 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 2 1000 N/A avgt 30 59277.248 ± 688.251 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 3 1 N/A avgt 30 301.184 ± 5.788 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 3 10 N/A avgt 30 3044.561 ± 78.880 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 3 100 N/A avgt 30 30375.165 ± 594.934 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 3 1000 N/A avgt 30 303305.588 ± 5500.633 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 4 1 N/A avgt 30 631.763 ± 4.854 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 4 10 N/A avgt 30 6432.887 ± 239.080 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 4 100 N/A avgt 30 63122.986 ± 375.513 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 4 1000 N/A avgt 30 639784.956 ± 18510.250 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 5 1 N/A avgt 30 793.572 ± 16.623 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 5 10 N/A avgt 30 7936.717 ± 108.213 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 5 100 N/A avgt 30 79989.752 ± 2030.961 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 100000 5 1000 N/A avgt 30 789914.243 ± 17234.871 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 1 1 N/A avgt 30 264.279 ± 3.356 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 1 10 N/A avgt 30 2647.274 ± 31.918 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 1 100 N/A avgt 30 27008.186 ± 938.436 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 1 1000 N/A avgt 30 265239.263 ± 3311.397 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 2 1 N/A avgt 30 622.770 ± 8.581 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 2 10 N/A avgt 30 6266.896 ± 223.059 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 2 100 N/A avgt 30 62229.261 ± 2154.334 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 2 1000 N/A avgt 30 615650.663 ± 17433.933 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 3 1 N/A avgt 30 3013.966 ± 33.382 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 3 10 N/A avgt 30 30639.834 ± 958.698 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 3 100 N/A avgt 30 303535.879 ± 7303.708 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 3 1000 N/A avgt 30 3044234.511 ± 35832.633 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 4 1 N/A avgt 30 6407.718 ± 216.946 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 4 10 N/A avgt 30 63180.529 ± 737.827 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 4 100 N/A avgt 30 637949.563 ± 18867.336 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 4 1000 N/A avgt 30 6436638.885 ± 137358.865 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 5 1 N/A avgt 30 8066.633 ± 303.192 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 5 10 N/A avgt 30 79048.261 ± 910.061 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 5 100 N/A avgt 30 789426.155 ± 13378.167 us/op
BenchmarkBlockFlattener.benchmarkWithoutFlatten 1000000 5 1000 N/A avgt 30 8214757.779 ± 149511.952 us/op
Test failure is unrelated
wenleix
left a comment
Looks good. Thanks for the great work!
Will merge tomorrow unless I hear more feedback. Thanks all for the thorough reviews!
High level goal
Joins, repartitioning, and other future enhancements need data access with minimal overhead. Experiments have shown nontrivial overhead from accessing deeply nested blocks and from boundary checks that the JVM does not elide. We want "fast track" data access which trades safety for speed, with the understanding that this API may require more care and testing.
New components
BlockFlattener
Returns a Block structure that is nested at most one level deep. For the time being, only Dictionary and RLE blocks are considered nested structures. Flattening promises better efficiency when operating in tight loops, due to a lower cache miss rate.
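As a rough illustration of what flattening nested dictionaries amounts to, the sketch below composes several levels of dictionary ids into a single id array. It is a simplified stand-in that uses plain `int[]` arrays; `FlattenSketch` and `flattenIds` are hypothetical names, not the BlockFlattener API, and RLE blocks are not covered.

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for nested dictionary blocks: each level is an
// int[] of ids that index into the next level; the last level indexes into the values.
final class FlattenSketch {
    // Composes idLevels[0], idLevels[1], ... into one id array that indexes the values directly.
    static int[] flattenIds(int[][] idLevels) {
        int[] flat = Arrays.copyOf(idLevels[0], idLevels[0].length);
        for (int level = 1; level < idLevels.length; level++) {
            int[] next = idLevels[level];
            for (int i = 0; i < flat.length; i++) {
                flat[i] = next[flat[i]];
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        long[] values = {10L, 20L, 30L};
        int[] outer = {2, 0, 1, 1}; // outer dictionary ids, index into inner
        int[] inner = {1, 2, 0};    // inner dictionary ids, index into values
        int[] flat = flattenIds(new int[][] {outer, inner});
        // Reading through flat needs one level of indirection instead of two.
        for (int id : flat) {
            System.out.println(values[id]);
        }
    }
}
```

After this one-time rewrite, every read in a tight loop performs a single lookup, regardless of how many dictionary levels the input had.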
BlockLease
Repeated array allocation has nontrivial overhead, particularly when combined with block decoding/flattening, because rewriting the nested ids map requires allocating an array each time. BlockLease is introduced as an AutoCloseable resource that is designed to return arrays for later reuse when finished, and as a Supplier<Block> to access the Block from within the try-with-resources.
ArrayAllocator
A simple interface for creating and returning primitive arrays. It may be used with a BlockLease to return an array for reuse when the lease has expired. It is designed to live at the operator level, so that arrays are reused across repeated calls to an operator instead of being allocated each time.
UncheckedBlock
Accessing primitive values through blocks incurs nontrivial overhead from boundary checks which are not elided by the JVM. Access over primitive arrays proves to have the best performance; for example, the JVM will elide most boundary checks in loops that run to the end of the array. We would like a compromise where we get the performance of raw primitive arrays but with the encapsulation and convenience of Blocks.
The idea of UncheckedBlock is to hoist out boundary checks, so we pay the cost once per loop rather than once per item. To accomplish this, we expose the offset of the block in UncheckedBlock, which is where the block's data begins in the underlying array and therefore where iteration starts. With the offset exposed, we expose getXXXUnchecked methods that provide quick access strictly limited to underlying array access. These methods will be inlined by the JVM into direct array access, giving us a good compromise on performance, with the slight inconvenience of a downcast to UncheckedBlock.
When complete, any block which is not RLE or Dictionary could be cast to UncheckedBlock, where a parallel API could be used to provide fast track access to the data in the Blocks.
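The hoisting idea can be sketched with a small, hypothetical class. `LongArraySlice` and its method names are assumptions made for illustration; they are not the UncheckedBlock interface from this PR. The checked accessor validates every position, while the unchecked accessor takes an internal array position and relies on the caller to compute the loop bounds once from the exposed offset.

```java
// Hypothetical slice over a long[]; illustrative only, not Presto's UncheckedBlock.
final class LongArraySlice {
    private final long[] values;
    private final int offset;
    private final int positionCount;

    LongArraySlice(long[] values, int offset, int positionCount) {
        if (offset < 0 || positionCount < 0 || positionCount > values.length - offset) {
            throw new IndexOutOfBoundsException("invalid offset or positionCount");
        }
        this.values = values;
        this.offset = offset;
        this.positionCount = positionCount;
    }

    // Checked access: pays a block-level bounds check on every call.
    long getLong(int position) {
        if (position < 0 || position >= positionCount) {
            throw new IndexOutOfBoundsException("position " + position);
        }
        return values[position + offset];
    }

    int getOffset() {
        return offset;
    }

    int getPositionCount() {
        return positionCount;
    }

    // Unchecked access: internalPosition already includes the offset, and no
    // block-level check is made. Java's own array bounds check still applies.
    long getLongUnchecked(int internalPosition) {
        return values[internalPosition];
    }

    // The loop bounds are computed once, outside the loop body.
    static long sum(LongArraySlice block) {
        int start = block.getOffset();
        int end = start + block.getPositionCount();
        long total = 0;
        for (int i = start; i < end; i++) {
            total += block.getLongUnchecked(i);
        }
        return total;
    }
}
```

The trade-off is the one described above: a caller that passes a position outside `[offset, offset + positionCount)` to the unchecked accessor can silently read a neighboring slice's data, so this style needs more care and testing than the checked API.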
Decisions made