Specializing Functional Interfaces#11302
Specializing Functional Interfaces#11302ameyaKetkar wants to merge 794 commits intoprestodb:masterfrom ameyaKetkar:master
Conversation
|
Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed. If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks! |
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
|
Hello, |
|
I was able to understand what you are doing here after reading the actual code changes. However, I would hope that the commit message be a bit more informative. Can you improve it please? We would also like to see the changes to Block interface be in a separate commit. @electrum and I believe that the changes otherwise look reasonable. |
|
What's the purpose of this change? I find that it reduces readability in many cases. A casual reader can easily understand what |
|
we use |
|
@haozhun - I have split the commit into 2, with one dealing with |
|
@martint Your example is comparing two different interfaces (function vs consumer). These would be equivalents:
|
Do you have some particular numbers in mind? Is it 10%, 1%, 0.1%? |
|
David, I know. I wasn’t comparing those two interfaces directly, but the general shape and familiarity of their names (just picked the first ones I ran into as I skimmed the code) |
It only exists for CURRENT_PATH and should not be exposed to plugins.
Add separate final task info listener to ensure correct final info visibility.
This reverts commit 353ecee, which containes a temporary work around for a leak caused when the final query info event is not fired for failed queries.
The presto-jdbc artifact shades the resulting jar and creates a new source jar. There is a bug in maven (MSHADE-195 and MSHADE-259) that causes this source artifact to be deployed twice. This is not a problem for SNAPSHOT or releases to a staging repo but it will fail for releases to a regular repo where a non-SNAPSHOT version can not be overwritten. This fixes the deployment to only deploy the source repository once. With this fix, a deployment e.g. to an internal repo succeeds.
If a query fails before execution is created, catalog events may not be processed and are not added to the query events.
Presto builds hashtable for MapBlocks eagerly when constructing the MapBlock even it's not needed in the query. Building a hashtable could take up to 40% CPU of the scan cost on a map column. This commit defers the hashtable build to the time it's needed in SeekKey(). Note that we only do this to the MapBlock, not the MapBlockBuilder to avoid complex synchronization problems. The MapBlockBuilder will always build the hashtable. As the result MergingPageOutput and PartitionOutputOperator will still rebuild the hashtables when needed. The measurements shows there will be less than 10% pages for MergingPageOutput to build the hashtables. We will have a seperate PR to improve PartitionOutput and avoid rebuilding the pages so as to avoid hashtable rebuilding. Simple select checsum queries show over 40% CPU gain: Test | After | Before | Improvement select 2 map columns checksum | 11.69d | 20.06d | 42% Select 1 map column checksum | 9.67d | 17.73d | 45%
Building hashtable in map column uses 30% of CPU out of the scan cost on that column, and Math.floorMod() uses 60% CPU out of building hashtable. We tested 5 custom range reduction functions and chose computePositionWithBitShifting(), which is one of the top 3 implementations in performance and does not require the hash table size to be power of 2. It can reduce the uniform distributed 64-bit hashcode to [0, hashTableSize) in uniform distribution. After the change buildHashTable improved 31% and the total CPU for the test select map query improved 14%. The following table shows the CPU percentage for computePosition, getHashPosition and buildHashTable and the total CPU hours for a select checksum query on a map column. computePosition% getHashPosition% buildHashTable% Total CPU Hours computePositionWithMod 16.62 21.89 28.04 14.37 computePositionWithMask 0.51 9.53 14.44 12.24 computePositionWithBitShifting 4.61 11.66 18.97 12.32 computePositionWithFloorMod 16.22 22.03 27.33 14.21 The JMH benchmark results on JDK 10: Benchmark Mode Cnt Score Error Units BenchmarkComputePosition.computePositionWithBitShifting avgt 30 2.954 ± 0.122 ns/op BenchmarkComputePosition.computePositionWithDivision avgt 30 2.852 ± 0.145 ns/op BenchmarkComputePosition.computePositionWithFloorMod avgt 30 11.212 ± 0.215 ns/op BenchmarkComputePosition.computePositionWithMask avgt 30 2.455 ± 0.054 ns/op BenchmarkComputePosition.computePositionWithMod avgt 30 4.490 ± 0.113 ns/op The performance on JDK 8 and 11 are similar for all methods.
This reverts commit e7b4b41.
Currently stats and costs are pre-computed for every query. There are still some bugs in stats calculator that are being triggered in very rare cases that are incredibly hard to catch. Given that the stats calculator has to be run for every query, it is better to be fail safe. As a failure in stats calculation must not result in failing query.
Additionally, make glue tests command follow the convention of such commands in .travis.yml file.
These tests used to not pass on travis due to low amount of (memory) resources. Now it looks they are fine.
It does not influence testing automation, because maven is running in parallel already. However, it speeds up vastly manual generation of documentation with: `make clean html`
The column shows the coordinator ID. This is not useful. Also, in the table that lists queries, it can be mistaken as a node which executes the query. When we support multiple coordinator, we can add the column back, perhaps under some more meaningful name.
This also fixes a query failure during the new year period caused by the time zone in TestingSession (used by Presto) being different from the JVM time zone (used by H2).
Predicates can be pushed below windows if the predicate is constant for all rows within each partition of the windowing operation. The most tractable case where this is true is if the predicate is a deterministic expression where all the input symbols are also partition symbols.
…nct_aggregation is turned on.
When running queries with presto, if the underlying table is of type varchar, the predicate gets pushed. If it's of type char, it will NOT get pushed, which causes presto to get the whole table, and then filter locally, which is a big pain (and bad performance). Using Presto PR #4842 as a guide, I've added support for char types to also be passed down. Unit tests also updated for the type following the pattern. Also integration tested against a mysql-like backend.
Turning dictionary into direct in ORC writer can end up with writing huge data. eb6028d tries to fix this by checking the written size after converting each row group. However, in certain cases a single row group already exceed 2G after converting into direct. For example, a varchar column repeated with a single value with length around 500k. This commit fixed this by checking the written size after each batch of write.
This change adds the documentation for the distributed memory-related config options to the properties docs, and adds a new separate section for the memory management related properties.
StageStateMachine getStageInfo is never called with child stages.
This reverts 1) commit ad05dcb. 2) commit 23de11f. PR #11791 (commit 23de11f and ad05dcb), which lazily builds the hashtables for maps, introduced a regression for the case where the MapBlock is created through AbstractMapBlock.getRegion(). The hashtables built on the MapBlock region were not updated in the original MapBlock, thus causing hashtables repeatedly being built on the same base MapBlock.
Previously, all tasks are proactively created when schedule is blocked on SPLIT_QUEUES_FULL to prevent potential deadlock in the presence of broadcast join. Such deadlock usually happens with SPLIT_QUEUES_FULL. However, it could also happen with NO_ACTIVE_DRIVER_GROUP, when the probe side of broadcast join enables grouped execution, and partitions on the probe side table are small. So lifespan can finish split schedule without blocked by SPLIT_QUEUES_FULL. This is a regression introduced by 55c1001. Prior to this commit, all tasks are created unconditionally. The fix is to finish task creation when grouped execution is enabled, since all tasks are created at beginning anyway in this case. (assume the number of lifespans is larger than the task pool size)
|
This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions! |
This PR is generated by a tool I am working on. I tested the tool upon this project, and found these results. It specializes functional interfaces for example
Function<Double,Double>toDoubleUnaryOperatororConsumer<Object,Long>toObjLongConsumer.This change eliminates auto-boxing overhead, improving performance and type safety.
Thank you for making your source-code available on GitHub. :)