
Specializing Functional Interfaces #11302

Closed
ameyaKetkar wants to merge 794 commits into prestodb:master from ameyaKetkar:master

Conversation

@ameyaKetkar

@ameyaKetkar ameyaKetkar commented Aug 19, 2018

This PR is generated by a tool I am working on. I tested the tool on this project and found these results. It specializes functional interfaces, for example Function<Double, Double> to DoubleUnaryOperator, or BiConsumer<Object, Long> to ObjLongConsumer<Object>.
This change eliminates auto-boxing overhead, improving performance and type safety.
Thank you for making your source code available on GitHub. :)
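To illustrate the boxing overhead this kind of specialization removes, here is a minimal, self-contained sketch; the class and method names are illustrative and not part of this PR:

```java
import java.util.function.DoubleUnaryOperator;
import java.util.function.Function;

public class SpecializationExample
{
    // Generic version: every call boxes the argument and the result
    static double applyBoxed(Function<Double, Double> f, double x)
    {
        return f.apply(x);
    }

    // Specialized version: operates on primitive doubles directly, no boxing
    static double applyPrimitive(DoubleUnaryOperator f, double x)
    {
        return f.applyAsDouble(x);
    }

    public static void main(String[] args)
    {
        System.out.println(applyBoxed(v -> v * 2, 21.0));     // 42.0, with boxing
        System.out.println(applyPrimitive(v -> v * 2, 21.0)); // 42.0, without boxing
    }
}
```

Both lambdas look identical at the call site; the difference is that the specialized interface avoids allocating Double wrappers on every invocation.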

@facebook-github-bot
Collaborator

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please sign up at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need the corporate CLA signed.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@facebook-github-bot
Collaborator

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@ameyaKetkar
Author

Hello,
I have signed the CLA.
Could anyone take a look at it?

@haozhun
Contributor

haozhun commented Aug 31, 2018

I was able to understand what you are doing here after reading the actual code changes. However, I would hope that the commit message be a bit more informative. Can you improve it please?

We would also like to see the changes to Block interface be in a separate commit.

@electrum and I believe that the changes otherwise look reasonable.

@martint
Contributor

martint commented Aug 31, 2018

What's the purpose of this change? I find that it reduces readability in many cases. A casual reader can easily understand what Function<A, B> means, but ObjLongConsumer is not very intuitive. So, unless there is a distinct performance or type-safety advantage I would prefer we don't go down this path.

@findepi
Contributor

findepi commented Sep 1, 2018

we use OptionalInt in the code and it's not uncommon for it to hurt (eg. "uh, where is my optional.map(..) method?")
therefore I second @martint's opinion.

@ameyaKetkar
Author

ameyaKetkar commented Sep 5, 2018

@haozhun - I have split the commit in two, with one dealing with the Block interface only. I have also updated the commit message.
@martint - Yes, readability might be a concern for some developers. But this change also eliminates auto-boxing, giving performance, memory, and type-safety benefits. In my experience pushing similar changes to many other GitHub projects, like Cassandra and CoreNLP, I have not observed readability being raised as a concern.
@findepi - I agree that the specialized Optional API (OptionalInt) does not ship with some important functionality, and hence is not a complete replacement for Optional<Integer>. But a developer can always switch back to using Optional<Integer> when required. (The tool I am developing can help with this switching :) )
The specialized functional interfaces (such as DoubleUnaryOperator) are much better in this respect: the majority of them can completely replace their generic counterparts, giving performance benefits with no loss of functionality.

@electrum
Contributor

electrum commented Sep 5, 2018

@martint Your example is comparing two different interfaces (function vs consumer). These would be equivalents:

  • Function<Long, T> to LongFunction<T>
  • BiConsumer<T, Long> to ObjLongConsumer<T>
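A small sketch contrasting these boxed and specialized pairs; the class and method names here are illustrative, not from the PR:

```java
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.LongFunction;
import java.util.function.ObjLongConsumer;

public class EquivalentsExample
{
    // Boxed form: the long argument is wrapped in a Long on every call
    static String describeBoxed(Function<Long, String> f, long value)
    {
        return f.apply(value); // value is autoboxed here
    }

    // Specialized form: the primitive long is passed through directly
    static String describeSpecialized(LongFunction<String> f, long value)
    {
        return f.apply(value);
    }

    // The same contrast for the consumer pair
    static void consumeBoxed(BiConsumer<StringBuilder, Long> c, StringBuilder sb, long value)
    {
        c.accept(sb, value); // value is autoboxed here
    }

    static void consumeSpecialized(ObjLongConsumer<StringBuilder> c, StringBuilder sb, long value)
    {
        c.accept(sb, value);
    }
}
```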

@findepi
Contributor

findepi commented Sep 6, 2018

Majority of them can completely replace their generic counterparts, giving performance benefits for no loss in functionality.

Do you have some particular numbers in mind? Is it 10%, 1%, 0.1%?

@martint
Contributor

martint commented Sep 6, 2018

David, I know. I wasn’t comparing those two interfaces directly, but the general shape and familiarity of their names (just picked the first ones I ran into as I skimmed the code)

electrum and others added 19 commits November 6, 2018 14:41
It only exists for CURRENT_PATH and should not be exposed to plugins.
Add separate final task info listener to ensure correct final info visibility.
This reverts commit 353ecee, which
contains a temporary workaround for a leak caused when the final
query info event is not fired for failed queries.
The presto-jdbc artifact shades the resulting jar and creates a new
source jar. There is a bug in Maven (MSHADE-195 and MSHADE-259) that
causes this source artifact to be deployed twice. This is not a problem
for SNAPSHOT or releases to a staging repo, but it will fail for releases
to a regular repo where a non-SNAPSHOT version cannot be overwritten.

This fixes the deployment to deploy the source artifact only
once. With this fix, a deployment e.g. to an internal repo succeeds.
If a query fails before execution is created, catalog events may not be
processed and are not added to the query events.
Presto builds the hashtable for MapBlocks eagerly when constructing the
MapBlock, even if it's not needed in the query. Building the hashtable can
take up to 40% of the scan CPU cost on a map column. This commit defers
the hashtable build until it's needed in seekKey(). Note that we
only do this for the MapBlock, not the MapBlockBuilder, to avoid complex
synchronization problems. The MapBlockBuilder will always build the
hashtable. As a result, MergingPageOutput and PartitionOutputOperator
will still rebuild the hashtables when needed. Measurements show that
fewer than 10% of pages will require MergingPageOutput to build the
hashtables. We will have a separate PR to improve PartitionOutput
and avoid rebuilding the pages, so as to avoid hashtable rebuilding.

Simple select checksum queries show over 40% CPU gain:
Test                          | After  | Before | Improvement
select 2 map columns checksum | 11.69d | 20.06d | 42%
Select 1 map column checksum  |  9.67d | 17.73d | 45%
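The deferral described above can be sketched roughly as follows; LazyMapBlock, seekKey, and the linear-probing scheme are illustrative stand-ins for Presto's actual MapBlock code:

```java
import java.util.Arrays;

// Hypothetical sketch of deferring hashtable construction until first lookup.
public class LazyMapBlock
{
    private final long[] keys;
    private int[] hashTable; // stays null until the first seekKey call

    public LazyMapBlock(long[] keys)
    {
        this.keys = keys;
    }

    // Build the table only when a lookup actually needs it
    private void ensureHashTableLoaded()
    {
        if (hashTable != null) {
            return;
        }
        int size = keys.length * 2; // load factor 0.5
        int[] table = new int[size];
        Arrays.fill(table, -1);
        for (int i = 0; i < keys.length; i++) {
            int position = Math.floorMod(Long.hashCode(keys[i]), size);
            while (table[position] != -1) {
                position = (position + 1) % size; // linear probing
            }
            table[position] = i;
        }
        hashTable = table;
    }

    // Returns the position of the key in the map, or -1 if it is absent
    public int seekKey(long key)
    {
        ensureHashTableLoaded();
        int size = hashTable.length;
        int position = Math.floorMod(Long.hashCode(key), size);
        while (hashTable[position] != -1) {
            if (keys[hashTable[position]] == key) {
                return hashTable[position];
            }
            position = (position + 1) % size;
        }
        return -1;
    }
}
```

A scan that never calls seekKey never pays the construction cost, which is the point of the commit.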
Building the hashtable for a map column uses 30% of the scan CPU cost
on that column, and Math.floorMod() uses 60% of the CPU spent building the
hashtable. We tested 5 custom range-reduction functions and chose
computePositionWithBitShifting(), which is among the top 3
implementations in performance and does not require the hash table size
to be a power of 2. It reduces a uniformly distributed 64-bit hashcode
to [0, hashTableSize), preserving the uniform distribution. After the change,
buildHashTable improved by 31% and the total CPU for the test select map
query improved by 14%.

The following table shows the CPU percentage for computePosition,
getHashPosition and buildHashTable and the total CPU hours for a
select checksum query on a map column.

Method                          computePosition%  getHashPosition%  buildHashTable%  Total CPU Hours
computePositionWithMod                     16.62             21.89            28.04            14.37
computePositionWithMask                     0.51              9.53            14.44            12.24
computePositionWithBitShifting              4.61             11.66            18.97            12.32
computePositionWithFloorMod                16.22             22.03            27.33            14.21

The JMH benchmark results on JDK 10:
Benchmark                                                Mode  Cnt   Score   Error  Units
BenchmarkComputePosition.computePositionWithBitShifting  avgt   30   2.954 ± 0.122  ns/op
BenchmarkComputePosition.computePositionWithDivision     avgt   30   2.852 ± 0.145  ns/op
BenchmarkComputePosition.computePositionWithFloorMod     avgt   30  11.212 ± 0.215  ns/op
BenchmarkComputePosition.computePositionWithMask         avgt   30   2.455 ± 0.054  ns/op
BenchmarkComputePosition.computePositionWithMod          avgt   30   4.490 ± 0.113  ns/op

The performance on JDK 8 and 11 is similar for all methods.
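As a hedged sketch, the bit-shifting reduction below uses the common multiply-and-shift technique (sometimes called "fastrange"); this is an assumed implementation of computePositionWithBitShifting and may differ in detail from the actual commit:

```java
// Hedged sketch: multiply-and-shift range reduction, shown against the
// Math.floorMod baseline that the commit replaces.
public class ComputePosition
{
    // Maps a 64-bit hash uniformly into [0, hashTableSize) without a modulo,
    // and without requiring hashTableSize to be a power of two.
    static int computePositionWithBitShifting(long rawHash, int hashTableSize)
    {
        long hash32 = rawHash >>> 32; // treat the upper 32 bits as an unsigned value
        return (int) ((hash32 * hashTableSize) >>> 32);
    }

    // The slower floorMod baseline from the measurements above
    static int computePositionWithFloorMod(long rawHash, int hashTableSize)
    {
        return Math.floorMod(Long.hashCode(rawHash), hashTableSize);
    }
}
```

The multiply-and-shift form replaces an integer division with a multiplication, which is why it is so much cheaper than floorMod in the JMH numbers above.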
Currently, stats and costs are pre-computed for every query.
There are still some bugs in the stats calculator that are
triggered in very rare cases and are incredibly hard to catch.
Given that the stats calculator has to run for every query, it
is better to be fail-safe: a failure in stats calculation must
not result in a failed query.
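The fail-safe idea can be sketched as follows; Stats, UNKNOWN, and calculateSafely are hypothetical stand-ins, not Presto's actual stats-calculator classes:

```java
import java.util.function.Supplier;

// Hedged sketch: a stats-calculation failure falls back to "unknown" stats
// instead of propagating the exception and failing the query.
public class SafeStats
{
    static final class Stats
    {
        final double rowCount;

        Stats(double rowCount)
        {
            this.rowCount = rowCount;
        }
    }

    // Sentinel returned when stats cannot be computed
    static final Stats UNKNOWN = new Stats(Double.NaN);

    static Stats calculateSafely(Supplier<Stats> calculator)
    {
        try {
            return calculator.get();
        }
        catch (RuntimeException e) {
            return UNKNOWN; // swallow the failure; the query proceeds without stats
        }
    }
}
```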
Additionally, make the glue tests command follow the convention of such
commands in the .travis.yml file.
kokosing and others added 26 commits January 4, 2019 11:25
These tests used to fail on Travis due to the low amount of (memory)
resources. Now it looks like they are fine.
This does not affect testing automation, because Maven already runs
in parallel. However, it vastly speeds up manual
generation of documentation with: `make clean html`
The column shows the coordinator ID. This is not useful. Also, in the
table that lists queries, it can be mistaken for the node which executes
the query.

When we support multiple coordinators, we can add the column back,
perhaps under a more meaningful name.
This also fixes a query failure during the new year period caused by
the time zone in TestingSession (used by Presto) being different from
the JVM time zone (used by H2).
Predicates can be pushed below windows if the predicate is constant for all rows
within each partition of the windowing operation. The most tractable case where
this is true is if the predicate is a deterministic expression where all the
input symbols are also partition symbols.
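The condition described above can be sketched as a simple predicate check; the method name and the symbol representation are illustrative, not the actual optimizer code:

```java
import java.util.Set;

// Hedged sketch of the window pushdown condition.
public class WindowPushdown
{
    // A deterministic predicate is constant within each window partition when
    // every symbol it references is also a partitioning symbol, so it can be
    // evaluated below the window operator without changing results.
    static boolean canPushDownBelowWindow(Set<String> predicateSymbols, Set<String> partitionSymbols, boolean deterministic)
    {
        return deterministic && partitionSymbols.containsAll(predicateSymbols);
    }
}
```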
When running queries with Presto, if the underlying
table column is of type varchar, the predicate gets pushed down.

If it's of type char, it will NOT get pushed down, which
causes Presto to fetch the whole table and then filter
locally, which is a big pain (and bad for performance).

Using Presto PR #4842 as a guide, I've added support
for char types to also be pushed down. Unit tests are
also updated for the type, following the pattern.
Also integration-tested against a MySQL-like backend.
Turning a dictionary into direct encoding in the ORC writer can end up
writing huge amounts of data.

eb6028d tries to fix this by
checking the written size after converting each row group. However,
in certain cases a single row group already exceeds 2G after converting
to direct. For example, a varchar column repeated with a single
value of length around 500k.

This commit fixes this by checking the written size after each batch of
writes.
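A rough sketch of the per-batch size check, with hypothetical names and a simplified byte-count model; this is not the actual ORC writer code:

```java
// Hedged sketch: checking the written size after every batch (rather than
// only after each row group) lets the writer abort the dictionary-to-direct
// conversion before a single huge row group blows past the limit.
public class DirectConversion
{
    static final long MAX_DIRECT_BYTES = 2L * 1024 * 1024 * 1024; // the 2G limit

    // Returns true if the conversion stays under the cap, false if it is
    // abandoned right after the batch that exceeds it.
    static boolean convertToDirect(long[] batchSizes)
    {
        long written = 0;
        for (long batchSize : batchSizes) {
            written += batchSize;
            if (written > MAX_DIRECT_BYTES) {
                return false; // abort after this batch, mid row group if needed
            }
        }
        return true;
    }
}
```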
This change adds the documentation for the distributed memory-related
config options to the properties docs, and adds a new separate section
for the memory management related properties.
StageStateMachine getStageInfo is never called with child stages.
This reverts
1) commit ad05dcb.
2) commit 23de11f.

PR #11791 (commits 23de11f and ad05dcb), which lazily builds the
hashtables for maps, introduced a regression for the case where the
MapBlock is created through AbstractMapBlock.getRegion(). The hashtables
built on the MapBlock region were not updated in the original MapBlock,
thus causing hashtables to be repeatedly built on the same base MapBlock.
Previously, all tasks were proactively created when scheduling
was blocked on SPLIT_QUEUES_FULL, to prevent a potential deadlock in the
presence of a broadcast join.

Such a deadlock usually happens with SPLIT_QUEUES_FULL.
However, it can also happen with NO_ACTIVE_DRIVER_GROUP,
when the probe side of a broadcast join enables grouped execution
and the partitions of the probe-side table are small,
so a lifespan can finish split scheduling without being blocked by
SPLIT_QUEUES_FULL.

This is a regression introduced by 55c1001.
Prior to that commit, all tasks were created unconditionally.

The fix is to finish task creation when grouped execution is enabled,
since all tasks are created at the beginning anyway in this case
(assuming the number of lifespans is larger than the task pool size).
@stale

stale bot commented Jul 23, 2019

This pull request has been automatically marked as stale because it has not had recent activity. If you'd still like this PR merged, please comment on the task, make sure you've addressed reviewer comments, and rebase on the latest master. Thank you for your contributions!

@stale stale bot added the stale label Jul 23, 2019
@stale stale bot closed this Jul 30, 2019
