[native pos] Remove busy-waiting-loop to reduce the CPU usage #19926
miaoever merged 1 commit into prestodb:master from
Conversation
0df17ed to 4202de7
mbasmanova
left a comment
@miaoever Thanks.
@shrinidhijoshi Would you take a look as well?
...k-base/src/test/java/com/facebook/presto/spark/execution/http/TestPrestoSparkHttpClient.java
shrinidhijoshi
left a comment
LGTM! Thanks for investigating this @miaoever .
I wonder if we can simplify this code by writing 2 different iterators: 1. for the shuffle map task and 2. for the result task.
Reason being:
- Behavior of the shuffle map task is fundamentally different and is more straightforward (wait for the CompletableFuture returned by NativeExecutionTask). Most of the stages in all of our prod workloads would use the shuffle-map iterator, as that is the 99% use case.
- I am concerned the multiple wait()/notifyAll() pattern here might be hiding behaviors that are even harder to spot than the current busy-wait bug.
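The shuffle-map simplification suggested above would reduce to a single blocking wait on the task's future. A minimal sketch of that idea, assuming a task that completes a CompletableFuture when it finishes (the future here is a hypothetical stand-in, not the actual NativeExecutionTask API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class ShuffleMapWaitSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for the CompletableFuture returned by the native task.
        CompletableFuture<String> taskDone = CompletableFuture.supplyAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(50); // simulate native task work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "FINISHED";
        });

        // get() parks the calling thread until completion; there is no polling loop.
        String state = taskDone.get(5, TimeUnit.SECONDS);
        System.out.println("task state: " + state);
    }
}
```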
Thanks for the reviews @shrinidhijoshi.
Technically, waiting on a CompletableFuture (CompletableFuture::get()) is also a form of busy-waiting, which is not a good idea to have on the main thread.
On the other hand, for our scenario of asynchronous communication between different threads/processes, I believe wait()/notifyAll() are the right Java primitives to use if we can't use higher-level built-in constructs (e.g. BlockingQueue).
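As a sketch of the higher-level alternative mentioned here, a BlockingQueue lets a producer thread hand results to the consuming thread with no explicit wait()/notifyAll(): both sides simply park on the queue. The names and the sentinel value below are illustrative, not Presto's API:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BlockingQueueHandoff {
    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: the producer blocks when full, the consumer blocks when empty.
        BlockingQueue<String> output = new ArrayBlockingQueue<>(16);
        final String POISON = "__END__"; // sentinel signalling "task finished"

        Thread producer = new Thread(() -> {
            try {
                output.put("page-1");
                output.put("page-2");
                output.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // take() parks the consumer until an element is available: no busy loop.
        String item;
        while (!(item = output.take()).equals(POISON)) {
            System.out.println("consumed " + item);
        }
        producer.join();
    }
}
```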
4202de7 to c122f4f
@miaoever has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

The busy-waiting loop in PrestoSparkNativeTaskExecutorFactory runs on the Spark executor main thread during task execution and consumed up to 40% of the Spark executor JVM CPU in our local profiling. To reduce the CPU usage, we make the task main thread wait on two signals (task finished and output ready) rather than busy-wait. We also removed an unused field in the same class (functionAndTypeManager). As a result, in our test run in the cluster, we're seeing the same query use 30% less CPU time in total query CPU usage.
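The two-signal scheme described above can be sketched with a guarded wait()/notifyAll(): the main thread parks until either flag is set, instead of spinning in a loop. This is a simplified model of the pattern, not the actual PrestoSparkNativeTaskExecutorFactory code:

```java
public class TwoSignalWait {
    private final Object lock = new Object();
    private boolean taskFinished;
    private boolean outputReady;

    // Called by native-execution callback threads when the task completes.
    void signalTaskFinished() {
        synchronized (lock) {
            taskFinished = true;
            lock.notifyAll();
        }
    }

    // Called by native-execution callback threads when output pages arrive.
    void signalOutputReady() {
        synchronized (lock) {
            outputReady = true;
            lock.notifyAll();
        }
    }

    // Called on the executor main thread: parks instead of spinning.
    void awaitTaskOrOutput() throws InterruptedException {
        synchronized (lock) {
            while (!taskFinished && !outputReady) {
                lock.wait(); // releases the lock and sleeps until notified
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TwoSignalWait w = new TwoSignalWait();
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(50); // simulate native task producing output
            } catch (InterruptedException ignored) {
            }
            w.signalOutputReady();
        });
        worker.start();
        w.awaitTaskOrOutput(); // blocks with zero CPU spin until a signal fires
        System.out.println("woke up: outputReady=" + w.outputReady);
        worker.join();
    }
}
```

The guard loop around wait() is the standard defense against spurious wakeups; notifyAll() is used because two distinct conditions share one monitor.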
c122f4f to ab78ff6