[native pos] Change from Operator to Task Level Adapter#19799
[native pos] Change from Operator to Task Level Adapter#19799shrinidhijoshi merged 6 commits intoprestodb:masterfrom
Conversation
32877cc to
7736270
Compare
bf47c9d to
ae965eb
Compare
044f699 to
095314c
Compare
mbasmanova
left a comment
There was a problem hiding this comment.
@shrinidhijoshi Love this change. Skimmed the first commit and make some minor comments. Overall, I find this change well written with nice documentation and easy to follow code. Thank you for taking the time to put this in a such beautiful shape.
There was a problem hiding this comment.
What is this change about? Seems unrelated.
There was a problem hiding this comment.
While this class was being used by NativeExecutionOperator, it was assumed that getTaskInfo will always be called while task is running.
The way the new PrestoSparkNativeTaskOutputIterator is implemented, that is not the case. We stop() the task first and then do a final query to get TaskInfo for statistics (check completeTask function), in which case we need to check if there was an exception before throwing.
I can change that implementation to extract TaskInfo before stopping, but it seemed like improving this check was a better approach.
Not leaning strongly either way, let me know if you have a preference
There was a problem hiding this comment.
Should this not include a port?
There was a problem hiding this comment.
The port is selected dynamically by the NativeExecutionProcess/DetachedNativeExecutionProcess so we don't need to specify it here.
presto-spark-base/src/test/java/com/facebook/presto/spark/PrestoSparkQueryRunner.java
Outdated
Show resolved
Hide resolved
.../src/main/java/com/facebook/presto/spark/execution/PrestoSparkNativeTaskExecutorFactory.java
Outdated
Show resolved
Hide resolved
.../src/main/java/com/facebook/presto/spark/execution/PrestoSparkNativeTaskExecutorFactory.java
Outdated
Show resolved
Hide resolved
.../src/main/java/com/facebook/presto/spark/execution/PrestoSparkNativeTaskExecutorFactory.java
Outdated
Show resolved
Hide resolved
.../src/main/java/com/facebook/presto/spark/execution/PrestoSparkNativeTaskExecutorFactory.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
This is quite unfortunate. But, why do we need every executor to log plan fragment? Shouldn't we have driver log these once?
There was a problem hiding this comment.
PlanFragment log in executor aids debugging to easily spot which fragment is being executed. Also, given internally at Meta, executors logs always contains logs of ALL the tasks run on that executor v/s just that task, it is hard to visually draw boundaries on these logs without plan fragment logging.
.../src/main/java/com/facebook/presto/spark/execution/PrestoSparkNativeTaskExecutorFactory.java
Outdated
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/BasePlanFragmenter.java
Outdated
Show resolved
Hide resolved
mbasmanova
left a comment
There was a problem hiding this comment.
@shrinidhijoshi Very happy to see this change. Let's get this in.
14fe76c to
5003d8d
Compare
7e23ec8 to
8227516
Compare
Add a new class PrestoSparkNativeTaskExecutorFactory which is responsible for executing native task remotely. It contains logic to start a native task, monitor for results and handle any exceptions that occur during task execution This is a singleton class that is selectively invoked only while executing a PrestoSparkNativeTaskRDD
Currently, for native execution, we re-write the plan fragment to add a NativeExecutionNode as the root to encapsulate the execution of native task. We do not need this anymore as the newly added PrestoSparkNativeTaskExecutorFactory will transparently forward the PlanFragment as is to CPP process
…xecutorFactory Current integration of metrics for presto-spark-native into spark metrics relies on extracting Operator level info from TaskInfo objects from Native Process. After the introduction of PrestoSparkNativeTaskExecutorFactory, we no longer use NativeExecutionOperator. So instead we can directly extract metrics from TaskInfo object.
Currently, there are no usages of NativeExecutionNode and NativeExecutionOperator after the introduction of PrestoSparkNativeTaskExecutorFactory. This commit removes them and other dormant/unused code that handles these classes.
We no longer use NativeExecutionInfo as we have all the task stats captured correctly as part of TaskInfo objects returned from Cpp process. This commit removes this class
This helps provide a hook for PrestoSparkNativeTaskExecutoryFactory to shutdown the native process
8227516 to
b5cf219
Compare
Summary
This PR introduces below changes
PrestoSparkNativeTaskExecutorFactorysingleton class (executor lifetime). This class is responsible for executingPrestoSparkNativeTaskRDDs.NativeExecutionNode,NativeExecutionOperator,NativeExecutionInfoand corresponding plan rewrites and special handlingDesign
The new
PrestoSparkNativeTaskExecutorFactorydoes below stepsBatchTaskUpdateInfoobjectIteratorto Spark layer, whichpages or taskCompletion from the CPP processTaskInfoAccumulatorto propagate statisticsImplementation notes/caveats
The current implementation does not introduce any new quirks / caveats. Rather exposes some from the old adapter
NativeExecutionProcess,NativeExecutionTask,NativeExecutionProcessFactory, etc)dummyRemoteSourceTaskIdwhen generating shuffle read splits. More details in commentsTest Plan