Add Optimizer for validating and optimizing #19050
The LogicalPlanner performs several functions, including creating, validating, and optimizing plan nodes. The creation of plan nodes depends on the analyzer artifacts, while validation and optimization are core engine responsibilities. To separate these responsibilities, we are introducing an Optimizer class to manage the validation and optimization of plan nodes. This change also adds a runtime stat, optimizerTime, for measuring the time taken to validate and optimize plan nodes.
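To make the intended split concrete, here is a minimal sketch of a class that owns only validation and optimization and records how long that work takes. It is not the code from this PR: `OptimizerSketch`, `Plan`, `validateAndOptimize`, and the rule and validator types are all hypothetical stand-ins, and the real class works on `PlanNode` with the engine's own rule and checker types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.UnaryOperator;

// Illustrative only: every name here is hypothetical, not the class from this PR.
public class OptimizerSketch
{
    // Minimal stand-in for a plan node.
    public static final class Plan
    {
        public final String description;

        public Plan(String description)
        {
            this.description = description;
        }
    }

    private final List<UnaryOperator<Plan>> rules;
    private final Consumer<Plan> validator;
    private long optimizerTimeNanos;

    public OptimizerSketch(List<UnaryOperator<Plan>> rules, Consumer<Plan> validator)
    {
        this.rules = new ArrayList<>(rules);
        this.validator = validator;
    }

    // Validates the incoming plan, applies each rule in order, validates the
    // result, and adds the elapsed time to a single counter.
    public Plan validateAndOptimize(Plan initial)
    {
        long start = System.nanoTime();
        try {
            validator.accept(initial);
            Plan current = initial;
            for (UnaryOperator<Plan> rule : rules) {
                current = rule.apply(current);
            }
            validator.accept(current);
            return current;
        }
        finally {
            optimizerTimeNanos += System.nanoTime() - start;
        }
    }

    public long getOptimizerTimeNanos()
    {
        return optimizerTimeNanos;
    }

    public static void main(String[] args)
    {
        List<UnaryOperator<Plan>> rules = new ArrayList<>();
        rules.add(plan -> new Plan(plan.description + " -> pushdown"));
        rules.add(plan -> new Plan(plan.description + " -> limit"));

        OptimizerSketch optimizer = new OptimizerSketch(rules, plan -> {
            if (plan.description.isEmpty()) {
                throw new IllegalStateException("plan must not be empty");
            }
        });

        Plan optimized = optimizer.validateAndOptimize(new Plan("scan"));
        System.out.println(optimized.description);
    }
}
```

In this sketch the caller builds the initial plan elsewhere and hands it over, which mirrors the idea that plan creation stays with the analyzer-facing code while this class only validates and optimizes.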
import static java.lang.String.format;
import static java.util.Objects.requireNonNull;

public class Optimizer
I am not really a fan of this name, as this class also validates plans. I wonder what a good name for this class would be, e.g. QueryPlanner? Suggestions? @rschlussel @mlyublena
I think the name is OK with respect to the fact that it both optimizes and validates.
I think there might be some confusion between Optimizer and PlanOptimizer, but if anything, PlanOptimizer is the confusing one (as it describes a single optimization rule vs. the whole process of query optimization).
Maybe just call it OptimizerDriver or OptimizeAndValidate. Optimizer definitely looks wrong to me.
I just wanted to add more context. In my later PR, the logical planner related usage will be abstracted behind analyzer interfaces. Only the Optimizer class will be used by SQLQueryExecution.
A driver that runs optimizers one by one would be a more apt name for this class. This is definitely not an optimizer.
Okay, I will change it to OptimizerDriver.
Hmm maybe Optimizer is OK - let me think more.
        metadata,
        planVariableAllocator);

PlanNode planNode = getSession().getRuntimeStats().profileNanos(
Perhaps call this logicalPlanNode.
I feel planNode is sufficient, and it's used in many other places. If we want to use logicalPlanNode, it would be better to change all the other places as well. I will keep it as planNode for now; let me know if you feel strongly about it.
        metadata,
        planVariableAllocator);

PlanNode planNode = session.getRuntimeStats().profileNanos(
I think this needs a release note because LOGICAL_PLANNER_TIME_NANOS used to include optimization time, and now they are being split apart.
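The metric change behind this comment can be sketched as follows: once planning and optimization are profiled under separate keys, the logical planner metric no longer covers the optimizer's share. This is only an illustration. `StatsSketch` and its `profileNanos` helper are hypothetical stand-ins (the real `RuntimeStats.profileNanos` signature is not shown in this thread), and `OPTIMIZER_TIME_NANOS` is an assumed key name for the new optimizerTime stat.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative only: a hypothetical stand-in for the engine's runtime stats.
public class PlanningTimeSplitSketch
{
    public static final class StatsSketch
    {
        private final Map<String, Long> nanosByMetric = new LinkedHashMap<>();

        // Runs the work and adds its elapsed time to the named metric.
        public <T> T profileNanos(String metric, Supplier<T> work)
        {
            long start = System.nanoTime();
            try {
                return work.get();
            }
            finally {
                nanosByMetric.merge(metric, System.nanoTime() - start, Long::sum);
            }
        }

        public boolean has(String metric)
        {
            return nanosByMetric.containsKey(metric);
        }

        public long get(String metric)
        {
            return nanosByMetric.getOrDefault(metric, 0L);
        }
    }

    // Plan creation and optimization are timed under separate keys, so the
    // first metric no longer includes the time spent in the second step.
    public static String planQuery(StatsSketch stats, String query)
    {
        String logicalPlan = stats.profileNanos(
                "LOGICAL_PLANNER_TIME_NANOS",
                () -> "plan(" + query + ")");
        return stats.profileNanos(
                "OPTIMIZER_TIME_NANOS", // assumed name for the new stat
                () -> "optimized(" + logicalPlan + ")");
    }

    public static void main(String[] args)
    {
        StatsSketch stats = new StatsSketch();
        System.out.println(planQuery(stats, "SELECT 1"));
        System.out.println("planner nanos:   " + stats.get("LOGICAL_PLANNER_TIME_NANOS"));
        System.out.println("optimizer nanos: " + stats.get("OPTIMIZER_TIME_NANOS"));
    }
}
```

Anyone comparing LOGICAL_PLANNER_TIME_NANOS across versions would need to add the two values together to get a number comparable to the old one, which is why a release note seems warranted.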
kaikalur left a comment
I take back that comment. After thinking about it more, it's not as bad as I initially felt: analyzer/planner/optimizer as high-level APIs is OK.