feat: Add optimizer to rewrite functions in query plan#26859
feat: Add optimizer to rewrite functions in query plan#26859feilong-liu merged 1 commit intoprestodb:masterfrom
Conversation
Reviewer's GuideIntroduces a pluggable function-call rewriting optimization framework, including a new SPI interface, central manager, optimizer, session/config flags, and wiring into the query planner and test/query-runner infrastructure. Sequence diagram for function call rewriting during query optimizationsequenceDiagram
actor User
participant Client
participant Coordinator
participant PlanOptimizers
participant FunctionCallRewritingOptimizer as FuncCallRewritingOpt
participant PlanRewriter
participant CallExpressionRewriter as CallExprRewriter
participant FunctionCallRewriterImpl as FuncCallRewriter
User->>Client: Submit SQL query
Client->>Coordinator: Execute query
Coordinator->>PlanOptimizers: buildOptimizedPlan(session, query)
Note over Coordinator,PlanOptimizers: PlanOptimizers pipeline includes FunctionCallRewritingOptimizer
PlanOptimizers->>FuncCallRewritingOpt: optimize(plan, session, types, variableAllocator, idAllocator, warningCollector)
FuncCallRewritingOpt->>SystemSessionProperties: isFunctionCallRewriterEnabled(session)
SystemSessionProperties-->>FuncCallRewritingOpt: enabled or disabled
alt function call rewriter disabled
FuncCallRewritingOpt-->>PlanOptimizers: PlanOptimizerResult(plan, changed=false)
else enabled
FuncCallRewritingOpt->>FunctionCallRewriterManager: getAllRewriters()
FunctionCallRewriterManager-->>FuncCallRewritingOpt: Set(FunctionCallRewriter)
alt no registered rewriters
FuncCallRewritingOpt-->>PlanOptimizers: PlanOptimizerResult(plan, changed=false)
else rewriters present
FuncCallRewritingOpt->>PlanRewriter: new PlanRewriter(session.toConnectorSession(), variableAllocator, idAllocator, functionResolution, rewriters)
FuncCallRewritingOpt->>PlanRewriter: rewrite(plan)
loop for each Project and Filter node
PlanRewriter->>CallExprRewriter: rewrite expressions
loop for each CallExpression
CallExprRewriter->>FuncCallRewriter: rewrite(call, session, variableAllocator, idAllocator, functionResolution)
alt rewriter returns new expression
FuncCallRewriter-->>CallExprRewriter: Optional(RowExpression)
CallExprRewriter-->>PlanRewriter: rewritten RowExpression
else not handled
FuncCallRewriter-->>CallExprRewriter: Optional.empty
end
end
end
PlanRewriter-->>FuncCallRewritingOpt: rewrittenPlan
FuncCallRewritingOpt-->>PlanOptimizers: PlanOptimizerResult(rewrittenPlan, changed=true or false)
end
end
PlanOptimizers-->>Coordinator: final optimized plan
Coordinator-->>Client: query results
Client-->>User: results displayed
Class diagram for new function call rewriting frameworkclassDiagram
direction LR
class PlanOptimizer {
<<interface>>
+optimize(plan, session, types, variableAllocator, idAllocator, warningCollector) PlanOptimizerResult
}
class RowExpressionRewriter {
<<abstract>>
}
class SimplePlanRewriter {
<<abstract>>
+rewriteWith(rewriter, plan) PlanNode
}
class StandardFunctionResolution
class FunctionCallRewriter {
<<interface>>
+rewrite(call, session, variableAllocator, idAllocator, functionResolution) Optional~RowExpression~
}
class FunctionCallRewriterManager {
-Set~FunctionCallRewriter~ rewriters
+FunctionCallRewriterManager()
+addRewriter(rewriter) void
+removeRewriter(rewriter) void
+getAllRewriters() Set~FunctionCallRewriter~
}
class FunctionCallRewritingOptimizer {
-Supplier~Set~FunctionCallRewriter~~ rewritersSupplier
-StandardFunctionResolution functionResolution
+FunctionCallRewritingOptimizer(rewritersSupplier, functionResolution)
+optimize(plan, session, types, variableAllocator, idAllocator, warningCollector) PlanOptimizerResult
}
class PlanRewriter {
-ConnectorSession session
-VariableAllocator variableAllocator
-PlanNodeIdAllocator idAllocator
-StandardFunctionResolution functionResolution
-Set~FunctionCallRewriter~ rewriters
-CallExpressionRewriter callRewriter
-boolean planChanged
+PlanRewriter(session, variableAllocator, idAllocator, functionResolution, rewriters)
+isPlanChanged() boolean
+visitProject(node, context) PlanNode
+visitFilter(node, context) PlanNode
}
class CallExpressionRewriter {
-ConnectorSession session
-Set~FunctionCallRewriter~ rewriters
-VariableAllocator variableAllocator
-PlanNodeIdAllocator idAllocator
-StandardFunctionResolution functionResolution
-boolean hasRewritten
+CallExpressionRewriter(session, rewriters, variableAllocator, idAllocator, functionResolution)
+hasRewritten() boolean
+rewriteCall(node, unused, treeRewriter) RowExpression
}
class FeaturesConfig {
-boolean enableFunctionCallRewriter
+setEnableFunctionCallRewriter(enableFunctionCallRewriter) FeaturesConfig
+isEnableFunctionCallRewriter() boolean
}
class SystemSessionProperties {
+String ENABLE_FUNCTION_CALL_REWRITER
+isFunctionCallRewriterEnabled(session) boolean
}
class PlanOptimizers {
+PlanOptimizers(..., FunctionCallRewriterManager functionCallRewriterManager, ...)
}
class CoordinatorModule {
+setup(binder) void
}
class LocalQueryRunner {
+getPlanOptimizers(noExchange, nativeExecutionEnabled) List~PlanOptimizer~
}
FunctionCallRewritingOptimizer ..|> PlanOptimizer
PlanRewriter ..|> SimplePlanRewriter
CallExpressionRewriter ..|> RowExpressionRewriter
FunctionCallRewritingOptimizer o-- PlanRewriter
PlanRewriter o-- CallExpressionRewriter
PlanRewriter o-- FunctionCallRewriter
CallExpressionRewriter o-- FunctionCallRewriter
FunctionCallRewriterManager o-- FunctionCallRewriter
FunctionCallRewritingOptimizer --> FunctionCallRewriterManager : uses via Supplier
FeaturesConfig --> SystemSessionProperties : provides default
SystemSessionProperties --> FunctionCallRewritingOptimizer : enables via session flag
PlanOptimizers --> FunctionCallRewritingOptimizer : constructs and registers
CoordinatorModule --> FunctionCallRewriterManager : bind singleton
LocalQueryRunner --> FunctionCallRewriterManager : create for tests
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- In
CallExpressionRewriter.rewriteCall, returningnullwhen no rewriter applies is risky; consider delegating totreeRewriter.defaultRewrite(node, unused)or returning the original node so the tree rewriter doesn't see unexpected nulls and still rewrites child expressions. - Because
CallExpressionRewriter.hasRewrittenis never reset,visitProject/visitFiltermay rebuild nodes and mark the plan as changed for later nodes even when no rewrite occurred there; consider resetting this flag per node (or tracking rewrite at a finer granularity) to avoid unnecessary plan node recreation.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `CallExpressionRewriter.rewriteCall`, returning `null` when no rewriter applies is risky; consider delegating to `treeRewriter.defaultRewrite(node, unused)` or returning the original node so the tree rewriter doesn't see unexpected nulls and still rewrites child expressions.
- Because `CallExpressionRewriter.hasRewritten` is never reset, `visitProject`/`visitFilter` may rebuild nodes and mark the plan as changed for later nodes even when no rewrite occurred there; consider resetting this flag per node (or tracking rewrite at a finer granularity) to avoid unnecessary plan node recreation.
## Individual Comments
### Comment 1
<location> `presto-main-base/src/main/java/com/facebook/presto/sql/planner/optimizations/FunctionCallRewritingOptimizer.java:136-140` </location>
<code_context>
+ Map.Entry::getKey,
+ e -> RowExpressionTreeRewriter.rewriteWith(callRewriter, e.getValue())));
+
+ if (!callRewriter.hasRewritten() && newSource == node.getSource()) {
+ return node;
+ }
+
+ planChanged = true;
+
+ return new ProjectNode(
</code_context>
<issue_to_address>
**suggestion (bug_risk):** `hasRewritten` is global across the whole plan, so later nodes may be treated as changed even when they are not.
Because the same `CallExpressionRewriter` instance (and its `hasRewritten` flag) is reused across the whole plan, once any call is rewritten `hasRewritten()` remains `true`. That means this check:
```java
if (!callRewriter.hasRewritten() && newSource == node.getSource()) {
return node;
}
planChanged = true;
```
will treat all subsequent `ProjectNode`/`FilterNode` instances as changed, even when their subtree is identical (`newSource == node.getSource()`). This makes `planChanged` inaccurate and causes unnecessary node recreation.
Consider either tracking rewrites per node (snapshot `hasRewritten` at method entry and compare after rewriting), or using a fresh `CallExpressionRewriter` per top-level rewrite plus a local per-node "changed" flag.
Suggested implementation:
```java
public PlanNode visitProject(ProjectNode node, RewriteContext<Void> context)
{
PlanNode newSource = context.rewrite(node.getSource(), context.get());
boolean sourceChanged = newSource != node.getSource();
Assignments.Builder assignmentsBuilder = Assignments.builder();
boolean assignmentsChanged = false;
for (Map.Entry<VariableReferenceExpression, RowExpression> assignment : node.getAssignments().entrySet()) {
RowExpression original = assignment.getValue();
RowExpression rewritten = RowExpressionTreeRewriter.rewriteWith(callRewriter, original);
if (rewritten != original) {
assignmentsChanged = true;
}
assignmentsBuilder.put(assignment.getKey(), rewritten);
}
Assignments newAssignments = assignmentsBuilder.build();
if (!sourceChanged && !assignmentsChanged) {
return node;
}
planChanged = true;
return new ProjectNode(
node.getSourceLocation(),
idAllocator.getNextId(),
newSource,
newAssignments,
node.getLocality());
}
```
If there is an analogous use of `callRewriter.hasRewritten()` in `visitFilter` (or other visitors) to decide whether a node changed, the same pattern should be applied there: compute a local `*Changed` flag by comparing the rewritten expressions to the originals and combine it with any `sourceChanged` check, instead of relying on the global `hasRewritten()` state.
</issue_to_address>
### Comment 2
<location> `presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueryFramework.java:594` </location>
<code_context>
new MBeanExporter(new TestingMBeanServer()),
splitManager,
planOptimizerManager,
+ new FunctionCallRewriterManager(),
pageSourceManager,
statsCalculator,
</code_context>
<issue_to_address>
**suggestion (testing):** Add an end-to-end test that exercises a custom `FunctionCallRewriter` via the new optimizer and session property
The test harness now wires `PlanOptimizers` with a `FunctionCallRewriterManager`, but there’s no end-to-end coverage that a connector/module `FunctionCallRewriter` actually changes the plan when `enable_function_call_rewriter` is enabled.
Please add an integration test (e.g., in an existing query-framework suite) that:
- Registers a test `FunctionCallRewriter` with `FunctionCallRewriterManager`.
- Runs a query with a function the test rewriter handles.
- Verifies via `EXPLAIN`/plan inspection that the call is rewritten when `enable_function_call_rewriter = true` and unchanged when it is `false`.
Suggested implementation:
```java
import com.facebook.presto.sql.planner.PlanFragmenter;
```
To fully implement the requested end-to-end test, you will need to:
1. **Add a test-only `FunctionCallRewriter` implementation**, e.g. `TestingFunctionCallRewriter`, under `presto-tests`:
- Implement the `FunctionCallRewriter` interface from `com.facebook.presto.sql.planner`.
- Have it recognize a test-only function (for example `test_rewriter_func`) and rewrite it to a different expression (e.g. rewrite `test_rewriter_func(x)` to `abs(x)` or some other deterministic function).
- Ensure the rewrite is easily visible in a plan (e.g., function name changes or argument is modified).
2. **Expose or access the `FunctionCallRewriterManager` from the test harness**:
- If `AbstractTestQueryFramework` (or the concrete `QueryRunner` implementation used in tests) does not currently expose the `FunctionCallRewriterManager`, add a getter or a hook in the concrete test class that creates the query runner.
- After creating the `QueryRunner`, register the test rewriter:
```java
queryRunner.getFunctionCallRewriterManager().addRewriter(new TestingFunctionCallRewriter());
```
- Adjust `addRewriter` / registration call to match the actual API of `FunctionCallRewriterManager`.
3. **Create a concrete test class using `AbstractTestQueryFramework`**, for example `TestFunctionCallRewriter`:
- Extend `AbstractTestQueryFramework`.
- In `createQueryRunner()`, create a `QueryRunner`, then register the `TestingFunctionCallRewriter` on its `FunctionCallRewriterManager` as described above.
- Add a test method, e.g.:
```java
@Test
public void testFunctionCallRewriterEnabledAndDisabled()
{
// Session with rewriter disabled
Session disabled = Session.builder(getSession())
.setSystemProperty("enable_function_call_rewriter", "false")
.build();
// Session with rewriter enabled
Session enabled = Session.builder(getSession())
.setSystemProperty("enable_function_call_rewriter", "true")
.build();
String query = "EXPLAIN SELECT test_rewriter_func(42)";
// Plan when disabled
MaterializedResult disabledPlan = computeActual(disabled, query);
String disabledExplain = (String) disabledPlan.getOnlyValue();
// Plan when enabled
MaterializedResult enabledPlan = computeActual(enabled, query);
String enabledExplain = (String) enabledPlan.getOnlyValue();
// Verify that when disabled, the original function name appears
assertTrue(disabledExplain.contains("test_rewriter_func"));
// Verify that when enabled, the function call was rewritten
// (e.g., to ABS or whatever TestingFunctionCallRewriter produces)
assertFalse(enabledExplain.contains("test_rewriter_func"));
assertTrue(enabledExplain.contains("abs")); // adjust to actual rewrite
}
```
- Use the actual session property name used for this feature (you referenced `enable_function_call_rewriter`, confirm its exact spelling and namespace in the existing session property definitions).
4. **Wire the test function into the execution engine**:
- If `test_rewriter_func` is not already registered as a function, add a test-only scalar function in the test module (or reuse an existing one) so that parsing succeeds.
- Ensure the function type signature and rewriter logic are consistent with the metadata and type system.
These additional changes will provide a true end-to-end test that verifies:
- The query harness uses `FunctionCallRewriterManager`.
- A custom `FunctionCallRewriter` actually rewrites a function call in the plan.
- The behavior is gated by the `enable_function_call_rewriter` session property.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| if (!callRewriter.hasRewritten() && newSource == node.getSource()) { | ||
| return node; | ||
| } | ||
|
|
||
| planChanged = true; |
There was a problem hiding this comment.
suggestion (bug_risk): hasRewritten is global across the whole plan, so later nodes may be treated as changed even when they are not.
Because the same CallExpressionRewriter instance (and its hasRewritten flag) is reused across the whole plan, once any call is rewritten hasRewritten() remains true. That means this check:
if (!callRewriter.hasRewritten() && newSource == node.getSource()) {
return node;
}
planChanged = true;will treat all subsequent ProjectNode/FilterNode instances as changed, even when their subtree is identical (newSource == node.getSource()). This makes planChanged inaccurate and causes unnecessary node recreation.
Consider either tracking rewrites per node (snapshot hasRewritten at method entry and compare after rewriting), or using a fresh CallExpressionRewriter per top-level rewrite plus a local per-node "changed" flag.
Suggested implementation:
public PlanNode visitProject(ProjectNode node, RewriteContext<Void> context)
{
PlanNode newSource = context.rewrite(node.getSource(), context.get());
boolean sourceChanged = newSource != node.getSource();
Assignments.Builder assignmentsBuilder = Assignments.builder();
boolean assignmentsChanged = false;
for (Map.Entry<VariableReferenceExpression, RowExpression> assignment : node.getAssignments().entrySet()) {
RowExpression original = assignment.getValue();
RowExpression rewritten = RowExpressionTreeRewriter.rewriteWith(callRewriter, original);
if (rewritten != original) {
assignmentsChanged = true;
}
assignmentsBuilder.put(assignment.getKey(), rewritten);
}
Assignments newAssignments = assignmentsBuilder.build();
if (!sourceChanged && !assignmentsChanged) {
return node;
}
planChanged = true;
return new ProjectNode(
node.getSourceLocation(),
idAllocator.getNextId(),
newSource,
newAssignments,
node.getLocality());
}If there is an analogous use of callRewriter.hasRewritten() in visitFilter (or other visitors) to decide whether a node changed, the same pattern should be applied there: compute a local *Changed flag by comparing the rewritten expressions to the originals and combine it with any sourceChanged check, instead of relying on the global hasRewritten() state.
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueryFramework.java
Outdated
Show resolved
Hide resolved
Summary: X-link: https://github.com/facebookexternal/presto-facebook/pull/3520 Differential Revision: D89703780
84663b4 to
b72a37c
Compare
| * This allows connectors to transform specific function calls into alternative implementations. | ||
| */ | ||
| public class FunctionCallRewritingOptimizer | ||
| implements PlanOptimizer |
There was a problem hiding this comment.
Can we just do iterative rule for project and filter?
There was a problem hiding this comment.
Updated the code to extend RowExpressionRewriteRuleSet
Summary: X-link: https://github.com/facebookexternal/presto-facebook/pull/3520 Differential Revision: D89703780
b72a37c to
512acf5
Compare
tdcmeehan
left a comment
There was a problem hiding this comment.
Instead of creating a very specific FunctionCallRewriter SPI here, can you use the existing and more general ExpressionOptimizer SPI?
512acf5 to
a63d0a2
Compare
Summary: X-link: https://github.com/facebookexternal/presto-facebook/pull/3520 Differential Revision: D89703780
Updated the code to use the ExpressionOptimizer SPI instead of adding new SPI |
| /** | ||
| * A rule set that rewrites row expressions using a custom expression optimizer. | ||
| */ | ||
| public class RowExpressionsRewriteOptimizer |
There was a problem hiding this comment.
Thanks for this cleanup @feilong-liu, however I'm wondering if we can simplify this further.
To me this seems identical to SimplifyRowExpressions. There's new session properties which are also almost identical to expression_optimizer_name.
What I'm wondering is, can you make expression_optimizer_name accept a cascading list of expression optimizer names, and these optimizers are evaluated in order. That way, if you configure it such that SET SESSION expression_optimizer_name=default,function_rewrite, when we hit existing rules such as SimplifyRowExpressions, they will automatically call your custom optimizer and perform the rewrite. No new session properties needed, and no new optimizer rules needed, just a small change to evaluate the cascading list of expression rewrites.
We could have ExpressionOptimizerManager#getExpressionOptimizer return a simple composite expression optimizer if there's a comma separated list.
There was a problem hiding this comment.
@tdcmeehan Thanks for the review.
SimplifyRowExpressions is used multiple times in the plan optimizer, and for my case here, it only needs to run once. Also SimplifyRowExpressions has an additional logicalExpressionRewriter rewrite step, which is not needed either. I'd prefer to avoid this additional cost.
Also SimplifyRowExpressions has different purpose than the newly added optimizer. SimplifyRowExpressions is to evaluate and simplify row expressions, while the new optimizer added here is to do function call rewrite. Mixing them together makes is harder to maintain and may complex future usage (for example if future usage case wants to call simplify row expression optimizer, it will also pay the cost of this additional function rewrite which is not needed).
That way, if you configure it such that SET SESSION expression_optimizer_name=default,function_rewrite
I see that expression_optimizer_name is not only used in SimplifyRowExpressions but also used in all usage cases where ExpressionOptimizer is used. If we want to apply this change, we either need to change SimplifyRowExpressions to use a different session property, or run multiple expression optimizers for all other optimizers too, which both are not ideal.
I prefer to keep adding one new optimizer here instead of modifying existing use cases.
There was a problem hiding this comment.
Thanks for your thoughts @feilong-liu. I just want to take a minute to talk this through fully to get aligned on the direction, and I appreciate your patient elaboration of your perspective.
I understand that SimplifyRowExpressions is called multiple times. In fact, as you point out, it's used almost pervasively throughout the code when we simplify expressions. The thinking behind this pervasive use of ExressionOptimizerManager to retrieve the expression optimizer is we want just a single entry point to influence how expressions get simplified, so we don't need to think about complex interactions between optimizers rules or when exactly the expression optimizer gets invoked.
Expression optimizers are written similarly as Rules, the expectation is that they're idempotent. Meaning, it doesn't matter if they're called multiple times, they return the same output.
For your use case, where you simply map from one function to another, I imagine that even if this mapping were making some sort of external service call, you only pay this cost once, and any additional cost is just walking the expression tree. Is your concern primarily the cost of walking the in-memory expression tree? My intuition is we don't need to optimize for this cost, as I expect most of what we want to conserve in the coordinator is more on the fanout between queries to tasks and task management (where we can go from 1 query to 10's of thousands of tasks, which update per second and require serialization and deserialization of nested structures), and connection management between clients and the coordinator (where likewise even queries which are not running are regularly heartbeated by clients, which can cause a large fanout on resource constrained clusters), not the fixed per query cost of plan and expression tree traversal.
Regarding your point on the purpose of SimplifyRowExpressions, it's true that this was originally intended to do as it's named, however with pluggable ExpressionOptimizers it's now a bit more generic and we could actually call it something more generic like OptimizerRowExpressions. It does still do logical expression reordering and simplification, but this doesn't really change the logic of the expression, and like all ExpressionOptimizers, it's idempotent.
tl;dr, instead of adding yet another session property and yet another very specific optimizer rule, I'm just wondering if it would be better to converge on generic functionality if it's cheap to do (which I think it is). If you'd like, I can raise a PR for this cascading approach in ExpressionOptimizerManager - it's fairly minimal and keeps your use case working without the additional session property.
There was a problem hiding this comment.
@tdcmeehan Thanks for explaining the rational behind the proposal. I get your point on desire of having one single point for row expression evaluation/simplification. However I am still concerned on running expression optimizers tens of times unnecessarily.
My intuition is we don't need to optimize for this cost
I had the same intuition a few years ago, which I realized to be not quite true. People tends to have the assumption that traversing query plans and expression trees are cheap and Presto does not need to worry about it much. It's true when Presto is running batch job, however as we started looking into interactive use cases and extends Presto to serve time sensitive use cases, every millisecond counts and running an additional expression optimizer which traverses every expression tree tens of times unnecessarily is really not an acceptable case.
The thinking behind this pervasive use of ExressionOptimizerManager to retrieve the expression optimizer is we want just a single entry point to influence how expressions get simplified, so we don't need to think about complex interactions between optimizers rules or when exactly the expression optimizer gets invoked.
On the contrary, imo having optimizer rules to pick its own expression optimizer to use sounds a cleaner and more intuitive way. People know which expression optimizer is used and have control on it. Having all optimization rules using the same set of expression optimizers complicates the problem (especially considering that some optimizer rules are not purely expression optimization, it's using the expression optimizer to do constant folding which is used as a util function).
My case is simple, which is to do function rewrite and not an "optimization" of expression. It is a one-time rewrite rather than multiple passes of expressions. It does not necessarily need to use the ExressionOptimizerManager. If this is your concern, I can move away from using ExpressionOptimizerManager and revert back to the previous version which does not use ExpressionOptimizer.
The gap this PR tries to address is the lack of a way to do expression rewrite with internal functions, which is to add an optimizer rule that can hook with internal rewriters. If interaction between optimizer rules and expression rewrites is the concern, I can also move it to analyzer instead.
There was a problem hiding this comment.
Thanks for your detailed reply @feilong-liu. While I still have reservations that this type of change would really affect the low latency use cases you mention, I trust that your production experience is telling you something important.
I think given you constraint you've laid out, this is the simplest approach which doesn't introduce a new SPI footprint, so I'm in favor of this approach. My only remaining question is don't we want this to be the first or among the first rules to be evaluated?
There was a problem hiding this comment.
@tdcmeehan The current rules which are above this new rules are mainly rules which transform the query plan to a query plan which Presto can execute, for example, subquery flattening, lambda function desugar, set operation flattening etc. IMO, rule transformations like above is OK to have higher priority, and I also prefer the new rule to operate on a plan which is executable by Presto and deal with the node and expressions which the execution engine recognize.
Summary: X-link: https://github.com/facebookexternal/presto-facebook/pull/3520 Differential Revision: D89703780
a63d0a2 to
9ce3db9
Compare
|
Please add documentation for the new session property Then please update the release note to link to the new session property documentation. There's an example of such a doc link in Formatting in the Release Notes Guidelines. |
| return session.getSystemProperty(EXPRESSION_OPTIMIZER_NAME, String.class); | ||
| } | ||
|
|
||
| public static String getExpressionOptimizerInRowExpressionRewrite(Session session) |
There was a problem hiding this comment.
should we also add a config property (that way in normal deployments we can set a cluster default)?
There was a problem hiding this comment.
Added config property
There was a problem hiding this comment.
Added config property
Please add documentation for the new configuration property and also the new session property.
CONTRIBUTING.md states
"All new language features, new functions, session and config properties, and major features have documentation added."
| /** | ||
| * A rule set that rewrites row expressions using a custom expression optimizer. | ||
| */ | ||
| public class RowExpressionsRewriteOptimizer |
There was a problem hiding this comment.
Nit: I think calling this an optimizer is confusing
There was a problem hiding this comment.
Rename it to RewriteRowExpressions, I am really not good at naming, so asked AI and this is the suggestion it gave to me lol.
Summary: X-link: https://github.com/facebookexternal/presto-facebook/pull/3520 Differential Revision: D89703780
9ce3db9 to
578ab01
Compare
Summary: X-link: https://github.com/facebookexternal/presto-facebook/pull/3520 Differential Revision: D89703780
578ab01 to
02f8d77
Compare
|
LGTM |
## Description We have an internal use case where we want to rewrite functions, where both rewrite from and rewrite to functions are internal functions. This PR adds a new function rewrite optimizer, which use an instance of expression optimizer to do the expression rewrite, the name of the expression optimizer is specified with session property `expression_optimizer_in_row_expression_rewrite` which defaults to disabled when it's empty. ## Motivation and Context To do function rewrite with internal functions. ## Impact As in description ## Test Plan Unit tests and local end to end test ## Contributor checklist - [ ] Please make sure your submission complies with our [contributing guide](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md), in particular [code style](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#code-style) and [commit standards](https://github.com/prestodb/presto/blob/master/CONTRIBUTING.md#commit-standards). - [ ] PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced. - [ ] Documented new properties (with its default value), SQL syntax, functions, or other functionality. - [ ] If release notes are required, they follow the [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines). - [ ] Adequate tests were added if applicable. - [ ] CI passed. - [ ] If adding new dependencies, verified they have an [OpenSSF Scorecard](https://securityscorecards.dev/#the-checks) score of 5.0 or higher (or obtained explicit TSC approval for lower scores). ## Release Notes Please follow [release notes guidelines](https://github.com/prestodb/presto/wiki/Release-Notes-Guidelines) and fill in the release notes below. ``` == RELEASE NOTES == General * Add an optimizer which can do function rewrite ```
Description
We have an internal use case where we want to rewrite functions, where both rewrite from and rewrite to functions are internal functions.
This PR adds a new function rewrite optimizer, which use an instance of expression optimizer to do the expression rewrite, the name of the expression optimizer is specified with session property
expression_optimizer_in_row_expression_rewritewhich defaults to disabled when it's empty.Motivation and Context
To do function rewrite with internal functions.
Impact
As in description
Test Plan
Unit tests and local end to end test
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.