Skip to content

Conversation

@pramodsatya
Copy link
Contributor

@pramodsatya pramodsatya commented May 8, 2025

Description

When the optimizer computes plan stats for the rewritten plan, the rule ValuesStatsRule attempts to evaluate variables to constants in function getVariableValues. C++ functions should not be used to constant fold in ValuesStatsRule when native function namespace manager is used in sidecar until constant folding is supported by sidecar. Enable use of built-in functions for constant folding in default RowExpressionOptimizer and use built-in functions to evaluate constants in ValuesStatsRule.

Motivation and Context

When the native CTE tests are run with the sidecar enabled, the testcase testPersistentCteForVarbinaryType fails with the following error:

java.lang.RuntimeException: RowExpression interpreter returned an unresolved expression

This is because the plan optimizer attempts to evaluate CallExpressions in the query plan, such as from_base64('Qm9iIEpvaG5zb24='), to ConstantExpressions when calculating plan stats in ValuesStatsRule. This evaluation will fail when native function implementations are used by the sidecar, since the function implementation type will be CPP and these functions will not be evaluated in the coordinator. ValuesStatsRule should use built-in functions for evaluating constant expressions until the sidecar supports expression evaluation with native engine (WIP: #24126).

Test Plan

Please refer to commit, comprehensive e2e tests for function analysis with native sidecar will be added in #24717.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== NO RELEASE NOTE ==

@pramodsatya pramodsatya requested a review from a team as a code owner May 8, 2025 20:43
@pramodsatya pramodsatya requested a review from jaystarshot May 8, 2025 20:43
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label May 8, 2025
@prestodb-ci prestodb-ci requested review from a team and ScrapCodes and removed request for a team May 8, 2025 20:43
@aditi-pandit
Copy link
Contributor

@pramodsatya : Thanks for debugging this.

ReplaceConstantVariableReferencesWithConstants and ValuesStatsRule are 2 independent rules imo. So I don't think we should use the config of one for the other.

ValuesStatsRule should be disabled for native execution since its relying on evaluating the constant values at the co-ordinator, when we should actually use the worker (or side-car when it supports it) . But disabling now might be too disruptive.

@rschlussel @amitkdutta : Would be great to hear your thoughts on this.

@pramodsatya
Copy link
Contributor Author

@pramodsatya : Thanks for debugging this.

ReplaceConstantVariableReferencesWithConstants and ValuesStatsRule are 2 independent rules imo. So I don't think we should use the config of one for the other.

ValuesStatsRule should be disabled for native execution since its relying on evaluating the constant values at the co-ordinator, when we should actually use the worker (or side-car when it supports it) . But disabling now might be too disruptive.

@rschlussel @amitkdutta : Would be great to hear your thoughts on this.

Thanks for pointing it out @aditi-pandit, I had used rewrite_expression_with_constant_expression property here because it appeared closest to the check required in ValuesStatsRule. Disabled ValuesStatsRule when native execution is enabled for now, and will update the check based on further feedback from the team.

Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought currently even if you are using the sidecar, constant evaluation and constant folding will happen in Java with the "java equivalent" functions. Is that not the case?

VariableReferenceExpression variable = node.getOutputVariables().get(variableId);
List<Object> symbolValues = getVariableValues(node, variableId, session, variable.getType());
statsBuilder.addVariableStatistics(variable, buildVariableStatistics(symbolValues, session, variable.getType()));
if (!isNativeExecutionEnabled(session)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is disabling for all nativeExecution, not just sidecar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is just a placeholder, will revert it once the right check/fix is figured out.

@pramodsatya
Copy link
Contributor Author

pramodsatya commented May 9, 2025

I thought currently even if you are using the sidecar, constant evaluation and constant folding will happen in Java with the "java equivalent" functions. Is that not the case?

@rschlussel, yes that is mostly right, ValuesStatsRule expects constant expressions to be evaluated and typically this would happen with java function implementations. But when native function namespace manager is used via the sidecar, the function implementation type will be CPP and not JAVA, so constant expressions will not be evaluated on coordinator.

I have added the specific testcase that is failing in the PR description, could you please share if disabling the constant expression evaluation in ValuesStatsRule is the right fix?

@tdcmeehan
Copy link
Contributor

tdcmeehan commented May 15, 2025

So it seems this issue is because we swap out the default function namespace to whatever is listed by the sidecar as being registered in Velox. Without #24126, you can't constant fold at all because no function is eligible to be evaluated in the Presto coordinator (they're C++ functions and there's no mechanism to evaluate C++ functions yet).

We could probably tactically add a hacky ExpressionOptimizer that just swaps back the default namespace to the Java namespace, then evaluates them as usual. This would preserve the old behavior of constant folding, while still validating overall function compatibility. We would delete it once we complete the native expression optimizer (I don't think it would make sense to use function validation without native constant folding, once both are mature).

We could also just wait for native constant folding to land.

@rschlussel
Copy link
Contributor

So it seems this issue is because we swap out the default function namespace to whatever is listed by the sidecar as being registered in Velox. Without #24126, you can't constant fold at all because no function is eligible to be evaluated in the Presto coordinator (they're C++ functions and there's no mechanism to evaluate C++ functions yet).

We could probably tactically add a hacky ExpressionOptimizer that just swaps back the default namespace to the Java namespace, then evaluates them as usual. This would preserve the old behavior of constant folding, while still validating overall function compatibility. We would delete it once we complete the native expression optimizer (I don't think it would make sense to use function validation without native constant folding, once both are mature).

We could also just wait for native constant folding to land.

I'd vote to add the temporary solution to support constant folding in the coordinator for now. We'd like to be able to enable sidecar in production, but no constant folding will be a deal breaker. We'd also want to do more testing for sidecar constant folding before enabling it once it lands and have some stretch of time where we can disable it without removing other sidecar features, because constant folding has the possibility of correctness bugs, so it's a riskier change to incorporate.

@rschlussel
Copy link
Contributor

fyi @amitkdutta

@aditi-pandit
Copy link
Contributor

The option to temporarily use a hacked ExpressionOptimizer that goes back to the Java namespace is definitely useful. It will help us also to turn of/off the constant folding logic without much disruption.

Given all the redesign requested for constant folding, it will take several cycles to get it right I feel.

@pramodsatya
Copy link
Contributor Author

Thanks for the feedback, will modify the ExpressionOptimizer to use Java namespace for function eval.

@pramodsatya pramodsatya changed the title Add check for evaluating variables in ValuesStatsRule Use built-in functions for constant folding in ValuesStatsRule May 16, 2025
@pramodsatya pramodsatya marked this pull request as draft May 16, 2025 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants