
Conversation

@qianheng-aws (Collaborator) commented Sep 23, 2025

Description

This PR includes the following changes:

  1. Support caching the built OpenSearchRequestBuilder inside PushDownContext. Otherwise it would be built twice for the cheapest plan (a rough sketch follows this list).
  2. Enhance the cost computing mechanism by:
    • Splitting the computeSelfCost method out of estimateRowCount. computeSelfCost now implements a more proper cost model instead of relying on the row count alone.
    • Computing script cost more accurately by using the script count rather than a flag for whether scripts are present.
  3. Migrate part of our explain ITs from JSON format to YAML format.
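
To illustrate item 1, here is a minimal sketch of the caching idea. It is not the PR's actual code: only the createOrGetRequestBuilder name comes from the diff quoted further down; the field name, the stand-in OpenSearchRequestBuilder interface, and the build step are placeholders.

```java
// Sketch only: build the OpenSearch request builder once and reuse it, so the
// cheapest plan does not rebuild it when the scan is both costed and executed.
interface OpenSearchRequestBuilder {}   // stand-in for the real class in the repo

class PushDownContextSketch {
  private OpenSearchRequestBuilder cachedRequestBuilder;   // hypothetical field name

  public synchronized OpenSearchRequestBuilder createOrGetRequestBuilder() {
    if (cachedRequestBuilder == null) {
      cachedRequestBuilder = buildRequestBuilder();         // built at most once
    }
    return cachedRequestBuilder;
  }

  private OpenSearchRequestBuilder buildRequestBuilder() {
    // Placeholder for applying every pushed-down operation to a fresh builder.
    return new OpenSearchRequestBuilder() {};
  }
}
```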

The new cost computing mechanism has advantages in several cases (a rough sketch of the accumulation idea follows this list):

  • Scans that produce the same OpenSearch request now get equivalent costs. For example, before this PR, Scan(PushDownContext[LIMIT, PROJECT]) got a smaller cost than Scan(PushDownContext[PROJECT, LIMIT]) although they produce the same request. With this PR, the two get equivalent costs, so the optimizer will choose the plan whose push-down sequence is closest to the original logical plan.
  • Before this PR, the optimizer would choose plans like Scan(PushDownContext[AGG, PROJECT]) - Aggregate even though the two aggregations could be merged. With this PR, the optimizer chooses the scan with a merged aggregation pushed down.
  • Before this PR, in cases like https://github.com/opensearch-project/sql/pull/4279#discussion_r2354020132, where the aggregation builder contains several scripts, the optimizer would still choose that plan because the script count did not affect its cost. This PR addresses that.
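
As referenced above, here is a rough, self-contained sketch of the accumulation idea behind the new computeSelfCost. It is not the PR's implementation: the enum, the constants, and the combined return value are illustrative and only loosely follow the snippets quoted later in this review.

```java
import java.util.List;

// Sketch only: walk the pushed-down operations and adjust the row/CPU estimates
// per operation, instead of deriving the cost from the row count alone.
public class CostAccumulationSketch {
  enum PushDownType { SORT, COLLAPSE, LIMIT }   // illustrative subset of operations

  static double cost(List<PushDownType> operations, double estimatedRows, int limitFetch) {
    double dRows = estimatedRows;
    double dCpu = 0d;
    for (PushDownType op : operations) {
      switch (op) {
        case SORT -> dCpu += dRows;                            // sorting touches every remaining row
        case COLLAPSE -> dRows = dRows / 10;                   // Calcite-style aggregate shrink guess
        case LIMIT -> dRows = Math.min(dRows, limitFetch) - 1; // slightly cheaper than not pushing down
      }
    }
    return dRows + dCpu;   // collapsed into a single number just for this demo
  }

  public static void main(String[] args) {
    // The result depends on which operations were pushed down, not merely on how
    // many rows the scan is estimated to return.
    System.out.println(cost(List.of(PushDownType.LIMIT, PushDownType.SORT), 1000, 10)); // 9 + 9 = 18.0
  }
}
```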

Related Issues

Resolves #4312

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

# Conflicts:
#	integ-test/src/test/resources/expectedOutput/calcite/explain_agg_with_sum_enhancement.json
Signed-off-by: Heng Qian <[email protected]>
YAMLGenerator.Feature.ALWAYS_QUOTE_NUMBERS_AS_STRINGS); // Quote numeric strings
yamlFactory.enable(YAMLGenerator.Feature.INDENT_ARRAYS_WITH_INDICATOR);
YAML_MAPPER = new ObjectMapper(yamlFactory);
YAML_MAPPER.enable(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS);
Collaborator Author

Remove this config; otherwise, for a v2 plan that contains multiple children at one level, it will put the children at the beginning and the operator name at the end.
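
For context, a hedged illustration of the behavior being removed (not the PR's test code; the node keys and operator names are made up): with SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS enabled, Jackson serializes map keys alphabetically, so a v2 plan node would have its children key emitted before its name key.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class YamlKeyOrderDemo {
  public static void main(String[] args) throws Exception {
    Map<String, Object> node = new LinkedHashMap<>();
    node.put("name", "ProjectOperator");                          // hypothetical v2 operator name
    node.put("children", List.of(Map.of("name", "FilterOperator")));

    ObjectMapper sorted =
        new ObjectMapper(new YAMLFactory())
            .enable(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS);
    System.out.println(sorted.writeValueAsString(node));          // "children" comes before "name"

    ObjectMapper insertionOrder = new ObjectMapper(new YAMLFactory());
    System.out.println(insertionOrder.writeValueAsString(node));  // insertion order is preserved
  }
}
```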

# Conflicts:
#	integ-test/src/test/resources/expectedOutput/calcite/explain_patterns_simple_pattern_agg_push.json
#	integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_patterns_simple_pattern_agg_push.json
Signed-off-by: Heng Qian <[email protected]>
return this.stream().anyMatch(action -> action.digest().equals(digest));
}

public synchronized OpenSearchRequestBuilder createOrGetRequestBuilder() {
Member

Add Javadoc.

Member

Synced offline; let's move this method out.

* than non-pushdown operators.
*/
@Override
public @Nullable RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
Member

Please add a UT for these two methods to ensure the behavior stays stable.

case SORT -> dCpu += dRows;
// Refer the org.apache.calcite.rel.metadata.RelMdRowCount.getRowCount(Aggregate rel,...)
case COLLAPSE -> {
dRows = dRows / 10;

// Try to reduce the rows count by 1 to make the cost cheaper slightly than non-push down.
// Because we'd like to push down LIMIT even when the fetch in LIMIT is greater than
// dRows.
case LIMIT -> dRows = Math.min(dRows, ((LimitDigest) operation.digest()).limit()) - 1;
Member

Could this become negative with the -1?

Collaborator Author

It could be, but the VolcanoPlanner in Calcite will replace it with the TINY cost if it's less than 0.
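
A worked example of the minus-1 trick (numbers are hypothetical): with dRows = 1000 and a pushed-down LIMIT whose fetch is 10000, the case yields Math.min(1000, 10000) - 1 = 999, so the scan with the LIMIT pushed down is costed slightly below the 1000-row scan that keeps the LIMIT in memory. With dRows = 0 the expression would indeed evaluate to -1, which the planner then replaces with the TINY cost as described above.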

Comment on lines 155 to 160
// Calculate the cost of script filter by multiplying the selectivity of the filter and
// the factor amplified by script count.
dCpu +=
NumberUtil.multiply(
NumberUtil.multiply(dRows, RelMdUtil.guessSelectivity(filterDigest.condition())),
Math.pow(1.1, filterDigest.scriptCount()));
Collaborator

Would Math.pow(1.1, filterDigest.scriptCount() - 1) be better? It ensures that when there is only one script, the cost of the pushed-down script filter is smaller than that of the same filter in memory (dRow * guessedFactor * estimateRowCountFactor < dRow * guessedFactor); otherwise it is something like dRow * guessedFactor * 1.1 * estimateRowCountFactor, which is not necessarily smaller than dRow * guessedFactor.

Collaborator Author

The default value of estimateRowCountFactor is 0.9, so the cost of the pushed-down filter is smaller.

On the other hand, the biggest cost reduction from a pushed-down filter comes not from the filter's own cost, but from the smaller final output row count of the scan with the filter pushed down.
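
Spelling out the single-script case in the reviewer's framing (an illustrative calculation, not taken from the PR): the pushed-down cost is dRow * guessedFactor * 1.1 * 0.9 = 0.99 * dRow * guessedFactor, which is still below the in-memory cost of dRow * guessedFactor. With two scripts the multiplier becomes 1.1^2 * 0.9 ≈ 1.089, so the push-down then has to win through the reduced output row count instead.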

# Conflicts:
#	integ-test/src/test/resources/expectedOutput/calcite/explain_agg_script_timestamp_push.json
#	integ-test/src/test/resources/expectedOutput/calcite/explain_limit_agg_pushdown2.json
#	opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java
#	opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java
Signed-off-by: Heng Qian <[email protected]>
Signed-off-by: Heng Qian <[email protected]>
# Conflicts:
#	integ-test/src/test/resources/expectedOutput/ppl/explain_patterns_simple_pattern_agg_push.json
Signed-off-by: Heng Qian <[email protected]>
@qianheng-aws (Collaborator Author)

@LantaoJin There are some plan enhancements with this PR:
explain_agg_with_sum_enhancement.yaml
explain_timechart_count.yaml
explain_filter_then_limit_push.yaml
explain_agg_counts_by1
explain_agg_counts_by2
explain_filter_push.yaml
explain_filter_push_compare_date_string.yaml
explain_filter_push_compare_time_string.yaml
explain_filter_push_compare_timestamp_string.yaml

And #4377 is no longer needed after this PR. I've removed the code and kept the YAML test only.

@LantaoJin LantaoJin merged commit 585a04c into opensearch-project:main Sep 28, 2025
33 checks passed
@opensearch-trigger-bot (Contributor)

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4353-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 585a04c5de2dedcb3e7c2abf130846caac4dee50
# Push it to GitHub
git push --set-upstream origin backport/backport-4353-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4353-to-2.19-dev.

qianheng-aws added a commit to qianheng-aws/sql that referenced this pull request Sep 28, 2025
…h-project#4353)

* Make PushDownContext support caching the built request builder

Signed-off-by: Heng Qian <[email protected]>

* Move PushDownContext outside of the AbstractCalciteIndexScan

Signed-off-by: Heng Qian <[email protected]>

* Refactor cost computing

Signed-off-by: Heng Qian <[email protected]>

* migrate explain IT from json format to yaml format

Signed-off-by: Heng Qian <[email protected]>

* Fix opensearch request reuse by equivalent operator

Signed-off-by: Heng Qian <[email protected]>

* fix IT after merging main

Signed-off-by: Heng Qian <[email protected]>

* remove plans assert in execution IT

Signed-off-by: Heng Qian <[email protected]>

* Merging main

Signed-off-by: Heng Qian <[email protected]>

* Add UT for cost evaluation of AbstractCalciteIndexScan

Signed-off-by: Heng Qian <[email protected]>

* merging main

Signed-off-by: Heng Qian <[email protected]>

* merging main

Signed-off-by: Heng Qian <[email protected]>

* merging main

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>

(cherry picked from commit 585a04c)
Signed-off-by: Heng Qian <[email protected]>
@qianheng-aws added the backport-manually (Filed a PR to backport manually) label Sep 28, 2025
yuancu pushed a commit that referenced this pull request Sep 28, 2025
…n context #4353 (#4399)

* Enhance the cost computing mechanism and push down context (#4353)

* Make PushDownContext support caching the built request builder

Signed-off-by: Heng Qian <[email protected]>

* Move PushDownContext outside of the AbstractCalciteIndexScan

Signed-off-by: Heng Qian <[email protected]>

* Refactor cost computing

Signed-off-by: Heng Qian <[email protected]>

* migrate explain IT from json format to yaml format

Signed-off-by: Heng Qian <[email protected]>

* Fix opensearch request reuse by equivalent operator

Signed-off-by: Heng Qian <[email protected]>

* fix IT after merging main

Signed-off-by: Heng Qian <[email protected]>

* remove plans assert in execution IT

Signed-off-by: Heng Qian <[email protected]>

* Merging main

Signed-off-by: Heng Qian <[email protected]>

* Add UT for cost evaluation of AbstractCalciteIndexScan

Signed-off-by: Heng Qian <[email protected]>

* merging main

Signed-off-by: Heng Qian <[email protected]>

* merging main

Signed-off-by: Heng Qian <[email protected]>

* merging main

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>

(cherry picked from commit 585a04c)
Signed-off-by: Heng Qian <[email protected]>

* fix compiling

Signed-off-by: Heng Qian <[email protected]>

* fix IT

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>

Labels

backport 2.19-dev
backport-failed
backport-manually (Filed a PR to backport manually)
calcite (calcite migration related)
enhancement (New feature or request)
pushdown (pushdown related issues)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Enhance cost computing mechanism in AbstractCalciteIndexScan

3 participants