@qianheng-aws
Collaborator

@qianheng-aws qianheng-aws commented Nov 28, 2025

Description

Refactor alias-type field handling by adding a Project operator on top of the scan that defines each alias field.

Related Issues

Resolves #4876

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced processing of alias type fields during query planning and execution.
    • Improved field projection and resolution for aliased columns in SQL queries.
  • Tests

    • Added comprehensive test coverage and validation for alias type field handling in query explain plans.


@coderabbitai
Contributor

coderabbitai bot commented Nov 28, 2025

Walkthrough

Refactors alias type field handling by introducing an AliasFieldsWrappable interface that centralizes alias-to-original field mapping. This replaces scattered alias resolution logic across scan and push-down operations with a unified projection step applied after scanning, requiring changes to schema conversion and multiple scan implementations.
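To make the walkthrough concrete, here is a minimal, dependency-free sketch of what "project alias columns after scanning" means at the row level. It is not the PR's Calcite code: `AliasProjectionSketch` and `projectAliases` are hypothetical names, and rows are modeled as plain maps. The only behavior taken from the discussion is that the scan returns real fields and a later projection adds one column per alias, keyed by alias name and pointing at the original field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical row-level model of "scan first, then project alias columns on top". */
final class AliasProjectionSketch {

  /**
   * Keeps every scanned column and appends one column per alias.
   * aliasMapping: key = alias name, value = original field name (assumed orientation).
   */
  static Map<String, Object> projectAliases(
      Map<String, Object> scannedRow, Map<String, String> aliasMapping) {
    Map<String, Object> out = new LinkedHashMap<>(scannedRow);
    // Each alias column simply re-exposes the value of its original field.
    aliasMapping.forEach((alias, original) -> out.put(alias, scannedRow.get(original)));
    return out;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("original_col", 42);
    System.out.println(projectAliases(row, Map.of("alias_col", "original_col")));
    // prints {original_col=42, alias_col=42}
  }
}
```

In this model the scan never needs to know about aliases, which is the separation of concerns the refactoring aims for.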

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Core Interface & Visitor: core/src/main/java/org/opensearch/sql/calcite/plan/AliasFieldsWrappable.java, core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java | Introduces new AliasFieldsWrappable interface with getAliasMapping() and default wrapProjectForAliasFields(RelBuilder) methods. Updates visitor to wrap scanned RelNodes that implement this interface with alias projections. |
| Scan Implementation: opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java, opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteEnumerableIndexScan.java, opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java | AbstractCalciteIndexScan implements AliasFieldsWrappable and delegates to underlying index. CalciteEnumerableIndexScan uses row type field names directly, removing alias-mapping helper. CalciteLogicalIndexScan removes alias field resolution from pushDownProject. |
| Schema Conversion: core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java | Adds guard in convertSchema to skip alias type fields with original paths present, preventing duplicate schema entries. |
| Tests & Resources: integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java, integ-test/src/test/resources/expectedOutput/calcite/explain_alias_type_field.yaml, integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_alias_type_field.yaml | Adds test infrastructure with new TEST_INDEX_ALIAS import, testAliasTypeField() method, and two YAML expected output snapshots for explain plans with and without push-down optimization. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Files requiring attention: CalciteRelNodeVisitor.java for the new wrapping logic; all three scan classes (AbstractCalciteIndexScan, CalciteEnumerableIndexScan, CalciteLogicalIndexScan) should be reviewed together to understand the unified refactoring pattern and verify consistency in alias handling
  • Logic density: Moderate; the new AliasFieldsWrappable interface default method constructs projections, and removed logic paths across scan classes need verification that functionality is preserved
  • Cross-file coherence: Changes must work together—the removal of alias resolution from scan classes depends on the wrapping mechanism in CalciteRelNodeVisitor functioning correctly

Poem

🐰 Aliases once scattered, now neatly aligned,
A project wraps scans with fields redesigned,
One interface guides them, clear paths to behold,
Where schema and scanners sing stories retold! 🌿✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 14.29% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly reflects the main objective: refactoring alias type fields by inserting a project operator on top of the scan to add aliases, which is the core change across multiple files. |
| Linked Issues check | ✅ Passed | The PR implements all coding objectives from issue #4876: removes alias type fields from schema [OpenSearchTypeFactory.java], introduces AliasFieldsWrappable interface for wrapping scans with project operators [AliasFieldsWrappable.java, CalciteRelNodeVisitor.java, AbstractCalciteIndexScan.java], and removes alias path replacement logic from scan operations [CalciteEnumerableIndexScan.java, CalciteLogicalIndexScan.java]. Tests verify the new behavior. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the stated objective of refactoring alias type fields by adding project operators and removing path replacement logic. No extraneous modifications detected. |


@qianheng-aws added the enhancement (New feature or request), calcite (calcite migration related), and backport 2.19-dev labels Nov 28, 2025
@coderabbitai
Contributor

coderabbitai bot commented Nov 28, 2025

Note

Docstrings generation - SUCCESS
Generated docstrings for this pull request at #4882

coderabbitai bot added a commit that referenced this pull request Nov 28, 2025
Docstrings generation was requested by @qianheng-aws.

* #4881 (comment)

The following files were modified:

* `core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java`
* `core/src/main/java/org/opensearch/sql/calcite/plan/AliasFieldsWrappable.java`
* `core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java`
* `integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java`
* `opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java`
* `opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteEnumerableIndexScan.java`
* `opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java`
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (4)
integ-test/src/test/resources/expectedOutput/calcite/explain_alias_type_field.yaml (1)

8-9: Consider improving readability of the physical plan line using YAML multiline syntax.

Line 9 is extremely long (exceeds 1000 characters) and difficult to review. While YAML snapshot tests often have dense content, consider using YAML's multiline scalar syntax for improved readability during code reviews and maintenance.

As an optional improvement for future readability, the physical plan could be reformatted using YAML's |- multiline scalar indicator:

  physical: |-
    CalciteEnumerableIndexScan(
      table=[[OpenSearch, opensearch-sql_test_index_alias]],
      PushDownContext=[[
        FILTER->>($0, 10),
        AGGREGATION->rel#:LogicalAggregate.NONE.[](input=..., avg(alias_col)=AVG($0)),
        LIMIT->10000
      ],
      OpenSearchRequestBuilder(...)]
    )

Note: Only apply if your snapshot testing framework supports multiline comparison without requiring exact whitespace matching.

core/src/main/java/org/opensearch/sql/calcite/plan/AliasFieldsWrappable.java (1)

16-35: Alias projection helper is sound; consider minor robustness tweaks

Centralizing alias-field projection in wrapProjectForAliasFields(RelBuilder) is a good fit for the new model and keeps scan/pushdown logic clean.

Two optional refinements to make this more robust and avoid unnecessary nodes:

  • Short‑circuit when there are no alias mappings to avoid constructing an empty projectPlus:
    Map<String, String> mapping = getAliasMapping();
    if (mapping.isEmpty()) {
      return relBuilder.peek();
    }
  • Drive the mapping off the node on top of the builder stack rather than this, so future implementers aren’t required to call the method only on the same instance as relBuilder.peek():
    AliasFieldsWrappable top = (AliasFieldsWrappable) relBuilder.peek();
    Set<Entry<String, String>> aliasFieldsSet = top.getAliasMapping().entrySet();

Current usage is safe, so these are nice-to-have hardening steps rather than required fixes.

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

146-205: Relation visitation correctly hooks in alias-field projection

The updated visitRelation:

  • Scans the table as before.
  • If the resulting RelNode implements AliasFieldsWrappable (e.g., AbstractCalciteIndexScan), immediately wraps it with the alias‑projection via wrapProjectForAliasFields.

This is exactly where the alias projection belongs in the Calcite pipeline and ensures later phases see alias columns as ordinary projected fields rather than physical schema columns.

As a small optional optimization, you could avoid building an extra Project when there are no alias mappings:

RelNode scan = context.relBuilder.peek();
if (scan instanceof AliasFieldsWrappable aliasScan
    && !aliasScan.getAliasMapping().isEmpty()) {
  return aliasScan.wrapProjectForAliasFields(context.relBuilder);
}
return scan;

Functionally, the current implementation is fine.
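The hook described in this comment, together with the empty-mapping guard, can be modeled without Calcite as follows. This is only a sketch under stated assumptions: `PlanNode`, `AliasAware`, `AliasProject`, `IndexScan`, and `afterScan` are hypothetical stand-ins for `RelNode`, `AliasFieldsWrappable`, a Calcite project, the index scan, and the tail of `visitRelation`. The real method takes a `RelBuilder`, which is omitted here.

```java
import java.util.Map;

/** Hypothetical plan-level model; the real code works on Calcite RelNode and RelBuilder. */
final class AliasWrapSketch {

  interface PlanNode {}

  /** Stand-in for the AliasFieldsWrappable idea. */
  interface AliasAware extends PlanNode {
    /** Assumed orientation: key = alias name, value = original field. */
    Map<String, String> getAliasMapping();

    /** Wraps this node in a projection, or returns it unchanged when nothing is aliased. */
    default PlanNode wrapWithAliasProject() {
      Map<String, String> mapping = getAliasMapping();
      return mapping.isEmpty() ? this : new AliasProject(this, mapping);
    }
  }

  record AliasProject(PlanNode input, Map<String, String> aliases) implements PlanNode {}

  record IndexScan(String index, Map<String, String> aliases) implements AliasAware {
    @Override
    public Map<String, String> getAliasMapping() {
      return aliases;
    }
  }

  /** Models the end of relation visitation: only alias-aware nodes get wrapped. */
  static PlanNode afterScan(PlanNode scan) {
    if (scan instanceof AliasAware aliasScan) {
      return aliasScan.wrapWithAliasProject();
    }
    return scan;
  }

  public static void main(String[] args) {
    System.out.println(afterScan(new IndexScan("plain_index", Map.of())));
    System.out.println(
        afterScan(new IndexScan("alias_index", Map.of("alias_col", "original_col"))));
  }
}
```

Putting the guard inside the default method, rather than at the call site, keeps every future caller from building an empty projection. Whether that placement suits the real interface is a design choice for the maintainers.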

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java (1)

11-15: Base scan’s AliasFieldsWrappable implementation is minimal and well-placed

Having AbstractCalciteIndexScan implement AliasFieldsWrappable and delegate

@Override
public Map<String, String> getAliasMapping() {
  return osIndex.getAliasMapping();
}

to OpenSearchIndex cleanly centralizes alias metadata at the storage layer while letting the planning layer (via AliasFieldsWrappable.wrapProjectForAliasFields) build the corresponding projections. For indices without alias-type fields an empty map will result in at most an identity project, so this is safe across the board.

If you later add more consumers of alias metadata, consider adding brief Javadoc on getAliasMapping()’s contract (e.g., “key = alias name, value = original field path”) to prevent accidental inversion of the mapping.

Also applies to: 47-52, 64-91, 257-260
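One way to act on the Javadoc suggestion above is sketched below. The Javadoc wording and the `checkOrientation` helper are illustrative assumptions, not code from the PR. The helper also assumes that original paths appear verbatim among the scan's field names, which may not hold for nested paths.

```java
import java.util.List;
import java.util.Map;

final class AliasMappingContractSketch {

  /** Hypothetical mirror of the accessor discussed above, with the contract spelled out. */
  interface AliasMappingSource {
    /**
     * Returns alias metadata for the index.
     *
     * @return map whose key is the alias field name and whose value is the original field path;
     *     empty when the index has no alias-type fields
     */
    Map<String, String> getAliasMapping();
  }

  /** Fails fast if a mapping looks inverted relative to the alias-free scan field names. */
  static void checkOrientation(Map<String, String> mapping, List<String> scanFields) {
    mapping.forEach((alias, original) -> {
      if (!scanFields.contains(original)) {
        throw new IllegalArgumentException("original path not in scan schema: " + original);
      }
      if (scanFields.contains(alias)) {
        throw new IllegalArgumentException("alias already present in scan schema: " + alias);
      }
    });
  }

  public static void main(String[] args) {
    checkOrientation(Map.of("alias_col", "original_col"), List.of("original_col"));
    System.out.println("mapping orientation looks consistent");
  }
}
```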

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2f64c40 and aeb055c.

📒 Files selected for processing (9)
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (2 hunks)
  • core/src/main/java/org/opensearch/sql/calcite/plan/AliasFieldsWrappable.java (1 hunks)
  • core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1 hunks)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (3 hunks)
  • integ-test/src/test/resources/expectedOutput/calcite/explain_alias_type_field.yaml (1 hunks)
  • integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_alias_type_field.yaml (1 hunks)
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/AbstractCalciteIndexScan.java (4 hunks)
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteEnumerableIndexScan.java (1 hunks)
  • opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/legacy/TestsConstants.java (1)
  • TestsConstants (9-101)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (28)
  • GitHub Check: security-it-linux (21)
  • GitHub Check: security-it-linux (25)
  • GitHub Check: build-linux (21, integration)
  • GitHub Check: build-linux (25, integration)
  • GitHub Check: build-linux (21, unit)
  • GitHub Check: build-linux (25, doc)
  • GitHub Check: build-linux (25, unit)
  • GitHub Check: build-linux (21, doc)
  • GitHub Check: bwc-tests-full-restart (21)
  • GitHub Check: bwc-tests-full-restart (25)
  • GitHub Check: bwc-tests-rolling-upgrade (25)
  • GitHub Check: bwc-tests-rolling-upgrade (21)
  • GitHub Check: build-windows-macos (macos-14, 25, doc)
  • GitHub Check: build-windows-macos (macos-14, 21, doc)
  • GitHub Check: build-windows-macos (macos-14, 25, integration)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (macos-14, 25, unit)
  • GitHub Check: build-windows-macos (macos-14, 21, integration)
  • GitHub Check: build-windows-macos (macos-14, 21, unit)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, integration)
  • GitHub Check: build-windows-macos (windows-latest, 21, -PbuildPlatform=windows, unit)
  • GitHub Check: build-windows-macos (windows-latest, 25, -PbuildPlatform=windows, unit)
  • GitHub Check: security-it-windows-macos (macos-14, 25)
  • GitHub Check: security-it-windows-macos (macos-14, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 21)
  • GitHub Check: security-it-windows-macos (windows-latest, 25)
  • GitHub Check: CodeQL-Scan (java)
  • GitHub Check: WhiteSource Security Check
🔇 Additional comments (7)
integ-test/src/test/resources/expectedOutput/calcite/explain_alias_type_field.yaml (2)

2-7: Logical plan correctly demonstrates alias field introduction via Project operator.

The LogicalProject operator at line 6 (LogicalProject(alias_col=[$0])) is properly positioned before filters and aggregations, aligning with the refactoring objective to move alias resolution logic from Scan to a dedicated Project operator. The field reference $0 flows through the filter and aggregation correctly.


8-9: Manual verification required for test integration points.

The physical plan correctly maps the alias field to the original field in pushdown operations (range query on "original_col" and avg aggregation on "original_col"). However, the following integration points require verification that could not be completed:

  1. Confirm that the test index opensearch-sql_test_index_alias exists in test setup and has the correct alias field mappings.
  2. Verify that the testAliasTypeField() method in CalciteExplainIT generates this exact output.
  3. Ensure the test file is loaded and assertions are properly configured.
core/src/main/java/org/opensearch/sql/calcite/utils/OpenSearchTypeFactory.java (1)

310-321: Skipping alias-type fields in schema aligns with new alias projection model

Filtering out fields whose ExprType exposes an originalPath before building fieldNameList/typeList is consistent with moving alias handling into a separate projection and avoids alias columns leaking into the base scan schema. Given convertExprTypeToRelDataType already delegates alias types to getOriginalExprType(), the two behaviors are coherent and should keep downstream code working as expected.
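The guard described here can be pictured with a small stand-alone model. `FieldType` and `scanFieldNames` are hypothetical; the only detail borrowed from the comment is that a field exposing an original path is treated as an alias and left out of the scan schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class SchemaGuardSketch {

  /** Hypothetical field type; a present originalPath marks an alias-type field. */
  record FieldType(String typeName, Optional<String> originalPath) {}

  /** Returns the field names kept in the scan schema, skipping alias-type fields. */
  static List<String> scanFieldNames(Map<String, FieldType> mapping) {
    return mapping.entrySet().stream()
        .filter(e -> e.getValue().originalPath().isEmpty()) // drop aliases
        .map(Map.Entry::getKey)
        .toList();
  }

  public static void main(String[] args) {
    Map<String, FieldType> mapping = new LinkedHashMap<>();
    mapping.put("original_col", new FieldType("long", Optional.empty()));
    mapping.put("alias_col", new FieldType("alias", Optional.of("original_col")));
    System.out.println(scanFieldNames(mapping));
    // prints [original_col]
  }
}
```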

integ-test/src/test/resources/expectedOutput/calcite_no_pushdown/explain_alias_type_field.yaml (1)

1-13: Alias-field explain fixture matches the intended logical/physical shape

The expected plan correctly shows LogicalProject(alias_col=[$0]) over CalciteLogicalIndexScan and corresponding Enumerable operators, which is consistent with the refactored alias-field handling. No issues from a test-fixture standpoint.

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteEnumerableIndexScan.java (1)

111-123: Using row type field names for the enumerator is consistent with centralized alias handling

Passing getRowType().getFieldNames() into OpenSearchIndexEnumerator matches the new model where alias columns are no longer part of the scan schema and are added by a separate projection. This avoids duplicating alias‑resolution logic in the enumerator and keeps its view aligned with the actual scan row type.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)

8-10: Alias-type explain IT wiring looks correct and complete

Bringing in TEST_INDEX_ALIAS, loading Index.DATA_TYPE_ALIAS during init(), and adding testAliasTypeField() that explains a query over alias_col against TEST_INDEX_ALIAS together give good end‑to‑end coverage of the new alias-field behavior. The test’s expected plan filename matches the newly added YAML resource, so this should reliably guard regressions around alias projection and pushdown.

Also applies to: 36-47, 1959-1968

opensearch/src/main/java/org/opensearch/sql/opensearch/storage/scan/CalciteLogicalIndexScan.java (1)

240-281: Project pushdown now correctly uses the scan schema’s field names

In pushDownProject, using newSchema.getFieldNames().stream() in the OSRequestBuilderAction is appropriate now that alias handling has moved out of the scan:

requestBuilder -> requestBuilder.pushDownProjectStream(newSchema.getFieldNames().stream());

newSchema is built as a subset of the scan’s row type (which no longer includes alias-type fields), so this pushes down only real OpenSearch fields and leaves alias columns to be handled by the separate projection layer. The aggregate‑pushed branch remains a no‑op on the request builder, so existing agg pushdown semantics are preserved.
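The following sketch illustrates why the pushed-down field list ends up containing only real fields once aliases live in a projection. It is a simplified model, not Calcite's rule machinery: `ProjectedColumn` and `pushedDownFields` are hypothetical. The assumption is that each projected column refers to a scan field by index, so selecting an alias resolves to the original field before anything reaches the request builder.

```java
import java.util.List;

final class PushdownResolutionSketch {

  /** Hypothetical projected column: an output name plus the scan field index it reads. */
  record ProjectedColumn(String outputName, int inputIndex) {}

  /** Resolves selected output columns to the distinct scan fields they actually read. */
  static List<String> pushedDownFields(
      List<String> scanFields, List<ProjectedColumn> aliasProject, List<String> selected) {
    return selected.stream()
        .map(name -> aliasProject.stream()
            .filter(c -> c.outputName().equals(name))
            .findFirst()
            .orElseThrow(() -> new IllegalArgumentException("unknown column: " + name)))
        .map(c -> scanFields.get(c.inputIndex())) // alias and original share one index
        .distinct()
        .toList();
  }

  public static void main(String[] args) {
    List<String> scanFields = List.of("original_col", "other_col");
    List<ProjectedColumn> aliasProject = List.of(
        new ProjectedColumn("original_col", 0),
        new ProjectedColumn("other_col", 1),
        new ProjectedColumn("alias_col", 0));
    System.out.println(pushedDownFields(scanFields, aliasProject, List.of("alias_col")));
    // prints [original_col]
  }
}
```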

@yuancu yuancu merged commit 6f7eae0 into opensearch-project:main Nov 28, 2025
42 of 43 checks passed
@opensearch-trigger-bot
Contributor

The backport to 2.19-dev failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/sql/backport-2.19-dev 2.19-dev
# Navigate to the new working tree
pushd ../.worktrees/sql/backport-2.19-dev
# Create a new branch
git switch --create backport/backport-4881-to-2.19-dev
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6f7eae0f04f8d05b087a8b1014799faae3a44479
# Push it to GitHub
git push --set-upstream origin backport/backport-4881-to-2.19-dev
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/sql/backport-2.19-dev

Then, create a pull request where the base branch is 2.19-dev and the compare/head branch is backport/backport-4881-to-2.19-dev.

qianheng-aws added a commit to qianheng-aws/sql that referenced this pull request Nov 28, 2025
LantaoJin pushed a commit that referenced this pull request Nov 28, 2025
…ct with alias (#4881) (#4883)

* Refactor alias type field by adding another project with alias (#4881)

Signed-off-by: Heng Qian <[email protected]>

(cherry picked from commit 6f7eae0)
Signed-off-by: Heng Qian <[email protected]>

* Fix compiling

Signed-off-by: Heng Qian <[email protected]>

---------

Signed-off-by: Heng Qian <[email protected]>
@LantaoJin added the backport-manually (Filed a PR to backport manually.) label Nov 28, 2025
asifabashar pushed a commit to asifabashar/sql that referenced this pull request Dec 10, 2025

Labels

backport 2.19-dev, backport-failed, backport-manually (Filed a PR to backport manually.), calcite (calcite migration related), enhancement (New feature or request)

Development

Successfully merging this pull request may close these issues.

[FEATURE] Alias type field refactoring by adding project upon scan

3 participants