@songkant-aws
Contributor

@songkant-aws songkant-aws commented Jan 26, 2026

Description

Fix the bug discovered in #5054. See root cause description in #5054 (comment)

Related Issues

Resolves #5054

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Songkan Tang <songkant@amazon.com>
@coderabbitai
Contributor

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Improved boolean field filtering in queries with optimized pushdown to reduce query execution overhead.
    • Enhanced handling of boolean field comparisons and operators (NOT, IS_TRUE) for better query optimization.
  • Tests

    • Added comprehensive integration tests validating boolean field filtering with various conditions and data types.

Walkthrough

This PR adds support for pushing down boolean field equality comparisons as term queries within filter predicates, enabling query optimization when filters combine boolean field conditions with other operators like query_string.
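As a rough illustration of the DSL shape described above, the sketch below assembles a `bool` query whose `must` clauses hold a `query_string` condition and a boolean `term` filter. It builds JSON by hand and is not the plugin's code: the class `PushdownDslSketch`, its helper names, and the `url:http` query text are assumptions made for the example, and the real payload may carry extra attributes or use a different clause layout.

```java
import java.util.List;

/** Illustrative only: hand-built JSON for a pushed-down boolean filter. */
class PushdownDslSketch {

  // Term query on a boolean field, e.g. {"term":{"is_internal":{"value":true}}}
  static String term(String field, boolean value) {
    return "{\"term\":{\"" + field + "\":{\"value\":" + value + "}}}";
  }

  // query_string clause; the query text is assumed to need no JSON escaping.
  static String queryString(String query) {
    return "{\"query_string\":{\"query\":\"" + query + "\"}}";
  }

  // Combine clauses under bool.must, mirroring the AND case in the walkthrough.
  static String boolMust(List<String> clauses) {
    return "{\"bool\":{\"must\":[" + String.join(",", clauses) + "]}}";
  }

  public static void main(String[] args) {
    // Hypothetical inputs: a full-text condition plus is_internal = true.
    String dsl = boolMust(List.of(queryString("url:http"), term("is_internal", true)));
    System.out.println(dsl);
  }
}
```

The point of the shape is that the boolean condition travels to the index as a plain term filter next to the `query_string` clause instead of being evaluated after the fact.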

Changes

| Cohort / File(s) | Summary |
|---|---|
| Boolean field pushdown logic: opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java | Added boolean field detection and conversion logic. Direct boolean NamedFieldExpression now converts to term queries (true/false). Enhanced NOT and IS_TRUE handling for boolean fields with isBooleanType() checks. Added isFalse() helper to SimpleQueryExpression for false-term query generation. |
| PredicateAnalyzer unit tests: opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java | Introduced new boolean field "e" in test schema. Added two test cases: isTrue_booleanField_generatesTermQuery and isTrue_booleanFieldCombinedWithOtherCondition_generatesCompoundQuery validating term query generation for IS_TRUE on boolean fields. |
| Calcite integration tests: integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java | Added four public test methods: testFilterQueryStringWithBooleanFieldPushDown, testFilterBooleanFieldWithTRUE, testFilterBooleanFieldWithStringLiteral, and testFilterBooleanFieldFalse. Each validates expected Calcite explain plans with boolean field pushdown enabled. |
| Calcite expected output files: integ-test/src/test/resources/expectedOutput/calcite/explain_filter_query_string_with_boolean.yaml, integ-test/src/test/resources/expectedOutput/calcite/explain_filter_query_string_with_boolean_false.yaml | New YAML resource files defining expected Calcite logical and physical plans, including OpenSearchRequestBuilder payloads with pushed-down boolean term filters combined with query_string conditions. |
| YAML REST integration test: integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml | New integration test fixture that creates a boolean-field index, bulk-inserts documents, executes a filter query on is_internal=true, and validates two matching results with plugin lifecycle control. |
| AggregateAnalyzer test updates: opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java | Updated analyze_aggCall_complexScriptFilter test expectation: filter_bool_count DSL changed from script-based filter to term query filter on field d with value true. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant PredicateAnalyzer
    participant NamedFieldExpression
    participant SimpleQueryExpression
    participant DSLGenerator

    Client->>PredicateAnalyzer: analyzeExpression(filter with boolean field)
    PredicateAnalyzer->>NamedFieldExpression: isBooleanType() check
    NamedFieldExpression-->>PredicateAnalyzer: true (boolean field detected)
    PredicateAnalyzer->>SimpleQueryExpression: isTrue() / isFalse()
    SimpleQueryExpression->>DSLGenerator: create TermQuery(field, true/false)
    DSLGenerator-->>SimpleQueryExpression: TermQuery object
    SimpleQueryExpression-->>PredicateAnalyzer: QueryExpression with term
    PredicateAnalyzer->>DSLGenerator: combine with other filters (AND/OR)
    DSLGenerator-->>PredicateAnalyzer: BoolQuery with must clauses
    PredicateAnalyzer-->>Client: optimized DSL with pushed-down boolean term
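
The rewrite rules in the diagram can be sketched on a tiny expression model. Everything here is hypothetical: `BooleanRewriteSketch`, `Field`, `IsTrue`, `Not`, and `toDsl` are stand-ins invented for the example and are not the classes in PredicateAnalyzer. The sketch only mirrors the three cases the summary names (a bare boolean field, IS_TRUE on a boolean field, and NOT(IS_TRUE) on a boolean field) and returns an empty result where a real analyzer would fall back to another strategy.

```java
import java.util.Optional;

/** Hypothetical mini-model of the boolean rewrite rules; not the plugin's types. */
class BooleanRewriteSketch {
  interface Expr {}
  record Field(String name, boolean isBoolean) implements Expr {}
  record IsTrue(Expr arg) implements Expr {}
  record Not(Expr arg) implements Expr {}

  static String term(String field, boolean value) {
    return "{\"term\":{\"" + field + "\":{\"value\":" + value + "}}}";
  }

  // Returns a term query for the supported shapes, or empty when this sketch
  // has no rule (a real analyzer would try other pushdown paths or a script).
  static Optional<String> toDsl(Expr e) {
    // Bare boolean field used as a predicate: term(field, true).
    if (e instanceof Field f && f.isBoolean()) {
      return Optional.of(term(f.name(), true));
    }
    // IS_TRUE(boolean_field): term(field, true).
    if (e instanceof IsTrue t && t.arg() instanceof Field f && f.isBoolean()) {
      return Optional.of(term(f.name(), true));
    }
    // NOT(IS_TRUE(boolean_field)): term(field, false), as in the reviewed diff.
    if (e instanceof Not n && n.arg() instanceof IsTrue t
        && t.arg() instanceof Field f && f.isBoolean()) {
      return Optional.of(term(f.name(), false));
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Field isInternal = new Field("is_internal", true);
    System.out.println(toDsl(isInternal));
    System.out.println(toDsl(new Not(new IsTrue(isInternal))));
    System.out.println(toDsl(new Field("name", false)));
  }
}
```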

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

bugFix, PPL, testing

Suggested reviewers

  • penghuo
  • ps48
  • derek-ho
  • joshuali925
  • anirudha
  • dai-chen
  • Swiddis
  • Yury-Fridlyand
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 8.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title describes a bug fix for boolean comparison conditions being simplified to fields, which aligns with the changeset's focus on handling boolean field pushdown and comparison operators. |
| Description check | ✅ Passed | The description references issue #5054 and its root cause, relates to boolean field comparison bug fixes, and documents testing checklist items completed—all directly related to the changeset. |


Signed-off-by: Songkan Tang <songkant@amazon.com>
@penghuo penghuo added bugFix PPL Piped processing language labels Jan 26, 2026
Content-Type: 'application/json'
ppl:
body:
query: source=test-boolean | where is_internal=true | fields name
Collaborator

Use the failing query from #5054 here: source=test url=http | where is_internal=true

Comment on lines +582 to +586
// Handle NOT(IS_TRUE(boolean_field)) - convert to term query with false value
// This covers cases where IS_TRUE was explicitly applied
if (expr instanceof SimpleQueryExpression simpleExpr && simpleExpr.isBooleanFieldIsTrue()) {
  return QueryExpression.create(simpleExpr.rel).isFalse();
}
Collaborator

  • (NOT boolean_field = true) matches every document whose value is not true, including documents where the field is null or missing,
  • but boolean_field=false only returns documents where the field has the value false.
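
A small sketch of the gap between the two predicates, under the assumption that negation is evaluated as "does not match a term query for true". A nullable `Boolean` stands in for a document's field value, with `null` covering both null and missing; `NullSemanticsSketch` and its method names are made up for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: compares "term false" with "not (term true)" on nullable values. */
class NullSemanticsSketch {

  // Behaves like a term query for false: only an explicit false matches.
  static List<Boolean> termFalse(List<Boolean> values) {
    return values.stream().filter(Boolean.FALSE::equals).collect(Collectors.toList());
  }

  // Behaves like negating a term query for true: anything that is not an
  // explicit true matches, including null (standing in for null or missing).
  static List<Boolean> notTermTrue(List<Boolean> values) {
    return values.stream().filter(v -> !Boolean.TRUE.equals(v)).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Boolean> docs = Arrays.asList(true, false, null);
    System.out.println(termFalse(docs));   // prints [false]
    System.out.println(notTermTrue(docs)); // prints [false, null]
  }
}
```

If the two should stay equivalent, rewriting the negated form to a false-term query would drop the null and missing documents, which is the concern raised here.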

// generate a term query with value true.
// When called on an already-evaluated predicate (builder already set),
// return as-is.
if (builder == null) {
Collaborator

Is it possible to override the isTrue and not APIs on NamedFieldExpression instead of changing SimpleQueryExpression?

Labels

bugFix PPL Piped processing language

Development

Successfully merging this pull request may close these issues.

[BUG] PPL where command does not work as expected.

3 participants