Skip to content

Conversation

@bugmakerrrrrr
Copy link
Contributor

@bugmakerrrrrr bugmakerrrrrr commented Dec 19, 2025

Description

This commit takes advantage of doc values skipper to check the range of date field when rewriting query on the shard level

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • New Features

    • Added doc values skipping optimization for date field queries to improve range query performance.
  • Tests

    • Updated test coverage to validate doc values skipping functionality for date field type variants.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 19, 2025

Walkthrough

Introduces hasSkiplist boolean parameter to DateFieldType constructor to enable skiplist-based doc value optimization. Updates isFieldWithinQuery method to leverage DocValuesSkipper when skiplist is enabled, and propagates the parameter across implementation and test files.

Changes

Cohort / File(s) Summary
Core implementation
server/src/main/java/org/opensearch/index/mapper/DateFieldMapper.java
Added hasSkiplist field to DateFieldType with corresponding constructor parameter. Refactored isFieldWithinQuery to derive min/max bounds from PointValues (if searchable) or from aggregated DocValuesSkipper across leaf contexts (if doc values with skiplist enabled). Updated all constructors and builder to pass hasSkiplist value.
Core tests
server/src/test/java/org/opensearch/index/mapper/DateFieldTypeTests.java
Added parameterized test variants for doc-value skipper (using SortedNumericDocValuesField). Renamed existing tests to indicate point-index usage. Refactored isFieldWithinRangeTestCase to accept generic BiFunction<String, Long, IndexableField> fieldBuilder. Updated constructor calls to include new hasSkiplist parameter.
Aggregation tests
server/src/test/java/org/opensearch/search/aggregations/AggregatorBaseTests.java, server/src/test/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java, server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregatorTestCase.java, server/src/test/java/org/opensearch/search/aggregations/bucket/range/DateRangeAggregatorTests.java, server/src/test/java/org/opensearch/search/aggregations/bucket/range/RangeAggregatorTests.java, server/src/test/java/org/opensearch/search/aggregations/bucket/terms/MultiTermsAggregatorTests.java
Updated DateFieldType constructor invocations to include new hasSkiplist parameter (passed as false). Minor changes to constructor argument order in some files.

Sequence Diagram

sequenceDiagram
    participant Query as Query Executor
    participant ISWQ as isFieldWithinQuery
    participant PR as PointValues Reader
    participant DVS as DocValuesSkipper
    participant Leaf as LeafReaderContext
    
    Query->>ISWQ: isFieldWithinQuery(IndexReader, bounds)
    alt Field is Searchable
        ISWQ->>PR: extract min/max from PointValues
        PR-->>ISWQ: minValue, maxValue
        ISWQ->>ISWQ: compare with query range
    else Has DocValues with Skiplist
        ISWQ->>Leaf: iterate leaf contexts
        loop For Each Leaf
            Leaf->>DVS: get skipper stats
            DVS-->>Leaf: min, max per leaf
            Leaf->>ISWQ: aggregate global bounds
        end
        ISWQ->>ISWQ: check global min vs max
        alt Global min > max
            ISWQ-->>Query: DISJOINT
        else
            ISWQ->>ISWQ: compare with query range
        end
    else No Skiplist Data
        ISWQ-->>Query: INTERSECTS (conservative)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Primary logic changes: The isFieldWithinQuery method refactor with branching logic for point-based vs. doc-value-based bounds extraction requires careful review of correctness and edge cases (particularly aggregation across leaf contexts and min/max comparisons).
  • Constructor signature propagation: While the hasSkiplist parameter addition is straightforward, it propagates across 6+ test files with similar but distinct contexts (aggregation tests), requiring verification that each instantiation uses the correct value.
  • Test parameterization: New test variants in DateFieldTypeTests.java using generic field builder patterns need careful review to ensure both point-index and doc-value-skipper paths are properly tested.

Possibly related PRs

Suggested labels

Search:Aggregations, Search:Performance, v3.4.0, backport 3.4

Suggested reviewers

  • jainankitk
  • msfroh
  • andrross
  • dbwiddis
  • kotwanikunal
  • sachinpkale
  • shwetathareja
  • anasalkouz

Poem

🐰 Skiplists hop through doc values bright,
Bounds dance between points and leaves in flight,
Min and max now query swift and lean,
Aggregations bloom where skippers have been! 🌿✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: utilizing doc values skipper for date field range checking.
Description check ✅ Passed The description explains the core functionality and the PR includes tests, API changes, and documentation updates as indicated by checked items.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 01b48c3 and bda04be.

📒 Files selected for processing (8)
  • server/src/main/java/org/opensearch/index/mapper/DateFieldMapper.java (4 hunks)
  • server/src/test/java/org/opensearch/index/mapper/DateFieldTypeTests.java (12 hunks)
  • server/src/test/java/org/opensearch/search/aggregations/AggregatorBaseTests.java (1 hunks)
  • server/src/test/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java (1 hunks)
  • server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregatorTestCase.java (1 hunks)
  • server/src/test/java/org/opensearch/search/aggregations/bucket/range/DateRangeAggregatorTests.java (1 hunks)
  • server/src/test/java/org/opensearch/search/aggregations/bucket/range/RangeAggregatorTests.java (2 hunks)
  • server/src/test/java/org/opensearch/search/aggregations/bucket/terms/MultiTermsAggregatorTests.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/mapper/DateFieldTypeTests.java (1)
server/src/main/java/org/opensearch/index/mapper/DateFieldMapper.java (1)
  • DateFieldMapper (100-926)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
🔇 Additional comments (19)
server/src/main/java/org/opensearch/index/mapper/DateFieldMapper.java (4)

39-42: LGTM!

The new imports for DocValuesSkipper and LeafReaderContext are necessary for the new skiplist-based range checking functionality.


417-451: LGTM!

The hasSkiplist field is properly declared and propagated through all constructors. The convenience constructors correctly default hasSkiplist to false, maintaining backward compatibility.


673-709: Well-structured range checking logic.

The implementation correctly handles multiple scenarios:

  1. Searchable fields use PointValues for efficient min/max lookup
  2. Non-searchable fields with doc values and skiplist aggregate global bounds across all leaf contexts
  3. Conservative fallback to INTERSECTS when any segment lacks a skipper prevents false negatives

The edge case at lines 702-704 correctly identifies an empty index by checking globalMin > globalMax (since they're initialized to opposite extremes).


374-385: LGTM!

The builder correctly propagates skiplist.getValue() to the DateFieldType constructor.

server/src/test/java/org/opensearch/search/aggregations/bucket/range/RangeAggregatorTests.java (2)

157-168: LGTM!

The false value for hasSkiplist correctly aligns with the expanded DateFieldType constructor signature.


188-200: LGTM!

Constructor parameter update is consistent with the new API.

server/src/test/java/org/opensearch/search/aggregations/bucket/terms/MultiTermsAggregatorTests.java (1)

1010-1022: LGTM!

The dateFieldType helper is correctly updated with the new hasSkiplist parameter set to false.

server/src/test/java/org/opensearch/search/aggregations/bucket/filterrewrite/FilterRewriteSubAggTests.java (1)

746-758: LGTM!

The aggregableDateFieldType helper correctly includes the new hasSkiplist parameter with false as the default value.

server/src/test/java/org/opensearch/search/aggregations/bucket/histogram/DateHistogramAggregatorTestCase.java (1)

122-134: LGTM!

The aggregableDateFieldType helper is correctly updated with the new hasSkiplist parameter defaulting to false.

server/src/test/java/org/opensearch/search/aggregations/AggregatorBaseTests.java (1)

134-145: LGTM!

The constructor call is correctly updated to include the new hasSkiplist parameter (set to false) before the DateTimeFormatter. This maintains the existing test behavior while aligning with the updated DateFieldType API.

server/src/test/java/org/opensearch/search/aggregations/bucket/range/DateRangeAggregatorTests.java (1)

271-281: LGTM!

The DateFieldType constructor call is correctly updated with the new hasSkiplist parameter set to false, maintaining the existing test behavior while conforming to the updated API signature.

server/src/test/java/org/opensearch/index/mapper/DateFieldTypeTests.java (8)

93-96: LGTM!

The new imports are well-organized: BiFunction supports the parameterized field builder pattern, and the static import of getDefaultDateTimeFormatter() improves readability throughout the test file.


112-150: Good test coverage for both indexing strategies.

The test structure is well-designed:

  • Existing tests renamed with PointIndex suffix clarify they test point-index-based range checking
  • New DVSkipper tests correctly configure hasSkiplist=true with indexed=false and hasDocValues=true to exercise the doc values skipper path
  • Both MILLISECONDS and NANOSECONDS resolutions are covered for each strategy

152-180: Well-designed parameterization.

The BiFunction<String, Long, IndexableField> parameter cleanly abstracts the field creation, allowing the same test logic to validate range checking for both LongPoint-based and SortedNumericDocValuesField-based indexing strategies without code duplication.


278-291: Correct unsearchable field configuration.

The constructor correctly sets hasSkiplist=false alongside hasDocValues=false, which is semantically correct since a skip list requires doc values to be enabled.


464-467: LGTM!

The helper method correctly propagates hasSkiplist=false to maintain existing test behavior.


513-523: LGTM!

Constructor correctly updated with hasSkiplist=false for the date resolution overflow test.


612-622: LGTM!

Constructor correctly updated with hasSkiplist=false for the null value handling test.


631-642: LGTM!

Constructor correctly updated with hasSkiplist=false for the date field type with nulls test.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

❌ Gradle check result for bda04be: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant