
Conversation


@RS146BIJAY RS146BIJAY commented Dec 2, 2025

Description

Fixes an indexing regression and several bugs in the grouping criteria logic. To test the grouping criteria changes, grouping criteria were enabled locally and verified by setting criteria. Will raise the changes for integ test enablement for CAS in a separate PR, as that requires substantial changes to the integ tests as well.

Related Issues

#19919

Check List

  • Functionality includes testing.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed a bulk indexing regression by removing an obsolete retry path and related exception handling.
  • Improvements

    • Better write accounting and stability for composite index writes via explicit document-count tracking.
    • Narrowed composite grouping detection and improved codec selection for context-aware indexing.
    • Added no-op hooks for context-aware grouping field mapping.
  • Tests

    • Removed deprecated tests and added extensive coverage for RAM/flush, tragic-exception, and concurrency scenarios.
  • Documentation

    • Updated changelog.



coderabbitai bot commented Dec 2, 2025

📝 Walkthrough

Walkthrough

Removed a custom lookup-lock exception and its retry handling; added size-aware multi-document write APIs and child-writer pending-doc accounting; hardened RAM/flush/tragic-exception handling and codec selection; narrowed composite-field-type filtering; updated and extended tests; changelog text updated.

Changes

Cohort / File(s) Summary
Exception removal & serialization
server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java, server/src/main/java/org/opensearch/OpenSearchServerException.java, server/src/test/java/org/opensearch/ExceptionSerializationTests.java
Deleted LookupMapLockAcquisitionException; removed its registration, mappings, imports, and test IDs.
Bulk action retry logic & tests
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java, server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
Removed retry branch and helper for the deleted exception and deleted related tests.
CompositeIndexWriter — accounting & metrics
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
Added childWriterPendingNumDocs (AtomicLong); propagate size-aware increments across add/soft-update/delete flows; added getFlushingBytesUtil/ramBytesUsedUtil with tragic/AlreadyClosedException handling; acquireNewReadLock() test hook; refresh subtracts old writer pending counts.
Document writer API surface
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java, server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java, server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
Multi-document APIs now accept an int size for addDocuments and softUpdateDocuments; implementations updated to accept and pass counts (some delegations currently ignore the size).
Engine call-sites & ingestion
server/src/main/java/org/opensearch/index/engine/InternalEngine.java, server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
Updated multi-document call sites to pass document counts (docs.size()); single-doc paths unchanged.
Native codec / IndexWriter config
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
IndexWriterConfig now uses CriteriaBasedCodec(...) when context-aware mode is enabled; otherwise uses base codec.
Mapper & field-type filtering
server/src/main/java/org/opensearch/index/mapper/MapperService.java, server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
getCompositeFieldTypes() now filters to CompositeDataCubeFieldType; added no-op canDeriveSource() and deriveSource(...) overrides to ContextAwareGroupingFieldMapper.
Tests & test utilities
server/src/test/java/org/opensearch/index/engine/*, server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java, server/src/test/java/org/opensearch/action/bulk/*
Removed tests tied to deleted exception; added extensive tragic-exception RAM/flush/get tests, delete-concurrency test, FlushingIndexWriterFactory test utility and EngineConfig overload; updated many tests to pass doc counts.
Misc test/import cleanup
test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java, others
Removed unused/static imports and minor test cleanups.
Changelog
CHANGELOG.md
Added an Unreleased 3.x changelog line referencing indexing regression and grouping criteria fixes (text-only).

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant TransportShardBulkAction
  participant CompositeIndexWriter
  participant CriteriaBasedIndexWriterLookup
  participant ChildIndexWriter

  Client->>TransportShardBulkAction: submit bulk request
  TransportShardBulkAction->>CompositeIndexWriter: index document(s)
  CompositeIndexWriter->>CriteriaBasedIndexWriterLookup: try to acquire read lock / lookup map
  alt lookup map unavailable
    CriteriaBasedIndexWriterLookup-->>CompositeIndexWriter: null/closed
    CompositeIndexWriter->>CompositeIndexWriter: handle closed map (no lookup-lock exception retry)
  else map acquired
    CriteriaBasedIndexWriterLookup-->>CompositeIndexWriter: LiveIndexWriterDeletesMap
    CompositeIndexWriter->>ChildIndexWriter: addDocument(s)/softUpdateDocuments(with size)
    ChildIndexWriter-->>CompositeIndexWriter: success or tragic exception
    CompositeIndexWriter->>CompositeIndexWriter: update childWriterPendingNumDocs and metrics
  end
  CompositeIndexWriter-->>TransportShardBulkAction: response
  TransportShardBulkAction-->>Client: bulk response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

bug, Indexing

Suggested reviewers

  • msfroh
  • andrross
  • mch2
  • dbwiddis
  • cwperks
  • sachinpkale
  • kotwanikunal
  • shwetathareja
  • Rishikesh1159
  • jed326

Poem

🐇 I hopped through maps both old and new,
Counts tucked under paws for every queue,
Locks removed — I skipped the retry hop,
RAM and flushes guarded when errors drop,
A little hop: "Indexing — steady, don't stop!"

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 8.80%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): The title accurately reflects the main objective of the PR, which is fixing an indexing regression and addressing bug fixes for grouping criteria functionality.
  • Description check (✅ Passed): The PR description adequately covers the main change, related issue, testing approach, and required license statement, though some optional checklist items are incomplete.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)

215-215: Reduced encapsulation of mapReadLock.

The visibility of mapReadLock has been changed from private final to package-private, allowing direct access from other classes in the same package. This field controls critical concurrency behavior, and exposing it directly increases the risk of misuse.

Consider:

  1. Keeping the field private and exposing only necessary operations through methods (e.g., tryAcquireLock()).
  2. If package-private access is required for the retry logic, add clear documentation about proper usage patterns and thread-safety requirements.
  3. Restrict access using a package-private accessor method rather than exposing the field directly.
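A minimal sketch of the accessor-based alternative (options 1 and 3 above). All names here are illustrative: a plain `ReentrantLock` stands in for the real read lock, and `tryAcquireLock`/`releaseLock` are hypothetical method names, not the actual CompositeIndexWriter API.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: keep the lock field private and expose only the operations callers need.
class WriterLookupSketch {
    private final ReentrantLock mapReadLock = new ReentrantLock(); // stays private

    // Package-private operation instead of a package-private field.
    boolean tryAcquireLock() {
        return mapReadLock.tryLock();
    }

    void releaseLock() {
        if (mapReadLock.isHeldByCurrentThread()) {
            mapReadLock.unlock();
        }
    }

    boolean isHeldByCurrentThread() {
        return mapReadLock.isHeldByCurrentThread();
    }
}
```

Callers can still attempt and release the lock, but cannot bypass the invariants the owning class enforces around it.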

498-498: Simplify boolean comparisons.

The condition uses explicit == false and == true comparisons which are redundant in Java.

Apply this diff:

-if (success == false && current != null && current.mapReadLock.isHeldByCurrentThread() == true) {
+if (!success && current != null && current.mapReadLock.isHeldByCurrentThread()) {
     current.mapReadLock.close();
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f76826c and 1d42f98.

📒 Files selected for processing (8)
  • CHANGELOG.md (1 hunks)
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1 hunks)
  • server/src/main/java/org/opensearch/index/IndexSettings.java (1 hunks)
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (10 hunks)
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java (2 hunks)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java (0 hunks)
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (4 hunks)
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1 hunks)
💤 Files with no reviewable changes (1)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: gradle-check
🔇 Additional comments (13)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)

87-87: LGTM!

Import correctly added to support the stream operations in getCompositeFieldTypes().


693-697: Verify the behavior change scope and call frequency.

The filtering to return only StarTreeMapper.StarTreeFieldType instances represents a narrowed scope from returning all composite field types. Confirm this change is intentional and whether any callers expect other CompositeMappedFieldType implementations. Additionally, verify the call frequency of this method; if invoked on hot paths, consider caching the filtered result to avoid repeated stream collection operations.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)

106-106: LGTM!

The test constant is appropriately set to a lower value (20) than the production default (100) for faster test execution while still being within the valid range (5-500).

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)

44-46: LGTM!

Mockito imports correctly added to support the new verification test.


71-77: LGTM!

Method call correctly updated to include MAX_NUMBER_OF_RETRIES parameter, aligning with the new bounded retry API.


141-146: LGTM!

Method call correctly updated with retry parameter.


197-202: LGTM!

Method call correctly updated with retry parameter.


208-227: Test validates bounded retry semantics correctly.

The test properly verifies:

  1. LookupMapLockAcquisitionException is thrown after exhausting retries
  2. tryAcquire() is called exactly MAX_NUMBER_OF_RETRIES times

One consideration: the mock setup directly assigns to map.current and map.current.mapReadLock which accesses package-private fields. This works for testing but creates tight coupling to internal implementation details.

server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

724-753: Retry logic moved to lower layer - verify exception handling.

The LookupMapLockAcquisitionException retry logic has been removed from bulk action handling and moved to CompositeIndexWriter with bounded retries. This architectural approach places retry logic closer to where the exception originates.

Ensure that when LookupMapLockAcquisitionException propagates up after max retries are exhausted, it's properly handled and doesn't cause unexpected bulk operation failures.

server/src/main/java/org/opensearch/index/IndexSettings.java (1)

499-506: Significant default value change - verify upgrade impact.

The default retry count increased to 100 with a maximum of 500. Since this is a dynamic setting, existing indices will apply the new default upon upgrade. Consider whether this change should be documented in release notes for operators who have tuned their clusters based on previous defaults.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (3)

691-693: LGTM: Metrics gathering refactoring.

The refactoring from stream-based iteration to explicit for-loops improves code clarity and performance for these simple aggregation operations. The logic is correct in all cases, with proper handling of both current and old maps where necessary, and appropriate locking in ramBytesUsed().

Also applies to: 702-704, 731-742, 758-770, 796-806


210-210: Verify removal of final modifier is intentional.

The final modifier has been removed from CriteriaBasedIndexWriterLookup, CriteriaBasedWriterLock, and LiveIndexWriterDeletesMap. This allows subclassing of these internal implementation classes. Confirm whether:

  1. Subclassing is required for test mocking/stubbing.
  2. If so, consider restricting visibility to test scope or use sealed classes.
  3. If intentional for production extensibility, document extension points and invariants.

Also applies to: 301-301, 406-406


678-679: Verify retry configuration defaults and bounds.

The maxRetryOnLookupMapAcquisitionException setting controls retry behavior for lookup map acquisition. Without access to the codebase, I cannot confirm the specific default values, upper/lower bounds, or whether they align with industry best practices for lock acquisition retry mechanisms. Ensure the following:

  • Default retry count is reasonable (typically 3–10 attempts for in-thread locks per best practices)
  • Upper bounds prevent excessive retries that could cause performance degradation
  • Configuration is documented and tunable per workload expectations
  • Retry strategy includes backoff/jitter to avoid retry storms (not just blind spinning)

CHANGELOG.md Outdated
- Fix node bootstrap error when enable stream transport and remote cluster state ([#19948](https://github.com/opensearch-project/OpenSearch/pull/19948))
- Fix deletion failure/error of unused index template; case when an index template matches a data stream but has a lower priority. ([#20102](https://github.com/opensearch-project/OpenSearch/pull/20102))
- Fix toBuilder method in EngineConfig to include mergedSegmentTransferTracker([20105](https://github.com/opensearch-project/OpenSearch/pull/20105))
- Fix indexing regression and bug fixes for grouping criteria([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))

⚠️ Potential issue | 🟡 Minor

Minor formatting inconsistency.

The PR link format is inconsistent with other entries in the changelog. Missing # before the PR number.

Apply this diff to fix the formatting:

-- Fix indexing regression and bug fixes for grouping criteria([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
+- Fix indexing regression and bug fixes for grouping criteria ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

475-489: Add exponential backoff to retry loop.

The retry loop performs a busy-wait by calling tryAcquire() repeatedly without any delay between attempts. This pattern wastes CPU cycles under contention when the lock is frequently unavailable.

As suggested in the previous review, add exponential backoff between retry attempts with Thread.sleep(Math.min(1L << counter, 100L)) and handle InterruptedException by restoring the interrupt flag and propagating LookupMapLockAcquisitionException.
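A minimal sketch of the suggested pattern, not the actual CompositeIndexWriter code: a plain `ReentrantLock` stands in for the lookup map's read lock, and the exception class is a local stand-in mirroring `LookupMapLockAcquisitionException`.

```java
import java.util.concurrent.locks.ReentrantLock;

class BackoffRetrySketch {
    // Stand-in for the engine exception referenced in the review.
    static class LookupMapLockAcquisitionException extends RuntimeException {
        LookupMapLockAcquisitionException(String message) { super(message); }
    }

    // Bounded retry with exponential backoff, capped at 100 ms per attempt,
    // instead of a busy-wait around tryAcquire().
    static void acquireWithBackoff(ReentrantLock lock, int maxRetries) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (lock.tryLock()) {
                return; // caller is responsible for unlock()
            }
            try {
                Thread.sleep(Math.min(1L << attempt, 100L)); // 1, 2, 4, ... up to 100 ms
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                throw new LookupMapLockAcquisitionException("interrupted while acquiring lookup map lock");
            }
        }
        throw new LookupMapLockAcquisitionException("failed after " + maxRetries + " attempts");
    }
}
```

The backoff bounds CPU spent under contention, and restoring the interrupt flag keeps shutdown and cancellation responsive.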

🧹 Nitpick comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

483-483: Simplify boolean comparisons.

The expressions current.isClosed() == true (line 483) and current.mapReadLock.isHeldByCurrentThread() == true (line 498) contain redundant boolean comparisons.

Apply this diff to simplify:

-                    if (current != null && current.isClosed() == true) {
+                    if (current != null && current.isClosed()) {
-                if (success == false && current != null && current.mapReadLock.isHeldByCurrentThread() == true) {
+                if (!success && current != null && current.mapReadLock.isHeldByCurrentThread()) {

Also applies to: 498-498

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 1d42f98 and 4b247a0.

📒 Files selected for processing (8)
  • CHANGELOG.md (1 hunks)
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1 hunks)
  • server/src/main/java/org/opensearch/index/IndexSettings.java (1 hunks)
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (10 hunks)
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java (2 hunks)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java (0 hunks)
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (4 hunks)
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1 hunks)
💤 Files with no reviewable changes (1)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (5)
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/index/IndexSettings.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
🔇 Additional comments (6)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-731: Behavior change: retries now only on version conflicts in onComplete

The retry path in onComplete now exclusively handles VersionConflictEngineException (via isConflictException) bounded by retryOnConflict, with the previous LookupMapLockAcquisitionException-specific retry branch removed. That seems consistent with moving lock-acquisition retry logic down into CompositeIndexWriter and keeps this layer from depending on engine-internal exception types.

Please double-check that:

  • LookupMapLockAcquisitionException is fully retried/consumed within CompositeIndexWriter (or below), and
  • it does not bubble back up to this point expecting a retry here anymore,

so we don’t inadvertently change failure behavior for callers that previously relied on the bulk action to transparently retry on transient lookup-map lock contention.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (5)

210-210: Visibility changes support the new retry mechanism.

The removal of final modifiers from inner classes and the change of mapReadLock to package-private access are necessary to support the refactored retry logic. The package-private access on mapReadLock (line 215) enables LiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteria to call tryAcquire() directly at line 482.

Also applies to: 215-215, 301-301, 406-406, 408-408


466-471: Proper parameter propagation for configurable retry limit.

The addition of maxRetryOnLookupMapAcquisitionException parameter enables configurable retry behavior, and the value is correctly sourced from engineConfig.getIndexSettings().

Also applies to: 675-680


691-693: Simplified iteration improves readability.

The refactoring from stream-based iteration to direct iteration over criteriaBasedIndexWriterMap.values() is cleaner and avoids unnecessary intermediate operations.

Also applies to: 702-704


731-750: Proper tragic exception checking across all writers.

The iteration through both current and old writer maps to detect tragic exceptions is thorough and correctly checks if writers are closed before accessing their tragic exception state.


753-774: Correct synchronization and state checks.

The ramBytesUsed() method properly acquires write locks before iterating, and the rollback() method correctly checks if writers are open before attempting rollback operations.

Also applies to: 794-811


github-actions bot commented Dec 2, 2025

❌ Gradle check result for 4b247a0: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

44-46: Retry wiring in tests looks correct; consider simplifying the mocking for readability.

  • Updating all computeIndexWriterIfAbsentForCriteria invocations to pass MAX_NUMBER_OF_RETRIES keeps the tests consistent with the new API and the retry semantics; this looks correct.
  • testMaxRetryCountWhenWriteLockDuringIndexing correctly verifies that tryAcquire() is invoked exactly MAX_NUMBER_OF_RETRIES times when the lock is never obtained, and the LookupMapLockAcquisitionException is thrown as expected.

As a minor test ergonomics tweak, you could stub and verify directly on writerLock instead of going through map.current.mapReadLock in the when(...) and verify(...) calls. That would make the test a bit less coupled to the internal layout of LiveIndexWriterDeletesMap and CriteriaBasedIndexWriterLookup while preserving the behavior being asserted.

Also applies to: 72-77, 141-146, 197-202, 208-227

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

210-215: Visibility and mutability changes for nested types are acceptable but could use an explicit “for testing” annotation.

Making CriteriaBasedIndexWriterLookup and CriteriaBasedWriterLock more visible, and relaxing mapReadLock and LiveIndexWriterDeletesMap.current from final, is understandable to support the new tests that need to mock and override these internals.

To keep the public surface area tidy and signal intent, consider adding an explicit @opensearch.internal (or similar) Javadoc tag or comment on these nested types/fields indicating that they are exposed primarily for testing. That helps discourage external production code from depending on them and makes future refactors easier.

Also applies to: 301-301, 406-412

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 4b247a0 and dbcae67.

📒 Files selected for processing (8)
  • CHANGELOG.md (1 hunks)
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1 hunks)
  • server/src/main/java/org/opensearch/index/IndexSettings.java (1 hunks)
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (10 hunks)
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java (2 hunks)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java (0 hunks)
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (4 hunks)
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1 hunks)
💤 Files with no reviewable changes (1)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (4)
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/IndexSettings.java
  • CHANGELOG.md
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: detect-breaking-change
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)

87-87: LGTM!

The import is necessary for the Collectors.toSet() operation used in the getCompositeFieldTypes() method below.


694-696: Verify initialization safety, caller expectations, and method naming alignment.

The method now filters to return only StarTreeMapper.StarTreeFieldType instances from compositeMappedFieldTypes:

  1. Potential NPE risk: Verify that getCompositeFieldTypes() is never called before internalMerge() initializes compositeMappedFieldTypes at line 552. If called during early initialization phases, .stream() could fail on a null reference.

  2. Semantic narrowing: Confirm whether the method name getCompositeFieldTypes() still accurately reflects its behavior. If other composite field type implementations exist or may be added, consider renaming to getStarTreeFieldTypes() or updating documentation to clarify the filtering behavior.

  3. Performance: If getCompositeFieldTypes() is called frequently in hot paths, consider caching the filtered result to avoid recreating the set on each invocation.
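A minimal sketch of the caching suggestion in point 3, under the assumption that the filtered set only changes when the mapping is merged. `CompositeFieldType` and `StarTreeFieldType` are local stand-ins for the OpenSearch types, and `onMappingUpdate` is a hypothetical hook mirroring where compositeMappedFieldTypes is assigned.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class MapperServiceSketch {
    interface CompositeFieldType {}
    static class StarTreeFieldType implements CompositeFieldType {}

    // Recomputed only on mapping updates; hot-path reads return the cached set.
    private volatile Set<StarTreeFieldType> cachedStarTreeFieldTypes = Set.of();

    // Call wherever the composite field types are (re)assigned, e.g. on merge.
    void onMappingUpdate(List<CompositeFieldType> allTypes) {
        cachedStarTreeFieldTypes = allTypes.stream()
            .filter(t -> t instanceof StarTreeFieldType)
            .map(t -> (StarTreeFieldType) t)
            .collect(Collectors.toUnmodifiableSet());
    }

    Set<StarTreeFieldType> getCompositeFieldTypes() {
        return cachedStarTreeFieldTypes; // no per-call stream/collect
    }
}
```

Starting from an empty set also sidesteps the NPE risk in point 1 if the getter is called before the first merge.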

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

691-705: Iteration refactors over current/old writer maps look correct and improve clarity.

  • getFlushingBytes() and getPendingNumDocs() now iterate directly over liveIndexWriterDeletesMap.current.criteriaBasedIndexWriterMap.values(), summing per-child metrics before adding the accumulating writer’s values. This preserves behavior and is straightforward.
  • getTragicException() now checks both current and old child writers for a tragic exception before falling back to the accumulating writer, which ensures group-specific failures are surfaced.
  • ramBytesUsed() and rollback() explicitly iterate over both current and old writers, and the use of mapWriteLock.acquire() around the ramBytesUsed() traversals is appropriate for a consistent snapshot.

Overall, these loops are clear and consistent with the data structures being used; no issues from a correctness or concurrency standpoint.

Also applies to: 731-742, 757-772, 796-805


github-actions bot commented Dec 3, 2025

❌ Gradle check result for dbcae67: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 494-501: The unbounded spin in CompositeIndexWriter that
repeatedly calls current.mapReadLock.tryAcquire() can starve threads; change the
loop to a bounded retry with exponential backoff and/or a timeout and respond to
thread interrupt/shutdown: e.g., attempt tryAcquire() in a loop with a small
Thread.sleep/backoff between attempts, track elapsed time and break with a clear
exception or return if a configured timeout is exceeded, and check
Thread.currentThread().isInterrupted() (and any local shutdown flag) to stop
retrying promptly; update associated callers to handle the new timeout/exception
behavior accordingly.
- Around line 1081-1083: The deleteInLucene method currently always increments
childWriterPendingNumDocs (childWriterPendingNumDocs.incrementAndGet()) even
when currentWriter is the parent accumulatingIndexWriter, causing
double-counting with accumulatingIndexWriter.getPendingNumDocs(); modify
deleteInLucene so the increment is only performed when currentWriter represents
a child writer (check currentWriter identity/type against
accumulatingIndexWriter or a child-writer flag) or remove the increment here and
move it to the caller that only updates childWriterPendingNumDocs for child
writers; update any related logic that relies on childWriterPendingNumDocs to
ensure counts remain consistent with getPendingNumDocs().
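A minimal sketch of the fix direction for the double-counting issue: guard the increment so only child writers contribute to the counter. The `Writer` class and field names are simplified stand-ins for the CompositeIndexWriter internals, not the real implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

class PendingDocsSketch {
    static class Writer {
        long pendingNumDocs;
        long getPendingNumDocs() { return pendingNumDocs; }
    }

    final Writer accumulatingIndexWriter = new Writer();
    final AtomicLong childWriterPendingNumDocs = new AtomicLong();

    // Only child writers contribute to childWriterPendingNumDocs; the parent's
    // pending docs are already reported via its own getPendingNumDocs(), so an
    // unconditional increment would count parent-side deletes twice.
    void deleteInLucene(Writer currentWriter) {
        currentWriter.pendingNumDocs++;
        if (currentWriter != accumulatingIndexWriter) { // identity check: child only
            childWriterPendingNumDocs.incrementAndGet();
        }
    }

    long getPendingNumDocs() {
        return accumulatingIndexWriter.getPendingNumDocs() + childWriterPendingNumDocs.get();
    }
}
```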
🧹 Nitpick comments (5)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

153-168: Improve resource management and exception handling.

The test correctly validates the no-op behavior, but has a couple of issues:

  1. Resource leak: The XContentBuilder created at line 164 is never closed, and the object is started but not closed.
  2. Overly broad exception handling: Catching Exception (line 165) instead of specific exceptions reduces test clarity.
♻️ Suggested refactor
 public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
     ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
     ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
         "context_aware_grouping",
         fieldType,
         new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
     );
     LeafReader leafReader = mock(LeafReader.class);

-    try {
+    try (var builder = XContentFactory.jsonBuilder().startObject()) {
         mapper.canDeriveSource();
-        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-    } catch (Exception e) {
-        fail(e.getMessage());
+        mapper.deriveSource(builder, leafReader, 0);
+        builder.endObject();
     }
 }

Note: With this change, the method will naturally propagate IOException (already declared in the method signature), making the test clearer.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

606-609: Test-only method exposing internal lock.

While exposing acquireNewReadLock() for testing purposes works, consider whether the new test scenario could be validated through public APIs instead. If not, ensure this method is clearly marked as test-only and not used in production code.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)

154-204: Unused variable in concurrent test.

The AtomicBoolean run on line 171 is declared but never read or used in the test. Consider removing it to improve code clarity.

🧹 Cleanup
 CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
 CountDownLatch latch = new CountDownLatch(1);
-AtomicBoolean run = new AtomicBoolean(true);
 Thread refresher = new Thread(() -> {
     latch.countDown();
     try {
@@ -191,7 +190,6 @@
 } finally {
     IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-    run.set(false);
     refresher.join();
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)

134-136: Unused size parameter in wrapper method.

The size parameter is intentionally unused as LuceneIndexWriter delegates directly to Lucene's IndexWriter, which tracks pending docs internally. Consider adding a comment or annotation to clarify this is expected for the wrapper pattern.

📝 Documentation improvement
+    /**
+     * Wrapper function for IndexWriter.addDocuments.
+     * 
+     * @param size The number of documents being added (unused in this wrapper, retained for API consistency)
+     */
     @Override
     public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException {
         return indexWriter.addDocuments(docs);
     }

144-154: Consistent unused size parameter pattern.

Like addDocuments, the size parameter in softUpdateDocuments is unused in this wrapper. The same documentation improvement suggested for addDocuments applies here for consistency.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3679ad and 3d9eacc.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
✅ Files skipped from review due to trivial changes (1)
  • CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (9)
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, macos-15)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: Mend Security Check
🔇 Additional comments (12)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

11-11: LGTM!

The new imports are necessary for the test method and are used appropriately.

Also applies to: 13-13, 20-20

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (9)

131-144: Well-documented field for pending document accounting.

The detailed comment clearly explains the purpose and behavior of childWriterPendingNumDocs. The note about temporary overshoot during refresh is helpful.


352-356: Good defensive check for closed lookup.

This prevents operations on closed maps by checking isClosed() after acquiring the lock and returning null if closed. This is a solid safety mechanism.


549-560: Correct pending document accounting during refresh.

The logic properly accumulates pending docs from old child writers and subtracts them from childWriterPendingNumDocs after merging. This prevents double-counting and aligns with the strategy described in the field documentation.
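The accounting strategy described here can be mirrored in a minimal, hypothetical sketch (class and method names are invented for illustration; the real counter is `childWriterPendingNumDocs` in `CompositeIndexWriter`):

```java
import java.util.concurrent.atomic.AtomicLong;

public class PendingDocsAccounting {
    // Writes increment the counter; a refresh subtracts exactly the pending
    // docs it merged away, so concurrent in-flight writes can only cause a
    // temporary overshoot, never an undershoot.
    private final AtomicLong childPending = new AtomicLong();

    void onDocsAdded(int size) {
        childPending.addAndGet(size);
    }

    void onRefreshMerged(long mergedPendingDocs) {
        childPending.addAndGet(-mergedPendingDocs);
    }

    long pending() {
        return childPending.get();
    }

    public static void main(String[] args) {
        PendingDocsAccounting acct = new PendingDocsAccounting();
        acct.onDocsAdded(3);
        acct.onDocsAdded(2);
        acct.onRefreshMerged(3); // first batch merged into the parent writer
        System.out.println(acct.pending()); // prints 2
    }
}
```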


719-742: Robust exception handling for closed writers.

The pattern of catching AlreadyClosedException and checking for tragic exceptions is excellent. It distinguishes between normal closure (ignore) and catastrophic failure (rethrow), ensuring critical errors aren't silently swallowed during metric collection.
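The swallow-or-rethrow pattern praised here can be illustrated with a self-contained sketch. The `AlreadyClosedException` and `MetricSource` types below are local stand-ins defined for illustration, not the actual Lucene or OpenSearch types.

```java
public class TragicExceptionPattern {
    // Local stand-in for Lucene's AlreadyClosedException so the sketch compiles alone.
    static class AlreadyClosedException extends RuntimeException {
        AlreadyClosedException(String msg) { super(msg); }
    }

    interface MetricSource {
        long ramBytesUsed();
        Throwable getTragicException(); // null => writer was closed normally
    }

    // Swallow AlreadyClosedException only when the close was benign; rethrow on tragedy
    // so catastrophic failures are never hidden by metric collection.
    static long safeRamBytes(MetricSource writer) {
        try {
            return writer.ramBytesUsed();
        } catch (AlreadyClosedException e) {
            if (writer.getTragicException() != null) {
                throw e;
            }
            return 0; // normal close: contribute nothing to the metric
        }
    }

    public static void main(String[] args) {
        MetricSource closedNormally = new MetricSource() {
            public long ramBytesUsed() { throw new AlreadyClosedException("closed"); }
            public Throwable getTragicException() { return null; }
        };
        System.out.println(safeRamBytes(closedNormally)); // prints 0
    }
}
```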


810-833: Consistent exception handling pattern.

The ramBytesUsedUtil method follows the same defensive pattern as getFlushingBytesUtil, providing consistent and robust behavior across metric collection methods.


853-880: Comprehensive rollback ensuring no resource leaks.

The rollback implementation properly cleans up both current and old child writers, catching and ignoring AlreadyClosedException as expected. The comment correctly notes this prevents file leaks despite appearing redundant.


927-942: Size-aware document accounting in addDocuments.

The implementation correctly increments childWriterPendingNumDocs by size after successful indexing, providing accurate accounting for multi-document operations.


960-986: Consistent size-aware accounting for updates.

The softUpdateDocuments method correctly increments childWriterPendingNumDocs by size and maintains delete entry tracking, consistent with the add operations pattern.


1030-1067: Correct delete accounting across writer generations.

The implementation properly handles deletes in both current and old child writers, incrementing childWriterPendingNumDocs only when delete entries are actually added. The conditional logic prevents overcounting stale operations.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

34-34: Consistent test updates for new addDocuments signature.

All test methods correctly pass operation.docs().size() as the third parameter to addDocuments, matching the new size-aware API.

Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263


122-130: Correct parameter ordering for softUpdateDocuments.

The test correctly passes operation.docs().size() as the 6th parameter (after primaryTerm), matching the updated signature.

@github-actions
Contributor

github-actions bot commented Jan 8, 2026

❌ Gradle check result for 3d9eacc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: Remove obsolete lock acquisition retry setting and related dead code from IndexSettings.

The removal of LookupMapLockAcquisitionException retry handling from the transport layer is correct—lock acquisition is now handled resiliently in CompositeIndexWriter with proper locking primitives. However, the refactoring is incomplete:

  • INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION setting is still defined but unused
  • The field maxRetryOnLookupMapAcquisitionException is never read
  • The getter is defined but never called
  • The setter listener is registered but the value is never used
  • Javadoc at lines 515-520 in IndexSettings.java still references the removed exception

Remove the setting definition, field, getter, setter, listener registration, and update the javadoc.

🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 1081-1083: The increment of childWriterPendingNumDocs in
deleteInLucene is incorrect because deleteInLucene operates on currentWriter
(passed as accumulatingIndexWriter from deleteDocument) and Lucene's
accumulatingIndexWriter.getPendingNumDocs() is already included in
getPendingNumDocs() (see getPendingNumDocs usage around line 747), so remove the
childWriterPendingNumDocs.incrementAndGet() call from deleteInLucene to avoid
double-counting pending docs; ensure any remaining bookkeeping relies solely on
accumulatingIndexWriter.getPendingNumDocs() and tests for
deleteDocument/deleteInLucene reflect no net double increment.

In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 194-199: The Javadoc for the deriveSource override uses the term
"Context Aware Segment" which is inconsistent with this mapper's field name;
update the comment above the deriveSource( XContentBuilder builder, LeafReader
leafReader, int docId ) method in ContextAwareGroupingFieldMapper to refer to
"Context Aware Grouping" (or `context_aware_grouping`) instead, so the
terminology matches the mapper name and the field being omitted from generation.

In
@server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java:
- Around line 153-168: The test leaves the XContentBuilder open and wraps calls
in an unnecessary try-catch; replace the current manual
XContentFactory.jsonBuilder() usage with a try-with-resources block that
constructs and closes the XContentBuilder, build the JSON object inside it
(startObject()/endObject()) and pass the builder's content to deriveSource, and
remove the surrounding try-catch since the test method already declares throws
IOException so any exception will fail the test; update references to
ContextAwareGroupingFieldMapper.canDeriveSource() and deriveSource(...)
accordingly.
🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)

154-204: New concurrency test covers important edge case.

The test exercises the scenario where a delete occurs while refresh is rotating maps, ensuring proper handling of documents in the old child writer.

However, the run variable (line 171) is set but never read; it appears to be leftover from a previous iteration.

Remove unused variable
         CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
         CountDownLatch latch = new CountDownLatch(1);
-        AtomicBoolean run = new AtomicBoolean(true);
         Thread refresher = new Thread(() -> {
             latch.countDown();
             try {
                 compositeIndexWriter.beforeRefresh();
             } catch (Exception ignored) {}
         });
 
         refresher.start();
         try {
             latch.await();
             compositeIndexWriter.deleteDocument(
                 operation.uid(),
                 false,
                 newDeleteTombstoneDoc(id),
                 1,
                 2,
                 primaryTerm.get(),
                 softDeletesField
             );
         } finally {
             IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-            run.set(false);
             refresher.join();
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3d9eacc and b2c220a.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (6)
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • CHANGELOG.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.

Applied to files:

  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: detect-breaking-change
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: Mend Security Check
🔇 Additional comments (18)
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (2)

11-12: LGTM!

The imports are necessary for the new deriveSource method signature.


187-192: Clarify Javadoc terminology to match the mapper's purpose.

The Javadoc refers to "Context Aware Segment field" but the mapper is for context-aware grouping. Update the comment to use consistent terminology that aligns with the class name.

The @Override annotation is already present and correctly applied.

📝 Suggested Javadoc fix
 /**
- * Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
+ * Context-aware grouping field is not a part of an ingested document, so omitting it from
  * validation.
  */

Likely an incorrect or invalid review comment.

server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

11-11: LGTM!

The imports are necessary for the new test method testContextAwareFieldMapperWithDerivedSource.

Also applies to: 13-13, 20-20

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (6)

131-144: Good addition of pending docs tracking with clear documentation.

The documentation clearly explains the purpose and the potential temporary overshoot during refresh. This is a reasonable trade-off since undershooting could be problematic for memory pressure calculations.


352-357: Potential race condition: check-then-act on isClosed().

After acquiring the lock and checking lookup.isClosed(), another thread could theoretically be in the process of closing. However, since the write lock is required for closing (via mapWriteLock), and we hold the read lock here, this should be safe. The pattern correctly closes the lock before returning null.


719-742: Consistent exception handling pattern for tragic exceptions.

The logic correctly swallows AlreadyClosedException when there's no tragic exception (normal close), but re-throws it when a tragic exception exists. This is the correct behavior for distinguishing between normal lifecycle and error conditions.


810-833: Same exception handling pattern as getFlushingBytesUtil - consistent and correct.


927-941: Size-aware document counting looks correct.

The increment by size after adding documents aligns with the number of documents actually added to the child writer.


549-559: The atomic subtraction is intentional; no floor validation needed.

The code comment at lines 141-143 explicitly documents that overshooting childWriterPendingNumDocs is acceptable because undershooting "can be problematic." The developers have made a deliberate design choice to allow temporary accounting imprecision rather than risk undershooting. The absence of assertions preventing negative values confirms this is intentional. Adding a floor of 0 would contradict the documented design.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

243-257: Good addition of overloaded config method for test flexibility.

Allows tests to provide their own Store instance while maintaining the same default configuration.


509-564: Well-designed test utility for controlled flush behavior.

The FlushingIndexWriterFactory enables deterministic flush behavior in tests. A few observations:

  1. The factory correctly tracks directories for cleanup via close().
  2. The useFailingDirectorySupplier flag allows toggling between normal and failing directories.

One minor note: the directories list is not thread-safe (ArrayList), but since test usage is typically single-threaded during setup, this should be acceptable.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)

34-34: Test calls correctly updated to use new size-aware API.

The addDocuments(..., operation.docs().size()) pattern correctly passes document count.

Also applies to: 76-76, 114-114

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)

183-183: Test calls correctly updated with size parameter.

Consistent with the API changes across the codebase.

Also applies to: 231-231


498-532: Good test coverage for tragic exception on current writer.

The test correctly verifies that AlreadyClosedException is thrown when accessing ramBytesUsed() after a tragic exception occurs on a current writer.


534-583: Test for tragic exception on old writer has proper synchronization.

The test uses CountDownLatch and ReleasableLock to properly coordinate between the main thread and the refresh thread, ensuring the tragic exception occurs in the old writer context.


672-757: Comprehensive tragic exception retrieval tests.

Both tests (testTragicExceptionGetWithTragicExceptionOnCurrent and testTragicExceptionGetWithTragicExceptionOnOld) verify that getTragicException() correctly returns a non-null value after a tragic exception occurs, covering both current and old writer scenarios.


840-860: softUpdateDocuments calls correctly include size parameter.

The pattern operation.docs().size() is consistently used to pass document count.

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)

198-206: Codec placement and documentation are correct, with proper backward compatibility handling.

The inline comment accurately explains why the codec is constructed here rather than in CodecService—the associatedCriteria is per-document and only available during IndexWriter initialization in InternalEngine. The context-aware check correctly gates the use of CriteriaBasedCodec, falling back to the base codec when disabled, ensuring backward compatibility for non-context-aware indexes. The CriteriaBasedCodec properly handles null criteria through null checks in its delegate writers.

Comment on lines +1081 to +1083

childWriterPendingNumDocs.incrementAndGet();
}
Contributor


⚠️ Potential issue | 🟠 Major

Incorrect placement: incrementing pending docs for accumulating writer operations.

deleteInLucene operates on currentWriter which is passed as accumulatingIndexWriter from deleteDocument. The accumulatingIndexWriter already has its own getPendingNumDocs() tracking in Lucene's IndexWriter. Incrementing childWriterPendingNumDocs here causes double-counting since getPendingNumDocs() (line 747) already includes accumulatingIndexWriter.getPendingNumDocs().

Suggested fix: remove the increment in deleteInLucene
     private void deleteInLucene(
         Term uid,
         boolean isStaleOperation,
         IndexWriter currentWriter,
         Iterable<? extends IndexableField> doc,
         Field... softDeletesField
     ) throws IOException {
         if (isStaleOperation) {
             currentWriter.addDocument(doc);
         } else {
             currentWriter.softUpdateDocument(uid, doc, softDeletesField);
         }
-
-        childWriterPendingNumDocs.incrementAndGet();
     }
🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 1081 - 1083, The increment of childWriterPendingNumDocs in
deleteInLucene is incorrect because deleteInLucene operates on currentWriter
(passed as accumulatingIndexWriter from deleteDocument) and Lucene's
accumulatingIndexWriter.getPendingNumDocs() is already included in
getPendingNumDocs() (see getPendingNumDocs usage around line 747), so remove the
childWriterPendingNumDocs.incrementAndGet() call from deleteInLucene to avoid
double-counting pending docs; ensure any remaining bookkeeping relies solely on
accumulatingIndexWriter.getPendingNumDocs() and tests for
deleteDocument/deleteInLucene reflect no net double increment.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-730: Remove orphaned setting and variable—cleanup is incomplete.

The removal of LookupMapLockAcquisitionException retry handling from TransportShardBulkAction is incomplete. The following orphaned code still exists in IndexSettings.java and should be removed:

  • INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION setting definition (line 519) and its stale JavaDoc comment (line 515)
  • maxRetryOnLookupMapAcquisitionException field (line 933) and its initialization (line 1149)
  • setMaxRetryOnLookupMapAcquisitionException() and getMaxRetryOnLookupMapAcquisitionException() methods (lines 2114–2120)—the getter is never called
  • Registration of the setting in IndexScopedSettings.java (line 181)

These should be removed as part of the refactoring to avoid leaving dead code.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)

545-560: Move getPendingNumDocs() call before close() to prevent AlreadyClosedException

In refreshDocumentsForParentDirectory, the current code calls getPendingNumDocs() after closing the IndexWriter:

childDisposableWriter.getIndexWriter().close();
pendingNumDocsByOldChildWriter.addAndGet(childDisposableWriter.getIndexWriter().getPendingNumDocs());

This will throw AlreadyClosedException at runtime. Lucene's IndexWriter.getPendingNumDocs() calls ensureOpen(true), which fails once the writer is closed.

Reorder to capture the pending doc count before closing:

Suggested fix
for (CompositeIndexWriter.DisposableIndexWriter childDisposableWriter : markForRefreshIndexWritersMap.values()) {
    final IndexWriter childWriter = childDisposableWriter.getIndexWriter();
    directoryToCombine.add(childWriter.getDirectory());
+   pendingNumDocsByOldChildWriter.addAndGet(childWriter.getPendingNumDocs());
    childWriter.close();
-   pendingNumDocsByOldChildWriter.addAndGet(childWriter.getPendingNumDocs());
}

Optionally, add an assertion to catch underflow:

-childWriterPendingNumDocs.addAndGet(-pendingNumDocsByOldChildWriter.get());
+final long newValue = childWriterPendingNumDocs.addAndGet(-pendingNumDocsByOldChildWriter.get());
+assert newValue >= 0 : "childWriterPendingNumDocs underflow: " + newValue;

1030-1066: Double-counting parent deletes in deleteInLucene inflates child pending-docs and can cause early maxDocs failures

In deleteDocument you optionally add partial deletes to current/old child writers and increment childWriterPendingNumDocs (lines 1051, 1061). Then you always delegate to deleteInLucene(uid, isStaleOperation, accumulatingIndexWriter, doc, softDeletesField) with the parent writer.

Inside deleteInLucene, you unconditionally increment childWriterPendingNumDocs:

if (isStaleOperation) {
    currentWriter.addDocument(doc);
} else {
    currentWriter.softUpdateDocument(uid, doc, softDeletesField);
}
childWriterPendingNumDocs.incrementAndGet();

But currentWriter here is always the parent accumulatingIndexWriter when called from deleteDocument. This means:

  • Each parent delete increments Lucene's own pending docs via the parent writer and
  • Increments childWriterPendingNumDocs, which is intended to track only child-writer contributions (as confirmed by the comment "only increment this when addDeleteEntry for child writers are called")

As a result, CompositeIndexWriter.getPendingNumDocs() (which returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs()) systematically overcounts deletes. This inflated count is used by InternalEngine.tryAcquireInFlightDocs to enforce the maxDocs guard, causing the shard to reject operations prematurely even though Lucene has not reached IndexWriter.MAX_DOCS.

Fix by incrementing childWriterPendingNumDocs only for child writers, not the parent:

Proposed fix
     private void deleteInLucene(
         Term uid,
         boolean isStaleOperation,
         IndexWriter currentWriter,
         Iterable<? extends IndexableField> doc,
         Field... softDeletesField
     ) throws IOException {
         if (isStaleOperation) {
             currentWriter.addDocument(doc);
         } else {
             currentWriter.softUpdateDocument(uid, doc, softDeletesField);
         }
-        childWriterPendingNumDocs.incrementAndGet();
+        // Only child writers are tracked via childWriterPendingNumDocs; the parent writer
+        // is already accounted for by IndexWriter#getPendingNumDocs().
+        if (currentWriter != accumulatingIndexWriter) {
+            childWriterPendingNumDocs.incrementAndGet();
+        }
     }
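The accounting invariant behind this fix can be checked in isolation: the aggregate must equal the child contributions plus the parent writer's own count, so bumping the child counter for a parent-writer delete double-counts. A self-contained sketch with hypothetical names (PendingDocsModel is illustrative, not OpenSearch code):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the composite writer's pending-docs accounting.
public class PendingDocsModel {
    private final AtomicLong childPending = new AtomicLong();
    private final AtomicLong parentPending = new AtomicLong(); // stands in for the parent writer's own count
    private final Object parentWriter = new Object();

    public Object parent() { return parentWriter; }

    // Mirrors the fixed deleteInLucene: the parent writer already tracks
    // its own pending docs, so only child-writer operations bump childPending.
    public void deleteVia(Object writer) {
        if (writer == parentWriter) {
            parentPending.incrementAndGet(); // modeled: Lucene counts this itself
        } else {
            childPending.incrementAndGet();
        }
    }

    // Equivalent of getPendingNumDocs(): child contributions + parent count.
    public long pendingNumDocs() {
        return childPending.get() + parentPending.get();
    }
}
```

With the identity check in place, two parent deletes and one child delete yield a pending count of three; without it, the parent deletes would be counted twice.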
🤖 Fix all issues with AI agents
In @CHANGELOG.md:
- Line 33: Update the PR link text to include the missing “#” for consistency:
change the occurrence
"([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" in the
CHANGELOG entry to
"([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" so it
matches other entries like "[#20055]" and "[#20284]".
🧹 Nitpick comments (6)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)

694-696: Filtering logic is correct and caller-compatible with the narrowed return type.

All callers in the codebase expect CompositeDataCubeFieldType instances and safely handle the filtered result. The method correctly excludes ContextAwareGroupingFieldType, which is not used by any caller of getCompositeFieldTypes() or isCompositeIndexPresent().

Optional: Consider caching the filtered result for performance.

The filtering operation executes on every call. If this method is invoked frequently, consider caching the filtered set alongside compositeMappedFieldTypes to avoid repeated stream operations.

♻️ Potential optimization to cache filtered results

Add a cached field for the filtered set:

 private volatile Set<CompositeMappedFieldType> compositeMappedFieldTypes;
+private volatile Set<CompositeMappedFieldType> compositeDataCubeFieldTypes;
 private volatile Set<String> fieldsPartOfCompositeMappings;

Update the initialization in internalMerge (around line 552):

 // initialize composite fields post merge
 this.compositeMappedFieldTypes = getCompositeFieldTypesFromMapper();
+this.compositeDataCubeFieldTypes = compositeMappedFieldTypes.stream()
+    .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
+    .collect(Collectors.toSet());
 buildCompositeFieldLookup();

Simplify the method:

 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
-    return compositeMappedFieldTypes.stream()
-        .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
-        .collect(Collectors.toSet());
+    return compositeDataCubeFieldTypes;
 }
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

153-168: Consider simplifying the exception handling.

The test correctly verifies that the derived-source methods can be invoked without throwing. However, the explicit try-catch with fail(e.getMessage()) is unnecessary—simply letting any exception propagate will fail the test automatically with better diagnostics.

♻️ Simplify exception handling
-    public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
+    public void testContextAwareFieldMapperWithDerivedSource() throws Exception {
         ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
         ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
             "context_aware_grouping",
             fieldType,
             new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
         );
         LeafReader leafReader = mock(LeafReader.class);
-
-        try {
-            mapper.canDeriveSource();
-            mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-        } catch (Exception e) {
-            fail(e.getMessage());
-        }
+        
+        mapper.canDeriveSource();
+        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
     }
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)

55-67: Clarify and enforce size contract on multi-doc APIs

size is critical for pending-doc accounting; if callers pass an incorrect or non-positive value, getPendingNumDocs() and tryAcquireInFlightDocs() will misbehave. Consider documenting that size must equal the number of documents in docs and be > 0, and add assertions in implementations (e.g., CompositeIndexWriter, LuceneIndexWriter) to enforce this.
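One way to enforce that contract is a guard at the top of each implementation; a hedged sketch (checkSize is illustrative and not part of the OpenSearch API):

```java
public class SizeContract {
    // Illustrative guard: size must be positive and equal the number of docs.
    public static int checkSize(Iterable<?> docs, int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be > 0, got " + size);
        }
        int actual = 0;
        for (Object ignored : docs) {
            actual++;
        }
        if (actual != size) {
            throw new IllegalArgumentException("size " + size + " != document count " + actual);
        }
        return size;
    }
}
```

In production code an `assert` on the same condition may be preferable, since iterating `docs` twice has a cost; the explicit exception form above makes the contract testable.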

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)

154-204: Concurrency test for deletes on old child writer looks sound (minor nit)

The new testDeleteWithDocumentInOldChildWriter exercises delete behavior while a read lock is held on the current map and beforeRefresh() runs in another thread, which is valuable coverage for the new locking/rotation logic.

Small nit: AtomicBoolean run is never read in the refresher thread; you can drop it to simplify the test.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)

853-880: Child-writer rollback added, but consider resetting child pending-doc counter

The enhanced rollback() now:

  • Rolls back all open child writers in the current and old maps (ignoring AlreadyClosedException), then
  • Rolls back the parent writer and marks the composite as closed.

This ensures child directories are properly closed and reduces leak risk. One minor missing piece is that childWriterPendingNumDocs is not reset here; since ensureOpen() guards most public methods, this is likely harmless, but for clarity and future safety you could reset it (e.g., childWriterPendingNumDocs.set(0)) when the composite transitions to closed.


606-610: acquireNewReadLock correctly exposed for tests but should stay non-public

acquireNewReadLock() returns a CriteriaBasedIndexWriterLookup with the read lock already held, for use in tests like testDeleteWithDocumentInOldChildWriter. The semantics are clear in tests, but this method should remain package-private (as it is now) to avoid accidental misuse in production code—holding a read lock without a matching close() can block refresh rotations.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b2c220a and 887bdb6.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
🚧 Files skipped from review as they are similar to previous changes (1)
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.

Applied to files:

  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: Analyze (java)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Mend Security Check
🔇 Additional comments (19)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)

87-87: LGTM!

The import is necessary for the Collectors.toSet() call in the modified getCompositeFieldTypes() method.

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)

198-206: LGTM! Clear codec selection logic with helpful documentation.

The inline comment effectively explains the architectural constraint that necessitates codec creation at this point. The conditional logic correctly selects between context-aware and standard codec configurations based on the index settings.

server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)

187-199: LGTM! Well-documented no-op overrides.

The Javadoc clearly explains the rationale for the no-op implementations: the context-aware grouping field is metadata that doesn't participate in document ingestion or derived-source workflows.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

243-257: LGTM! Convenient config overload.

This method simplifies test setup by providing sensible defaults for the full config(...) method, reducing boilerplate in test cases.


509-564: LGTM! Well-designed test utility for flush and tragic-exception scenarios.

The FlushingIndexWriterFactory properly extends NativeLuceneIndexWriterFactory and implements Closeable, ensuring all tracked directories are cleaned up via IOUtils.close(). The wrapped IndexWriter correctly flushes after each write operation to simulate specific test conditions.

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)

134-136: Size parameter is intentionally unused in LuceneIndexWriter—this is by design.

Both addDocuments (line 134) and softUpdateDocuments (line 150) in LuceneIndexWriter declare the int size parameter but don't use it. This is intentional: LuceneIndexWriter is a simple wrapper that delegates directly to Lucene's IndexWriter, which doesn't require pending document count tracking. In contrast, CompositeIndexWriter uses the size parameter to update childWriterPendingNumDocs because it manages multiple child IndexWriter instances and needs to track pending documents across them for coordination and synchronization. The parameter is part of the DocumentIndexWriter interface contract, so all implementations must accept it, but usage varies by architectural need.

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)

1240-1247: Multi-document append path correctly migrated to size-aware API

Using indexWriter.addDocuments(docs, uid, docs.size()) only when docs.size() > 1 keeps single-doc path intact and aligns reservedDocs and numDocAppends.inc(docs.size()) with the actual number of docs written.


1249-1258: Stale-doc append path correctly passes doc count

addStaleDocs now calls addDocuments(docs, uid, docs.size()) for multi-doc stale ops, keeping accounting (numDocAppends.inc(docs.size())) consistent with writes.


1369-1390: Update path correctly passes docs.size() into softUpdateDocuments

The multi-doc update path now uses softUpdateDocuments(uid, docs, version, seqNo, primaryTerm, docs.size(), softDeletesField), which is consistent with how reservedDocs and numDocUpdates.inc(docs.size()) are computed, and provides the correct size to DocumentIndexWriter for pending-doc tracking.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)

34-35: Tests correctly updated for size-aware write APIs

All test invocations of addDocuments and softUpdateDocuments now pass operation.docs().size() as the size argument, in line with the new DocumentIndexWriter contract and how production code calls these APIs.

Also applies to: 76-77, 114-130, 219-235, 263-276

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)

176-185: Append tests aligned with size-aware CompositeIndexWriter API

All updated test paths now pass operation.docs().size() into addDocuments (and softUpdateDocuments where applicable), which matches the production usage and ensures coverage of the new size parameter.

Also applies to: 224-244, 333-335, 373-375, 437-448, 479-487, 771-773


498-757: New tragic-exception tests provide good coverage (with acceptable test-only OOME usage)

The new tests around ramBytesUsed, getFlushingBytes, and getTragicException under simulated OutOfMemoryError on current and old child writers exercise the new behavior in CompositeIndexWriter (tragic-exception detection and propagation via AlreadyClosedException). Catching OutOfMemoryError is fine here since it’s fully constrained to a synthetic Directory in test scope.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (7)

486-512: New computeIndexWriterIfAbsentForCriteria loop correctly avoids closed lookups

The updated LiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteria uses mapReadLock.tryAcquire() in a loop and skips lookups that report isClosed(), which prevents routing new writes into maps that have been rotated to old and closed during refresh. The finally block closes the read lock only on failure, allowing callers (via CompositeIndexWriter.computeIndexWriterIfAbsentForCriteria) to manage the balanced unlock via their own try-with-resources block on getMapReadLock(). This looks consistent with the lock protocol in the rest of the class.


744-748: Pending-docs aggregation correctly incorporates child and parent writers (subject to child-only increments)

getPendingNumDocs() now returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs(), which is exactly what InternalEngine.tryAcquireInFlightDocs needs: an upper bound on documents accounted in Lucene plus those not yet visible to IndexWriter in child writers. As long as increments to childWriterPendingNumDocs are limited to child-writer operations (see separate comment on deleteInLucene), this aggregation is sound.


926-957: addDocuments/addDocument: size-based increments look correct

In both addDocuments and addDocument:

  • You determine criteria, pick the appropriate child DisposableIndexWriter, and protect it with the map read lock plus keyed UID lock.
  • After writing (addDocuments / addDocument), you increment childWriterPendingNumDocs by size or 1 and return the sequence number.

This matches the documented semantics for childWriterPendingNumDocs and ensures child contributions to getPendingNumDocs() track the number of in-flight doc writes.


960-985: softUpdateDocuments increments child count correctly and records delete entry

softUpdateDocuments now consumes the new int size parameter and:

  • Writes via IndexWriter.softUpdateDocuments(uid, docs, softDeletesField).
  • Increments childWriterPendingNumDocs by size.
  • Records the delete entry in the lookup for later parent consolidation.

This is consistent with the way addDocuments uses size and with how deletePreviousVersionsForUpdatedDocuments later applies these delete entries to the parent writer.


704-742: getFlushingBytesUtil correctly implements tragic-exception semantics

The getFlushingBytes() implementation delegates to getFlushingBytesUtil(...), which:

  • Sums getFlushingBytes() across current and old child writers.
  • Ignores AlreadyClosedException for non-tragic closures (normal close/rotation).
  • Rethrows on AlreadyClosedException if the child writer has a non-null getTragicException(), surfacing unrecoverable errors.
  • Adds the parent accumulatingIndexWriter.getFlushingBytes() at the end.

This implementation follows Lucene's recommended patterns: AlreadyClosedException indicates the writer is closed, and getTragicException() is the correct way to distinguish between a normal closure and a fatal, unrecoverable error (e.g., disk full during flush). The code appropriately ignores transient closures while propagating tragic failures.
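The skip-benign, rethrow-tragic pattern described above can be sketched independently of Lucene; the Writer interface, ClosedException, and factory methods below are hypothetical stand-ins for the real IndexWriter API:

```java
import java.util.List;

public class FlushingBytesSketch {
    // Hypothetical writer handle: metric() throws ClosedException once closed;
    // tragic() is non-null only when the closure was fatal.
    public interface Writer {
        long metric();
        RuntimeException tragic();
    }

    public static class ClosedException extends RuntimeException {}

    public static Writer open(long value) {
        return new Writer() {
            public long metric() { return value; }
            public RuntimeException tragic() { return null; }
        };
    }

    public static Writer closed(RuntimeException tragic) {
        return new Writer() {
            public long metric() { throw new ClosedException(); }
            public RuntimeException tragic() { return tragic; }
        };
    }

    // Sums a per-writer metric, skipping writers closed benignly (rotation)
    // but rethrowing when the closure was caused by a tragic failure.
    public static long sumIgnoringBenignClosures(List<Writer> writers) {
        long total = 0;
        for (Writer w : writers) {
            try {
                total += w.metric();
            } catch (ClosedException e) {
                if (w.tragic() != null) {
                    throw w.tragic();
                }
            }
        }
        return total;
    }
}
```

A benignly closed writer simply drops out of the sum, while a tragic closure aborts the whole aggregation, which is the behavior getFlushingBytesUtil implements for real child writers.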


770-784: Tragic-exception detection now covers both child and parent writers

getTragicException() scans all current and old child DisposableIndexWriters and returns the first non-null tragic exception before falling back to the parent accumulatingIndexWriter. This ensures InternalEngine.failOnTragicEvent can correctly detect when a group-specific child writer encounters a fatal error.


795-833: ramBytesUsedUtil contains unnecessary exception handling for ramBytesUsed()

The code catches AlreadyClosedException from disposableIndexWriter.getIndexWriter().ramBytesUsed() calls, but Lucene's IndexWriter.ramBytesUsed() is a non-failing diagnostic method documented to never throw AlreadyClosedException. Only methods using ensureOpen() throw this exception. The try-catch block around ramBytesUsed() calls is therefore unnecessary and suggests a misunderstanding of the IndexWriter API contract. Remove these catch blocks or replace them with appropriate handling if other failure modes are genuinely expected.

Likely an incorrect or invalid review comment.

@github-actions

github-actions bot commented Jan 8, 2026

❌ Gradle check result for 887bdb6: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

1069-1083: Fix double-counting bug in pending docs accounting.

The deleteInLucene method increments childWriterPendingNumDocs (Line 1082) after operating on currentWriter, which is the accumulating parent writer (passed as accumulatingIndexWriter at Line 1065). However, childWriterPendingNumDocs is documented (Lines 131-143) as tracking child-level IndexWriter pending docs only.

Since getPendingNumDocs() (Line 747) returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs(), documents added to the accumulating writer are already counted via accumulatingIndexWriter.getPendingNumDocs(). Incrementing childWriterPendingNumDocs for parent writer operations causes double-counting, leading to inflated pending doc counts that may trigger unnecessary flushes.

🐛 Proposed fix

Remove the increment since the accumulating writer's pending docs are already tracked separately:

     private void deleteInLucene(
         Term uid,
         boolean isStaleOperation,
         IndexWriter currentWriter,
         Iterable<? extends IndexableField> doc,
         Field... softDeletesField
     ) throws IOException {
         if (isStaleOperation) {
             currentWriter.addDocument(doc);
         } else {
             currentWriter.softUpdateDocument(uid, doc, softDeletesField);
         }
-
-        childWriterPendingNumDocs.incrementAndGet();
     }
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 494-501: The loop in CompositeIndexWriter that repeatedly calls
this.current.mapReadLock.tryAcquire() can spin indefinitely while current ==
null || current.isClosed(). Add a safeguard inside the while loop: track a
retry count and/or elapsed time on each iteration, and once the limit is
exceeded, stop retrying and throw a clear exception (or return a failure) so
callers can handle shutdown. Preserve the existing semantics when acquisition
succeeds.
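A bounded retry loop of the kind described might look like this sketch, with Semaphore standing in for mapReadLock and all names illustrative rather than taken from the PR:

```java
import java.util.concurrent.Semaphore;
import java.util.function.BooleanSupplier;

public class BoundedTryAcquire {
    // Bounded variant of the tryAcquire() loop: gives up after maxAttempts
    // instead of spinning forever, so callers can handle shutdown cleanly.
    public static boolean acquireWithLimit(Semaphore mapReadLock, BooleanSupplier lookupClosed, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (mapReadLock.tryAcquire()) {
                if (!lookupClosed.getAsBoolean()) {
                    return true; // acquired a live lookup
                }
                mapReadLock.release(); // lookup rotated and closed; retry
            }
            Thread.onSpinWait();
        }
        throw new IllegalStateException("could not acquire a live lookup after " + maxAttempts + " attempts");
    }
}
```

An elapsed-time deadline could replace or supplement the attempt counter; the essential point is that the loop has a terminating failure path instead of spinning indefinitely.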

In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 194-199: Update the Javadoc on the deriveSource method to use the
correct terminology: replace "Context Aware Segment field" with "context aware
grouping field" so it matches the class/field type
(ContextAwareGroupingFieldMapper) and the Javadoc at line 175; leave the no-op
implementation of deriveSource( XContentBuilder builder, LeafReader leafReader,
int docId ) unchanged.
- Around line 187-192: The Javadoc uses inconsistent terminology ("Context Aware
Segment field") while the actual field type is context_aware_grouping; update
the Javadoc on the canDeriveSource() override in ContextAwareGroupingFieldMapper
to say "Context aware grouping field" (or similar consistent phrasing) so it
matches the field type name context_aware_grouping and the class
ContextAwareGroupingFieldMapper.
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 887bdb6 and b08ba39.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (6)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
  • GitHub Check: Mend Security Check
🔇 Additional comments (27)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: Retry semantics are sound; verify broader impact of LookupMapLockAcquisitionException removal

The retry logic at lines 724-730 is correct: resetForExecutionForRetry() properly increments the retry counter (line 236 of BulkPrimaryExecutionContext.java) and resets state to INITIAL (line 233), so there is no risk of infinite retry loops or violating the assert context.isInitial() invariant.

However, the removal of LookupMapLockAcquisitionException handling means transient engine lock acquisition failures will now fail fast rather than triggering a retry. Confirm this aligns with the intended behavior and that no other mechanism was expected to handle such transient issues for the indexing regression fix.

server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)

11-12: LGTM: Required imports for new method signatures.

The imports are necessary for the deriveSource() method parameters.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (2)

55-55: LGTM: Clean API extension for size-aware document additions.

The addition of the size parameter allows callers to communicate the document count, enabling better resource tracking and accounting in implementations.


59-67: LGTM: Consistent size-aware API for soft updates.

The size parameter is correctly positioned before the varargs softDeletesField parameter, following Java conventions. This maintains consistency with the addDocuments signature change.

server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (2)

239-245: LGTM: Correct size parameter for multi-document indexing.

The docs.size() parameter is appropriately passed to addDocuments in the multi-document path, while the single-document path correctly continues to use addDocument without the size parameter.


247-260: LGTM: Correct size parameter for soft updates.

The docs.size() parameter is correctly passed to softUpdateDocuments in the multi-document path, with proper parameter ordering before softDeletesField. The single-document path appropriately uses softUpdateDocument without the size parameter.

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)

1240-1247: LGTM: Correct size-aware document additions and counter accounting.

The implementation correctly:

  • Passes docs.size() to addDocuments for multi-document operations
  • Increments numDocAppends by docs.size() to accurately track all documents added (both single and multi-document cases)

1249-1258: LGTM: Correct size parameter for stale document additions.

The docs.size() parameter is correctly passed to addDocuments for stale operations. Note that stale operations correctly do not increment numDocAppends since they represent out-of-order operations that shouldn't be counted as new appends.


1369-1390: LGTM: Correct size-aware soft updates and counter accounting.

The implementation correctly:

  • Passes docs.size() to softUpdateDocuments with proper parameter ordering
  • Increments numDocUpdates by docs.size() for accurate tracking
  • Maintains the append-only index constraint check
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

21-289: LGTM: Test methods correctly updated for new API signatures.

All test methods have been properly updated to pass the docs.size() parameter to addDocuments and softUpdateDocuments calls. The parameter ordering in softUpdateDocuments (size before softDeletesField) is correct throughout.


154-204: LGTM: Well-designed concurrency test for delete during refresh.

This test exercises a specific race condition scenario where a document is deleted while a refresh is in progress and the delete operation obtains a lock on the old child writer. The test correctly:

  • Uses acquireNewReadLock() to simulate holding a lock on the old writer
  • Synchronizes with CountDownLatch to coordinate the refresh thread
  • Properly joins threads and performs cleanup with IOUtils.closeWhileHandlingException
  • Verifies the final state (document count is 0)

The test provides good coverage for the concurrent delete scenario during writer rotation.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)

48-890: LGTM: Existing test methods correctly updated for new API signatures.

All existing test methods have been systematically updated to pass the docs.size() parameter to addDocuments and softUpdateDocuments calls. The updates are consistent throughout and maintain the correct parameter ordering.


498-757: LGTM: Comprehensive tragic exception test coverage.

The new test methods provide thorough coverage of tragic exception scenarios for both current and old writers:

  • RAM bytes tests: Verify ramBytesUsed() throws AlreadyClosedException after tragic exception
  • Flushing bytes tests: Verify getFlushingBytes() throws AlreadyClosedException after tragic exception
  • Tragic exception getter tests: Verify getTragicException() returns non-null after tragic exception

All tests correctly:

  • Use FilterDirectory with simulated OutOfMemoryError to trigger tragic exceptions
  • Handle both current and old writer scenarios (the latter with proper thread synchronization)
  • Clean up resources with IOUtils.closeWhileHandlingException
  • Use appropriate assertions for the expected behavior

The tests ensure robust handling of catastrophic failures in the indexing path.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (14)

131-144: Well-documented pending docs tracking.

The documentation clearly explains the tracking strategy and acknowledges the temporary overshooting during refresh. The approach of tracking child writer pending docs separately from the accumulating writer is sound.
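The counting strategy can be sketched in isolation. This standalone model is an assumption about the mechanics described in the review (the field name `childWriterPendingNumDocs` mirrors the review text, but the class and method names here are illustrative, not the OpenSearch implementation):

```java
import java.util.concurrent.atomic.AtomicLong;

// Standalone model of the pending-docs accounting described above.
// Child-writer adds increment the counter; a refresh that merges a child
// writer into the parent subtracts that child's contribution afterwards,
// so merged docs are counted exactly once via the parent's own count.
class PendingDocsModel {
    final AtomicLong childWriterPendingNumDocs = new AtomicLong();
    long parentPendingNumDocs = 0;

    void addDocuments(int size) {
        childWriterPendingNumDocs.addAndGet(size);
    }

    // Called during refresh after a child writer's docs merge into the parent.
    void onChildMergedIntoParent(long mergedDocs) {
        parentPendingNumDocs += mergedDocs;
        childWriterPendingNumDocs.addAndGet(-mergedDocs);
    }

    long totalPendingNumDocs() {
        return childWriterPendingNumDocs.get() + parentPendingNumDocs;
    }
}
```

Note that the total stays constant across the merge step; the brief overshoot the review mentions would occur only in the window between the parent absorbing the docs and the subtraction landing.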


352-355: Good defensive check for closed lookups.

This prevents race conditions where a lookup is closed between lock acquisition and usage by releasing the lock and returning null when the lookup is already closed.


606-609: LGTM!

Simple utility method for unit tests to acquire a read lock on the current map.


707-742: Robust exception handling for flushing bytes calculation.

The pattern of catching AlreadyClosedException and only rethrowing when a tragic exception exists is appropriate. This allows gracefully closed writers to be skipped while ensuring serious errors (tragic exceptions) are propagated.
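The "swallow unless tragic" pattern can be shown with a minimal stand-in for the writer API (the `AlreadyClosedException`, `WriterView`, and `FlushingBytesUtil` names here are illustrative assumptions, not the Lucene or OpenSearch types):

```java
// Local stand-in for Lucene's AlreadyClosedException.
class AlreadyClosedException extends RuntimeException {
    AlreadyClosedException(String msg) { super(msg); }
}

// Minimal view of the writer surface this pattern needs.
interface WriterView {
    long getFlushingBytes() throws AlreadyClosedException;
    Throwable getTragicException();
}

class FlushingBytesUtil {
    // Sums flushing bytes across writers, skipping writers that were closed
    // normally but rethrowing when a tragic exception caused the closure.
    static long sum(Iterable<WriterView> writers) {
        long total = 0;
        for (WriterView w : writers) {
            try {
                total += w.getFlushingBytes();
            } catch (AlreadyClosedException ace) {
                if (w.getTragicException() != null) {
                    throw ace; // fatal failure: surface it to the caller
                }
                // normal close during refresh: skip this writer
            }
        }
        return total;
    }
}
```

A gracefully rotated writer thus contributes nothing to the sum, while a writer closed by an `OutOfMemoryError` or similar still fails the whole calculation.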


747-747: LGTM!

Correctly combines pending docs from child writers and the accumulating parent writer.


770-784: LGTM!

Comprehensive tragic exception checking across current, old, and accumulating writers ensures any critical failure is detected and reported.


796-833: LGTM!

Consistent exception handling pattern with getFlushingBytesUtil. The documentation clearly explains the calculation and double-counting avoidance strategy.


853-880: LGTM!

Comprehensive rollback logic ensures all child writers are closed to prevent file leaks, with appropriate handling for already-closed writers.


927-942: LGTM!

The size-aware API correctly tracks pending document counts by incrementing childWriterPendingNumDocs by the actual number of documents added.


945-958: LGTM!

Correctly increments childWriterPendingNumDocs for single document additions.


961-986: LGTM!

Size-aware soft update correctly tracks pending documents for bulk updates.


989-1013: LGTM!

Correctly increments childWriterPendingNumDocs for single document soft updates.


1030-1067: LGTM!

Correctly increments childWriterPendingNumDocs when delete operations are applied to child writers. The comments clearly explain the increments only apply to child writer operations.


545-563: LGTM!

The pending docs accounting during refresh correctly accumulates pending docs from old child writers and subtracts them from childWriterPendingNumDocs after merging into the parent. This prevents double-counting since merged documents become part of accumulatingIndexWriter.getPendingNumDocs().

Comment on lines +194 to +199
/**
* Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
* generation.
*/
@Override
public void deriveSource(XContentBuilder builder, LeafReader leafReader, int docId) throws IOException {}

⚠️ Potential issue | 🟡 Minor

Fix terminology inconsistency; no-op implementation is appropriate.

The no-op implementation is consistent with the field's non-ingestible nature (line 175). However, the Javadoc should use "context aware grouping field" instead of "Context Aware Segment field" to match the actual field type.

📝 Proposed Javadoc correction
 /**
- * Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
+ * Context aware grouping field is not a part of an ingested document, so omitting it from source
  * generation.
  */
 @Override
 public void deriveSource(XContentBuilder builder, LeafReader leafReader, int docId) throws IOException {}
🤖 Prompt for AI Agents
In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
around lines 194 - 199, Update the Javadoc on the deriveSource method to use the
correct terminology: replace "Context Aware Segment field" with "context aware
grouping field" so it matches the class/field type
(ContextAwareGroupingFieldMapper) and the Javadoc at line 175; leave the no-op
implementation of deriveSource( XContentBuilder builder, LeafReader leafReader,
int docId ) unchanged.

@github-actions

github-actions bot commented Jan 9, 2026

❌ Gradle check result for b08ba39: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: Remove orphaned LookupMapLockAcquisitionException setting and documentation.

The removal of the LookupMapLockAcquisitionException and its retry logic is incomplete. The following artifacts remain unused:

  • Setting INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION in IndexSettings.java (line 519)
  • Field maxRetryOnLookupMapAcquisitionException (line 933) and its getter/setter methods (lines 2114-2119)
  • Stale JavaDoc comment referencing the removed exception (line 515)
  • Setting registration in IndexScopedSettings.java (line 181)

The getter is never called, and TransportShardBulkAction.java contains no references to this setting. Remove these orphaned artifacts or document why they should be retained for backward compatibility.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

1069-1083: Potential bug: Incrementing childWriterPendingNumDocs for parent writer operations.

The deleteInLucene() method operates on the accumulatingIndexWriter (parent), not a child writer. However, it increments childWriterPendingNumDocs at line 1082. This appears inconsistent with the field's documented purpose of tracking "pendingNumDocs for child level IndexWriters."

Should this instead rely on the parent writer's own getPendingNumDocs() tracking, or is this intentional to account for tombstone entries?

#!/bin/bash
# Verify all callers of deleteInLucene to understand the intent
ast-grep --pattern 'deleteInLucene($_, $_, $_, $_, $_)'
🧹 Nitpick comments (3)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

494-501: Potential busy-wait loop without backoff.

The while loop uses tryAcquire() which returns null if the lock cannot be acquired or the map is closed. This could spin indefinitely if refresh keeps rotating maps. Consider adding a Thread.yield() or brief pause to reduce CPU consumption during contention.

♻️ Optional improvement to reduce CPU spinning
                 while (current == null || current.isClosed()) {
                     // This function acquires a first read lock on a map which does not have any write lock present...
                     current = this.current.mapReadLock.tryAcquire();
+                    if (current == null) {
+                        Thread.yield();
+                    }
                 }
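The suggested yield-on-contention loop can be modeled standalone (`LookupSlot` and `SpinAcquire` are illustrative stand-ins for the lookup map and its caller, not the OpenSearch classes; the real `tryAcquire()` returns a lock handle, modeled here as an arbitrary object):

```java
// Stand-in for the lookup map: tryAcquire() returns null when the read
// lock cannot be taken, e.g. because the map rotated and was closed.
class LookupSlot {
    private final Object value;
    private volatile boolean closed;
    LookupSlot(Object value) { this.value = value; }
    void close() { closed = true; }
    Object tryAcquire() { return closed ? null : value; }
}

class SpinAcquire {
    // Retry loop with a yield between attempts, so a refresh-heavy workload
    // does not spin hot while the current map keeps rotating.
    static Object acquireWithYield(java.util.function.Supplier<LookupSlot> current) {
        Object acquired = null;
        while (acquired == null) {
            acquired = current.get().tryAcquire();
            if (acquired == null) {
                Thread.yield(); // back off instead of busy-waiting
            }
        }
        return acquired;
    }
}
```

`Thread.yield()` is only a hint to the scheduler; a short parked sleep or an exponential backoff would give stronger guarantees if contention proved measurable.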
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)

154-204: New test for delete with document in old child writer.

The test exercises a concurrent scenario where a delete occurs while refresh is transitioning the map. However:

  1. Line 171: AtomicBoolean run is set to false at line 193 but never read, making it dead code.
  2. Lines 174-176: The try block catches all exceptions silently, which could hide test failures in the refresher thread.
♻️ Remove unused variable and improve exception visibility
         CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
         CountDownLatch latch = new CountDownLatch(1);
-        AtomicBoolean run = new AtomicBoolean(true);
+        AtomicReference<Exception> refreshException = new AtomicReference<>();
         Thread refresher = new Thread(() -> {
             latch.countDown();
             try {
                 compositeIndexWriter.beforeRefresh();
-            } catch (Exception ignored) {}
+            } catch (Exception e) {
+                refreshException.set(e);
+            }
         });
 
         refresher.start();
         try {
             latch.await();
             // ... delete logic
         } finally {
             IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-            run.set(false);
             refresher.join();
+            if (refreshException.get() != null) {
+                throw refreshException.get();
+            }
             // ... remaining assertions
         }
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

498-757: Consider extracting the failing directory supplier to reduce duplication.

The dirSupplier lambda for creating a FilterDirectory that throws OutOfMemoryError is repeated verbatim in all six tragic exception tests. Consider extracting it to a private helper method.

♻️ Proposed refactor
private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}

Then in each test:

-Supplier<Directory> dirSupplier = () -> new FilterDirectory(newDirectory()) {
-    ...
-};
+Supplier<Directory> dirSupplier = createFailingDirectorySupplier();
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b08ba39 and 1880fbb.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (4)
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
🧰 Additional context used
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
  • GitHub Check: Mend Security Check
🔇 Additional comments (33)
CHANGELOG.md (1)

33-33: LGTM!

The changelog entry correctly documents the fix and follows the established format.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (5)

131-144: Well-documented childWriterPendingNumDocs field.

The documentation clearly explains the purpose and the acceptable overshoot behavior during refresh. This is a good practice for tracking state in concurrent scenarios.


348-361: Improved safety: tryAcquire() now checks for closed lookup.

The addition of the closed check after acquiring the lock prevents returning a closed lookup. This is correct since the read lock is held when checking isClosed(), ensuring the state is consistent during this check.


545-563: Correct pending doc accounting during refresh.

The approach properly tracks pending docs from old child writers and subtracts them from childWriterPendingNumDocs after merging into the parent writer, preventing double-counting. Using AtomicLong is fine here even though this runs single-threaded during refresh.


719-742: Robust handling of closed writers in getFlushingBytesUtil().

The logic correctly distinguishes between normal closure (ignored) and tragic exceptions (rethrown). This prevents false errors during refresh when writers are intentionally closed.


854-880: Defensive rollback handling with AlreadyClosedException.

The added try-catch blocks around child writer rollback prevent failures during cleanup when writers are already closed. The isOpen() check before rollback is a good guard, though the exception handling provides additional safety.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)

55-67: API signature changes add size parameter.

The interface changes are clean and correctly position the size parameter before the varargs Field... softDeletesField. Implementations in CompositeIndexWriter and LuceneIndexWriter have been updated accordingly.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

243-257: Convenience overload for config(Store store).

This is a clean addition that reduces boilerplate in tests by delegating to the full config() method with default parameters.


509-564: Well-structured FlushingIndexWriterFactory for test scenarios.

The factory correctly:

  • Wraps IndexWriter to flush after each write operation
  • Tracks directories for proper cleanup via Closeable
  • Supports conditional use of a failing directory supplier via AtomicBoolean

This is useful for testing flush behavior and failure scenarios.

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)

134-154: Interface compliance: size parameter added but unused.

The size parameter is correctly added to match the DocumentIndexWriter interface. Since LuceneIndexWriter delegates directly to Lucene's IndexWriter (which handles its own pending doc tracking), the size parameter is intentionally unused here. This is appropriate for the wrapper pattern.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

34-34: Test updates correctly pass document size.

All addDocuments() calls are properly updated to include operation.docs().size() as the size parameter.

Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263


122-130: softUpdateDocuments() calls updated with size parameter.

The calls correctly position the size parameter (operation.docs().size()) before the softDeletesField vararg, matching the updated interface signature.

Also applies to: 227-235, 268-276

server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)

694-696: The filtering of getCompositeFieldTypes() to only CompositeDataCubeFieldType instances is intentional and correct. All external callers explicitly expect only this type: they either cast to CompositeDataCubeFieldType, check instanceof StarTreeFieldType, or call isEmpty(). The internal compositeMappedFieldTypes field still stores all CompositeMappedFieldType implementations (including ContextAwareGroupingFieldType) for field lookup purposes via buildCompositeFieldLookup(), while the public API appropriately returns only the DataCube types.
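The store-everything / expose-a-subtype split described above reduces to an `instanceof` filter over the internal set. A toy hierarchy (these record and class names are illustrative, not the OpenSearch mapper types):

```java
import java.util.Set;
import java.util.stream.Collectors;

// Toy hierarchy mirroring the described relationship: the registry stores
// all composite field types but exposes only the data-cube subtype.
interface CompositeFieldType { String name(); }
record DataCubeFieldType(String name) implements CompositeFieldType {}
record GroupingFieldType(String name) implements CompositeFieldType {}

class FieldTypeRegistry {
    private final Set<CompositeFieldType> all;
    FieldTypeRegistry(Set<CompositeFieldType> all) { this.all = all; }

    // Narrowed accessor: callers that cast or check instanceof stay safe,
    // while internal lookups can still see the full set.
    Set<DataCubeFieldType> getCompositeFieldTypes() {
        return all.stream()
            .filter(t -> t instanceof DataCubeFieldType)
            .map(t -> (DataCubeFieldType) t)
            .collect(Collectors.toSet());
    }
}
```

With this shape, adding a new non-cube composite type (like the grouping field type) never leaks into callers that assume the data-cube contract.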

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)

1240-1247: LGTM - Size parameter correctly added for multi-document indexing path.

The docs.size() parameter is correctly passed to align with the new size-aware addDocuments API. The append counter is already incremented by docs.size() which is consistent.


1249-1258: LGTM - Consistent with the multi-document stale docs path.

The size parameter is correctly added for stale document indexing, maintaining consistency with the addDocs method above.


1384-1390: LGTM - Size parameter correctly added for soft update path.

The docs.size() parameter is correctly positioned before softDeletesField in the softUpdateDocuments call, and the update counter is already correctly incremented by docs.size().

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (17)

40-40: LGTM - Import added for new test utilities.


183-183: LGTM - API call updated with size parameter.


231-231: LGTM - API call updated with size parameter.


309-312: LGTM - Exception test correctly updated with new API.


334-334: LGTM - API call updated with size parameter.


374-374: LGTM - API call updated with size parameter.


437-437: LGTM - API calls updated with size parameter.

Also applies to: 447-447


480-480: LGTM - API calls updated with size parameter.

Also applies to: 486-486


498-532: Good test coverage for tragic exception handling on current map.

The test correctly validates that ramBytesUsed() throws AlreadyClosedException when a tragic exception has occurred.


534-583: Test correctly validates behavior during map rotation with tragic exception.

The lock + refresher pattern effectively simulates the state where the tragic exception is in the old map during rotation.


585-619: LGTM - Validates getFlushingBytes behavior with tragic exception.


621-670: LGTM - Validates getFlushingBytes during map rotation with tragic exception.


672-706: LGTM - Validates getTragicException returns non-null after failure.


708-757: LGTM - Validates getTragicException during map rotation.


772-772: LGTM - API call updated with size parameter.


785-788: LGTM - Exception test correctly updated with new API.


839-860: LGTM - Both addDocuments and softUpdateDocuments calls correctly updated with size parameter.

The softUpdateDocuments calls correctly place operation.docs().size() before softDeletesField as per the new API signature.

@github-actions

github-actions bot commented Jan 9, 2026

❌ Gradle check result for 1880fbb: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

724-730: Address incomplete refactoring: Remove orphaned LookupMapLockAcquisitionException settings.

The exception class was successfully removed, but related configuration infrastructure remains as dead code. The setting constant INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION, the field maxRetryOnLookupMapAcquisitionException, its getter method, and the comment at line 515 of IndexSettings.java should also be removed since they reference a non-existent exception and are never used.

Remove from server/src/main/java/org/opensearch/index/IndexSettings.java:

  • Lines 514-523: Setting definition and its comment
  • Line 933: Field declaration
  • Line 1149: Field initialization in constructor
  • Line 1318: Setting registration
  • Lines 2114-2119: Setter and getter methods

Remove from server/src/main/java/org/opensearch/common/settings/IndexScopedSettings.java:

  • Line 181: Setting registration
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

1069-1083: Clarify childWriterPendingNumDocs counter scope in deleteInLucene.

The documentation (lines 131-143) explicitly defines childWriterPendingNumDocs to track operations on child-level IndexWriters. However, deleteInLucene is called with accumulatingIndexWriter (the parent) at line 1065, yet unconditionally increments childWriterPendingNumDocs at line 1082 regardless of writer type. This contradicts the documented scope.

The code at line 1059 shows awareness of this distinction (comment: "only increment this when addDeleteEntry for child writers are called"), and the documentation acknowledges overcounting is acceptable (lines 141-143). Clarify whether this parent-writer increment is intentional for safety or should be scoped conditionally. Add an explanatory comment if this counter overshooting by parent operations is by design.

🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java:
- Line 55: The interface parameter `size` on
addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) and
softUpdateDocuments(...) is undocumented and unvalidated which can corrupt
CompositeIndexWriter's pending counts; add clear javadoc to both interface
methods explaining this parameter is the document count (rename to docCount only
if you accept a breaking change), then in CompositeIndexWriter.addDocuments and
CompositeIndexWriter.softUpdateDocuments validate that size equals the actual
iterable count (throw IllegalArgumentException on mismatch), and make
LuceneIndexWriter either perform the same validation or explicitly
document/handle the parameter consistently so all implementations behave the
same.
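The validation suggested above amounts to counting the iterable while adding and failing fast on a mismatch, so a wrong `size` can never skew the pending-docs counter. A minimal sketch under those assumptions (`SizeCheckedWriter` and its generic `Document` parameter are placeholders, not the real `DocumentIndexWriter` contract):

```java
// Sketch of the suggested size validation: the declared size must match
// the iterable's actual count before the pending counter is updated.
class SizeCheckedWriter<Document> {
    long pendingDocs = 0;

    void addDocuments(Iterable<Document> docs, int size) {
        int actual = 0;
        for (Document d : docs) {
            actual++; // in real code, each doc would also be written here
        }
        if (actual != size) {
            throw new IllegalArgumentException(
                "declared size " + size + " != actual document count " + actual);
        }
        pendingDocs += size;
    }
}
```

A cheaper alternative is to skip the count and trust `docs.size()` at the call sites, which is what the callers in this PR do; the validation only pays off if third-party `DocumentIndexWriter` callers are expected.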

In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 187-192: Update the Javadoc wording to refer to "Context Aware
Grouping" instead of "Context Aware Segment" for consistency with the class
ContextAwareGroupingFieldMapper (and its content type "context_aware_grouping");
specifically, edit the comment above the canDeriveSource() method and the other
Javadoc occurrence in this class to replace "Context Aware Segment" with
"Context Aware Grouping".

In @server/src/main/java/org/opensearch/index/mapper/MapperService.java:
- Around line 694-696: getCompositeFieldTypes() can NPE because
compositeMappedFieldTypes may be null before internalMerge(); update the method
to return Collections.emptySet() when compositeMappedFieldTypes is null and
restrict the return type/name to reflect that it only returns
CompositeDataCubeFieldType instances (e.g., rename to
getCompositeDataCubeFieldTypes() and change return type to
Set<CompositeDataCubeFieldType>), and update callers (like
isCompositeIndexPresent()) to use the new method; ensure you reference the
compositeMappedFieldTypes field and internalMerge() assignment when implementing
the null check and API rename.

In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java:
- Around line 154-204: The test declares an unused AtomicBoolean variable `run`
(AtomicBoolean run) which is set but never read; remove the unused `run`
variable and its set call to simplify the test: delete the `AtomicBoolean run =
new AtomicBoolean(true);` declaration and the `run.set(false);` line in the
finally block, leaving the Thread `refresher` logic and surrounding calls to
`compositeIndexWriter.beforeRefresh()`/`afterRefresh()` unchanged so behavior
and synchronization via `latch` remain intact.
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)

509-564: Consider renaming failingDirectory for clarity.

The FlushingIndexWriterFactory test utility is well-implemented. However, at line 523, the variable name failingDirectory is misleading—it holds either the failing directory or the regular directory depending on the useFailingDirectorySupplier flag.

♻️ Suggested variable rename for clarity
        @Override
        public IndexWriter createWriter(Directory directory, IndexWriterConfig config) throws IOException {
-            Directory failingDirectory = useFailingDirectorySupplier.get() ? failingWriteDirectorySupplier.get() : directory;
-            directories.add(failingDirectory);
-            return new IndexWriter(failingDirectory, config) {
+            Directory selectedDirectory = useFailingDirectorySupplier.get() ? failingWriteDirectorySupplier.get() : directory;
+            directories.add(selectedDirectory);
+            return new IndexWriter(selectedDirectory, config) {
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

494-501: Busy-wait loop without backoff could cause CPU spinning.

The while (current == null || current.isClosed()) loop with only tryAcquire() may spin aggressively when the map is being rotated frequently. Consider adding Thread.yield() or a brief sleep to reduce CPU contention.

Proposed fix
                 while (current == null || current.isClosed()) {
                     // This function acquires a first read lock on a map which does not have any write lock present. Current keeps
                     // on getting rotated during refresh, so there will be one current on which read lock can be obtained.
                     // Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was
                     // never applied on this map as write lock gets only during closing time. We are doing this instead of acquire,
                     // because acquire can also apply a read lock in case refresh completed and map is closed.
                     current = this.current.mapReadLock.tryAcquire();
+                    if (current == null || current.isClosed()) {
+                        Thread.yield();
+                    }
                 }
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

498-532: Consider extracting duplicated directory supplier to a helper method.

The failing FilterDirectory with OutOfMemoryError pattern is repeated across six test methods (lines 499-510, 535-546, 586-597, 622-633, 673-684, 709-720). This could be extracted to a helper method for maintainability.

Proposed refactor
private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}

Then use in tests:

-        Supplier<Directory> dirSupplier = () -> new FilterDirectory(newDirectory()) {
-            @Override
-            public IndexOutput createOutput(String name, IOContext context) throws IOException {
-                ...
-            }
-        };
+        Supplier<Directory> dirSupplier = createFailingDirectorySupplier();
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1880fbb and b6df798.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (6)
  • CHANGELOG.md
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: Mend Security Check
🔇 Additional comments (20)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

17-17: LGTM! Imports support the new test infrastructure.

The new imports are appropriate for the FlushingIndexWriterFactory test utility class added below.

Also applies to: 81-81, 89-89


243-257: LGTM! Convenient overload for test scenarios.

This overload simplifies creating an EngineConfig with a custom Store while using sensible defaults for other parameters. The delegation pattern is clean and appropriate for test infrastructure.

server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (2)

11-12: LGTM: Required imports for new method signatures.

The imports are correctly added to support the new deriveSource method signature.


194-199: Verify method signature and correct terminology in Javadoc.

Two points:

  1. Minor: The Javadoc mentions "Context Aware Segment" instead of "Context Aware Grouping". Please update for consistency with the class purpose.

  2. Verification needed: Please confirm that the parent class ParametrizedFieldMapper has a matching deriveSource signature with a void return type. The no-op implementation is appropriate since parseCreateField (line 175) prevents these fields from being ingested, but we should verify the signature is correct.

The verification script for canDeriveSource() above will also check this method's signature.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (3)

16-17: LGTM!

The new imports for CountDownLatch and AtomicBoolean are correctly added to support the new concurrency test.


34-34: LGTM!

The addDocuments call signature is updated to include the size parameter, consistent with the API changes in CompositeIndexWriter.


122-130: LGTM!

The softUpdateDocuments call signature is correctly updated to include the size parameter.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (7)

131-144: LGTM!

Good documentation explaining the purpose and behavior of childWriterPendingNumDocs, including the intentional overshoot during refresh to avoid undershooting issues.


352-357: LGTM!

Good defensive check to close and return null when the lookup is already closed, preventing operations on stale/closed maps.


549-559: LGTM!

Proper tracking of pending docs from closed child writers to decrement childWriterPendingNumDocs after syncing with the parent writer.


606-609: LGTM!

Package-private test hook for acquiring read locks, appropriately documented as being for unit tests.


719-742: LGTM!

Good defensive handling of AlreadyClosedException - re-throwing only when there's a tragic exception, otherwise silently skipping closed writers.


855-875: LGTM!

Proper rollback handling that ensures all child-level IndexWriters are closed to prevent file leaks, with appropriate exception handling for already-closed writers.


927-942: LGTM!

The size-aware addDocuments correctly increments childWriterPendingNumDocs by the size parameter after successful indexing.
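The overshoot-then-decrement accounting described here can be illustrated with a small standalone sketch. This is not the `CompositeIndexWriter` implementation, only a hypothetical model of the counter's lifecycle: writes add eagerly, and the count comes down only after a closed child writer's docs are synced into the parent.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PendingDocCounter {
    private final AtomicLong pending = new AtomicLong();

    // Called after a successful multi-document write; may briefly overshoot
    // the true pending count during a refresh, but never undershoots.
    long onDocsAdded(int docCount) {
        return pending.addAndGet(docCount);
    }

    // Called after a closed child writer's docs are synced into the parent.
    long onChildSynced(long syncedDocs) {
        return pending.addAndGet(-syncedDocs);
    }

    long pendingDocs() {
        return pending.get();
    }

    public static void main(String[] args) {
        PendingDocCounter c = new PendingDocCounter();
        c.onDocsAdded(5);
        c.onDocsAdded(3);    // counter overshoots until the refresh syncs
        c.onChildSynced(5);
        System.out.println(c.pendingDocs()); // prints 3
    }
}
```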

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)

40-40: LGTM!

Import for Supplier added to support the new tragic exception test patterns.


183-183: LGTM!

The addDocuments call signature is correctly updated to include the size parameter.


534-583: LGTM!

Good test coverage for ramBytesUsed behavior when a tragic exception occurs on the old writer during refresh transition. The test correctly uses latch synchronization to ensure the writer is in the old map before assertion.


672-706: LGTM!

Good test coverage verifying that getTragicException() returns a non-null value when a tragic exception occurs on the current writer.


839-860: LGTM!

The addDocuments and softUpdateDocuments calls are correctly updated to include the size parameter in the testHasBlocksMergeFullyDelSegments test.

server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)

87-87: LGTM!

The import is correctly added to support the stream filtering operations in getCompositeFieldTypes().

void deleteUnusedFiles() throws IOException;

long addDocuments(Iterable<ParseContext.Document> docs, Term uid) throws IOException;
long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 202


🏁 Script executed:

git ls-files | grep -i "documentindexwriter"

Repository: opensearch-project/OpenSearch

Length of output: 146


🏁 Script executed:

git ls-files | grep -i "engine" | head -20

Repository: opensearch-project/OpenSearch

Length of output: 1671


🏁 Script executed:

cat -n server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 3337


🏁 Script executed:

rg -l "implements DocumentIndexWriter" --type=java

Repository: opensearch-project/OpenSearch

Length of output: 219


🏁 Script executed:

rg -n "addDocuments|softUpdateDocuments" server/src/main/java/org/opensearch/index/engine/ --type=java -A 5 -B 2

Repository: opensearch-project/OpenSearch

Length of output: 14758


🏁 Script executed:

grep -n "size.*mismatch\|validate.*size\|assert.*size" server/src/main/java/org/opensearch/index/engine/*.java

Repository: opensearch-project/OpenSearch

Length of output: 2051


🏁 Script executed:

rg -n "childWriterPendingNumDocs" server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java -B 3 -A 3

Repository: opensearch-project/OpenSearch

Length of output: 3966


🏁 Script executed:

git log --oneline -n 20 -- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 144


🏁 Script executed:

git show HEAD:server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java 2>/dev/null | head -80

Repository: opensearch-project/OpenSearch

Length of output: 2311


🏁 Script executed:

rg "implements DocumentIndexWriter" --type=java

Repository: opensearch-project/OpenSearch

Length of output: 350


Add javadoc and validation for the size parameter in addDocuments() and softUpdateDocuments().

The size parameter is critical for CompositeIndexWriter, which uses it to track pending documents via childWriterPendingNumDocs.addAndGet(size). However, there is no documentation or validation:

  1. Missing javadoc: The parameter lacks documentation explaining that it represents the document count and must match the iterable's actual size.

  2. No validation: If a caller passes an incorrect size, CompositeIndexWriter will accumulate wrong pending doc counts, corrupting indexing state. Meanwhile, LuceneIndexWriter silently ignores the parameter, creating inconsistent behavior.

  3. Breaking API change: As a public interface, this signature change breaks external implementations.

  4. Ambiguous parameter name: size could mean document count or byte size. Rename to docCount for clarity.

Suggested improvements

Add javadoc:

/**
 * Adds multiple documents to the index.
 * 
 * @param docs the documents to add
 * @param uid the unique identifier term
 * @param size the number of documents in the iterable (must match actual count)
 * @return the sequence number
 * @throws IOException if an I/O error occurs
 * @throws IllegalArgumentException if size doesn't match the actual document count
 */
long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;

Consider renaming for clarity:

-long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;
+long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int docCount) throws IOException;

Add validation in CompositeIndexWriter.addDocuments() and softUpdateDocuments():

if (size != Iterables.size(docs)) {
    throw new IllegalArgumentException("size mismatch: expected " + size + " but got " + Iterables.size(docs));
}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java at
line 55, The interface parameter `size` on
addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) and
softUpdateDocuments(...) is undocumented and unvalidated which can corrupt
CompositeIndexWriter's pending counts; add clear javadoc to both interface
methods explaining this parameter is the document count (rename to docCount only
if you accept a breaking change), then in CompositeIndexWriter.addDocuments and
CompositeIndexWriter.softUpdateDocuments validate that size equals the actual
iterable count (throw IllegalArgumentException on mismatch), and make
LuceneIndexWriter either perform the same validation or explicitly
document/handle the parameter consistently so all implementations behave the
same.
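The validation suggested above amounts to counting the iterable and failing fast before any counter is mutated. A minimal sketch, with hypothetical names and detached from the engine classes:

```java
import java.util.List;

public class DocCountValidation {
    // Fails fast when the caller-supplied docCount does not match the
    // iterable's actual size, before any accounting state is touched.
    static <T> void validateDocCount(Iterable<T> docs, int docCount) {
        int actual = 0;
        for (T ignored : docs) {
            actual++;
        }
        if (actual != docCount) {
            throw new IllegalArgumentException("size mismatch: expected " + docCount + " but got " + actual);
        }
    }

    public static void main(String[] args) {
        validateDocCount(List.of("a", "b"), 2); // passes silently
        try {
            validateDocCount(List.of("a", "b"), 3);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

One caveat with this approach: counting consumes the iterable once, which is harmless for re-iterable collections but would be wrong for a single-use iterable, so the real implementation would need to confirm the document iterable can be traversed twice.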

Comment on lines +187 to +192
/**
* Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
* validation.
*/
@Override
public void canDeriveSource() {}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for canDeriveSource method in parent classes
rg -n -B 2 -A 3 'canDeriveSource' --type=java | head -100

Repository: opensearch-project/OpenSearch

Length of output: 13274


🏁 Script executed:

#!/bin/bash
# Search for method definition (not calls) in parent classes
rg -n 'public\s+(void|boolean)\s+canDeriveSource\s*\(' --type=java

Repository: opensearch-project/OpenSearch

Length of output: 743


🏁 Script executed:

#!/bin/bash
# Get the Javadoc and definition context for parent Mapper.canDeriveSource
sed -n '305,315p' server/src/main/java/org/opensearch/index/mapper/Mapper.java

echo "---"

# Get the class-level Javadoc for ContextAwareGroupingFieldMapper
head -50 server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java

Repository: opensearch-project/OpenSearch

Length of output: 2300


🏁 Script executed:

#!/bin/bash
# Get lines around the canDeriveSource method in ContextAwareGroupingFieldMapper
sed -n '180,200p' server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java

echo "---"

# Search for "Segment" in the file
grep -n -i "segment" server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java

Repository: opensearch-project/OpenSearch

Length of output: 890


Update Javadoc terminology to match the class name.

The Javadoc mentions "Context Aware Segment" but this class handles "Context Aware Grouping" fields (as indicated by the class name ContextAwareGroupingFieldMapper, the content type "context_aware_grouping", and the class documentation). Update the Javadoc at lines 188 and 195 to say "Context Aware Grouping" instead of "Context Aware Segment" for consistency.

🤖 Prompt for AI Agents
In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
around lines 187 - 192, Update the Javadoc wording to refer to "Context Aware
Grouping" instead of "Context Aware Segment" for consistency with the class
ContextAwareGroupingFieldMapper (and its content type "context_aware_grouping");
specifically, edit the comment above the canDeriveSource() method and the other
Javadoc occurrence in this class to replace "Context Aware Segment" with
"Context Aware Grouping".

Comment on lines +694 to +696
return compositeMappedFieldTypes.stream()
.filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
.collect(Collectors.toSet());

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the MapperService.java file structure and find the field declaration
fd -t f MapperService.java | head -1 | xargs -I {} bash -c '
  echo "=== File found: {} ==="
  wc -l {}'

Repository: opensearch-project/OpenSearch

Length of output: 232


🏁 Script executed:

# Check the field declaration around line 230
fd -t f MapperService.java | head -1 | xargs -I {} bash -c '
  echo "=== Lines around 230 (field declaration) ==="
  sed -n "220,240p" {} | cat -n
  echo ""
  echo "=== Lines around 552 (claimed internalMerge assignment) ==="
  sed -n "545,560p" {} | cat -n'

Repository: opensearch-project/OpenSearch

Length of output: 1848


🏁 Script executed:

# Find constructor and initialization
fd -t f MapperService.java | head -1 | xargs -I {} bash -c '
  echo "=== Constructor search ==="
  grep -n "MapperService\s*(" {} | head -5'

Repository: opensearch-project/OpenSearch

Length of output: 129


🏁 Script executed:

# Find all call sites of getCompositeFieldTypes
rg -n 'getCompositeFieldTypes\s*\(' --type=java -C 3

Repository: opensearch-project/OpenSearch

Length of output: 30002


🏁 Script executed:

# Check the MapperService constructor body to see if it initializes compositeMappedFieldTypes
sed -n '234,350p' server/src/main/java/org/opensearch/index/mapper/MapperService.java | cat -n

Repository: opensearch-project/OpenSearch

Length of output: 5870


Fix null pointer exception in getCompositeFieldTypes().

The field compositeMappedFieldTypes is declared at line 230 without initialization and only assigned during internalMerge() at line 552. The method directly calls .stream() without a null check at line 694, which will throw NullPointerException if called before any merge operation (e.g., early calls from isCompositeIndexPresent() at line 690).

Additionally, the method name is misleading—it filters to return only CompositeDataCubeFieldType instances despite the return type being the broader CompositeMappedFieldType.

Proposed fix
 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
+    if (compositeMappedFieldTypes == null) {
+        return Collections.emptySet();
+    }
     return compositeMappedFieldTypes.stream()
         .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
         .collect(Collectors.toSet());
 }
🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/mapper/MapperService.java around
lines 694 - 696, getCompositeFieldTypes() can NPE because
compositeMappedFieldTypes may be null before internalMerge(); update the method
to return Collections.emptySet() when compositeMappedFieldTypes is null and
restrict the return type/name to reflect that it only returns
CompositeDataCubeFieldType instances (e.g., rename to
getCompositeDataCubeFieldTypes() and change return type to
Set<CompositeDataCubeFieldType>), and update callers (like
isCompositeIndexPresent()) to use the new method; ensure you reference the
compositeMappedFieldTypes field and internalMerge() assignment when implementing
the null check and API rename.
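The proposed guard plus instanceof filter can be modeled in a self-contained sketch. The types here are stand-ins for `CompositeMappedFieldType` and `CompositeDataCubeFieldType`, not the real mapper classes:

```java
import java.util.Collections;
import java.util.Set;
import java.util.stream.Collectors;

public class NullSafeFilter {
    interface CompositeType {}
    static class DataCubeType implements CompositeType {}
    static class OtherType implements CompositeType {}

    // Returns an empty set when the backing field has not been initialized
    // yet (avoiding the NPE on early calls), then narrows by subtype.
    static Set<CompositeType> filterDataCube(Set<CompositeType> all) {
        if (all == null) {
            return Collections.emptySet();
        }
        return all.stream().filter(t -> t instanceof DataCubeType).collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        System.out.println(filterDataCube(null).size()); // prints 0
        Set<CompositeType> all = Set.of(new DataCubeType(), new OtherType());
        System.out.println(filterDataCube(all).size());  // prints 1
    }
}
```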

@github-actions
Copy link
Contributor

github-actions bot commented Jan 9, 2026

❌ Gradle check result for b6df798: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: Update stale comment referencing removed LookupMapLockAcquisitionException.

The exception class has been successfully removed from code, but a documentation comment in server/src/main/java/org/opensearch/index/IndexSettings.java at line 515 still references LookupMapLockAcquisitionException. Update or remove this outdated reference.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

1069-1083: Incrementing childWriterPendingNumDocs in deleteInLucene may cause accounting drift.

deleteInLucene is called with accumulatingIndexWriter (the parent), not a child writer. Incrementing childWriterPendingNumDocs here conflates parent and child accounting. The parent writer's getPendingNumDocs() is already queried separately in getPendingNumDocs() (line 747), so this increment would double-count deletes on the parent.

#!/bin/bash
# Verify how deleteInLucene is called and whether the increment is correct
rg -n -A5 'deleteInLucene\(' server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
🤖 Fix all issues with AI agents
In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java:
- Around line 154-204: Remove the unused AtomicBoolean run and simplify the
refresher thread to only countDown the latch and call
compositeIndexWriter.beforeRefresh(), then terminate; replace the current
refresher lambda with one that does not depend on run. Wrap the call to
compositeIndexWriter.deleteDocument(...) in a try/catch that rethrows the
exception after performing cleanup so the test still fails but deterministic
cleanup runs. In the finally block always release lock.getMapReadLock() and join
the refresher thread before invoking
compositeIndexWriter.afterRefresh()/beforeRefresh() sequences and
IOUtils.closeWhileHandlingException(compositeIndexWriter); ensure you reference
the existing symbols: run (remove), refresher (modify), latch,
compositeIndexWriter.deleteDocument, lock.getMapReadLock(),
compositeIndexWriter.beforeRefresh/afterRefresh, and
IOUtils.closeWhileHandlingException(compositeIndexWriter).
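The cleanup discipline this prompt asks for — latch for startup, try/finally for teardown, unconditional join — can be shown in a standalone sketch. The `AtomicBoolean` stands in for the refresher's single action (e.g. `beforeRefresh()`); nothing here is the actual test code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class DeterministicCleanup {
    static boolean runScenario() throws InterruptedException {
        CountDownLatch started = new CountDownLatch(1);
        AtomicBoolean refreshed = new AtomicBoolean();

        // Refresher only signals, does its one action, and terminates —
        // no shared "run" flag needed.
        Thread refresher = new Thread(() -> {
            started.countDown();
            refreshed.set(true); // stands in for beforeRefresh()
        });

        refresher.start();
        try {
            started.await();
            // the operation under test would run here and may throw
        } finally {
            refresher.join(); // cleanup always runs, even on failure
        }
        return refreshed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runScenario()); // prints true
    }
}
```

Because the join happens in the finally block, any assertion made after it observes a fully terminated refresher, which is what makes the cleanup (and the test's failure mode) deterministic.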
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

153-168: Consider improving exception handling in the test.

The test validates that the new canDeriveSource() and deriveSource() methods can be invoked without errors. However, the broad catch-all exception handler with fail(e.getMessage()) may mask specific issues.

♻️ Suggested refinement
-    public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
+    public void testContextAwareFieldMapperWithDerivedSource() {
         ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
         ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
             "context_aware_grouping",
             fieldType,
             new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
         );
         LeafReader leafReader = mock(LeafReader.class);
 
-        try {
-            mapper.canDeriveSource();
-            mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-        } catch (Exception e) {
-            fail(e.getMessage());
-        }
+        // Verify canDeriveSource returns false (no-op implementation)
+        assertFalse(mapper.canDeriveSource());
+        
+        // Verify deriveSource completes without throwing (no-op implementation)
+        assertDoesNotThrow(() -> mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0));
     }

This makes the test intent clearer and leverages JUnit 5's assertDoesNotThrow if available, or simply remove the try-catch and let any unexpected exceptions fail the test naturally.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

498-532: Consider extracting the duplicated directory supplier into a helper method.

All six new tragic exception tests create identical FilterDirectory implementations with the same OutOfMemoryError throwing behavior. This duplication could be reduced by extracting a shared helper.

Suggested refactor

Add a helper method in the test class:

private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}

Then use it in each test:

-        Supplier<Directory> dirSupplier = () -> new FilterDirectory(newDirectory()) {
-            @Override
-            public IndexOutput createOutput(String name, IOContext context) throws IOException {
-                ...
-            }
-        };
+        Supplier<Directory> dirSupplier = createFailingDirectorySupplier();

Also applies to: 534-583, 585-619, 621-670, 672-706, 708-757

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

494-501: Potential busy-wait if tryAcquire consistently returns closed lookups.

The while loop retries tryAcquire when the lookup is null or closed. If tryAcquire returns null due to closed lookups repeatedly (e.g., during rapid refresh cycles), this could spin. Consider adding a yield or brief sleep, or use the timeout variant of tryAcquire.

Suggested improvement
                 while (current == null || current.isClosed()) {
                     // This function acquires a first read lock on a map which does not have any write lock present...
                     current = this.current.mapReadLock.tryAcquire();
+                    if (current == null) {
+                        Thread.yield();
+                    }
                 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b6df798 and cbc3338.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (3)
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (4)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
server/src/main/java/org/opensearch/script/ContextAwareGroupingScript.java (1)
  • ContextAwareGroupingScript (22-44)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Analyze (java)
  • GitHub Check: Mend Security Check
🔇 Additional comments (30)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-730: LGTM - Clean removal of obsolete exception handling.

The removal of LookupMapLockAcquisitionException retry logic and its helper method is appropriate, as the exception class has been removed from the codebase entirely. The remaining retry logic for version conflicts (lines 724-729) is intact and correct.

server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)

87-87: LGTM!

Import correctly added to support the streaming operation in getCompositeFieldTypes().


694-696: The filtering to CompositeDataCubeFieldType is intentional and correct.

The getCompositeFieldTypes() method filters to only CompositeDataCubeFieldType instances by design. Every caller throughout the codebase (StarTreeQueryContext, StarDateFieldMapper, DateHistogramAggregator, etc.) expects CompositeDataCubeFieldType and immediately casts the result. This is the public API for querying DataCube-based composites, while buildCompositeFieldLookup() internally manages all CompositeMappedFieldType variants for indexing purposes. This separation is intentional, not an inconsistency.
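The filter-to-subtype pattern described here can be sketched in isolation. This is a minimal, dependency-free illustration; the interface and record names below are simplified stand-ins, not the actual OpenSearch mapper classes.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Stand-in for the common supertype (akin to CompositeMappedFieldType).
interface CompositeFieldType {
    String name();
}

public class CompositeFieldTypeFilterSketch {
    // Stand-ins for the two variants: only the DataCube one is exposed publicly.
    record DataCubeFieldType(String name) implements CompositeFieldType {}

    record GroupingFieldType(String name) implements CompositeFieldType {}

    // Mirrors the getCompositeFieldTypes() idea: keep every variant internally,
    // but return only the DataCube subtype to callers that will cast to it.
    static Set<DataCubeFieldType> dataCubeTypesOnly(Set<CompositeFieldType> all) {
        return all.stream()
            .filter(DataCubeFieldType.class::isInstance)
            .map(DataCubeFieldType.class::cast)
            .collect(Collectors.toUnmodifiableSet());
    }

    public static void main(String[] args) {
        Set<CompositeFieldType> all = Set.of(
            new DataCubeFieldType("startree"),
            new GroupingFieldType("grouping_criteria")
        );
        Set<DataCubeFieldType> cubes = dataCubeTypesOnly(all);
        if (cubes.size() != 1) throw new AssertionError("expected only the DataCube variant");
        System.out.println(cubes.iterator().next().name()); // startree
    }
}
```

Because every caller casts to the subtype anyway, filtering at the accessor keeps the unsafe cast in one place instead of nine.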

server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

11-11: LGTM! New imports support the test.

The imports for LeafReader, XContentFactory, and IOException are appropriately added to support the new test method testContextAwareFieldMapperWithDerivedSource.

Also applies to: 13-13, 20-20

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (2)

55-55: LGTM! Size-aware API addition.

The addDocuments method now accepts a size parameter to enable size-aware write accounting. This aligns with the PR's objective to track pending document counts more accurately across multi-document operations.


59-67: LGTM! Consistent size parameter placement.

The softUpdateDocuments method now includes the size parameter positioned before the varargs softDeletesField. This placement is correct and maintains consistency with the addDocuments signature change.
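The shape of the size-aware API can be sketched as follows. The interface mimics the signatures described above, but the types are illustrative placeholders rather than the real DocumentIndexWriter; the point is that the caller supplies docs.size() so the writer can account for pending docs without re-iterating a possibly single-use Iterable.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class SizeAwareWriterSketch {
    interface DocWriter {
        long addDocuments(Iterable<String> docs, String uid, int size);

        void softUpdateDocuments(String uid, Iterable<String> docs,
                                 long version, long seqNo, long primaryTerm,
                                 int size, String... softDeletesField);
    }

    // A writer that uses the explicit size for pending-doc accounting, the way
    // the PR's composite writer does; a plain Lucene-backed writer would simply
    // ignore the parameter.
    static class CountingWriter implements DocWriter {
        final AtomicLong pendingNumDocs = new AtomicLong();

        @Override
        public long addDocuments(Iterable<String> docs, String uid, int size) {
            // ...delegate the actual write to the underlying writer here...
            return pendingNumDocs.addAndGet(size);
        }

        @Override
        public void softUpdateDocuments(String uid, Iterable<String> docs,
                                        long version, long seqNo, long primaryTerm,
                                        int size, String... softDeletesField) {
            pendingNumDocs.addAndGet(size);
        }
    }

    public static void main(String[] args) {
        CountingWriter w = new CountingWriter();
        List<String> docs = List.of("d1", "d2", "d3");
        w.addDocuments(docs, "uid-1", docs.size());
        w.softUpdateDocuments("uid-1", docs, 1, 2, 1, docs.size());
        System.out.println(w.pendingNumDocs.get()); // 6
    }
}
```

Placing size before the varargs softDeletesField keeps the varargs last, which is why that parameter ordering is the only workable one in Java.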

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)

198-206: LGTM! Clear codec initialization logic with helpful documentation.

The comment effectively explains why codec initialization occurs at this point (associatedCriteria binding during IndexWriter initialization). The conditional logic for selecting between CriteriaBasedCodec and the base codec based on the isContextAwareEnabled setting is clear and correct.
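The selection logic amounts to a conditional wrap, sketched below with placeholder classes (not the actual Lucene or OpenSearch codec types): when context-aware grouping is enabled, the base codec is wrapped with a criteria-aware one; otherwise the base codec is used directly.

```java
public class CodecSelectionSketch {
    static class Codec {
        final String name;

        Codec(String name) { this.name = name; }
    }

    // Placeholder for the criteria-aware wrapper; carries the criteria bound to
    // this particular writer.
    static class CriteriaBasedCodec extends Codec {
        final Codec delegate;
        final String criteria;

        CriteriaBasedCodec(Codec delegate, String criteria) {
            super("criteria(" + delegate.name + ")");
            this.delegate = delegate;
            this.criteria = criteria;
        }
    }

    static Codec selectCodec(boolean contextAwareEnabled, Codec baseCodec, String associatedCriteria) {
        // The criteria is only known once it is bound during writer
        // initialization, which is why this choice happens here rather than in
        // a central codec service.
        return contextAwareEnabled
            ? new CriteriaBasedCodec(baseCodec, associatedCriteria)
            : baseCodec;
    }

    public static void main(String[] args) {
        Codec base = new Codec("base");
        System.out.println(selectCodec(true, base, "status:200").name); // criteria(base)
        System.out.println(selectCodec(false, base, null).name);        // base
    }
}
```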

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)

1240-1247: LGTM! Correct size-aware write path for document additions.

The addDocs method correctly passes docs.size() to the new addDocuments signature and increments the numDocAppends counter with the actual document count. This ensures accurate tracking of multi-document operations.


1249-1258: LGTM! Stale document handling updated correctly.

The addStaleDocs method consistently passes docs.size() to addDocuments for both single and multi-document paths, ensuring soft-deleted documents are tracked with size awareness.


1369-1390: LGTM! Update path correctly implements size-aware API.

The updateDocs method properly passes docs.size() as the sixth parameter to softUpdateDocuments and increments the numDocUpdates counter with the document count. The placement of the size parameter before softDeletesField aligns with the API signature change.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (3)

17-17: LGTM! Imports support new test utilities.

The imports for IndexableField, Closeable, and AtomicBoolean are appropriately added to support the new FlushingIndexWriterFactory test utility class.

Also applies to: 81-81, 89-89


243-257: LGTM! Convenient configuration overload.

The new config(Store store) method provides a convenient overload that delegates to the existing configuration method with default parameters, improving test readability.


509-564: LGTM! Well-designed test utility for flush verification.

The FlushingIndexWriterFactory is a useful test utility that:

  • Wraps IndexWriter to automatically flush after write operations
  • Tracks directories for proper cleanup via the Closeable interface
  • Supports conditional failing-directory injection via supplier pattern
  • Overrides all relevant write methods (addDocument, addDocuments, softUpdateDocument, softUpdateDocuments)

This enables tests to verify size-aware write accounting and flush behavior without manual flush calls.
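The decorator idea behind such a factory can be shown in miniature. The types below are simplified stand-ins for IndexWriter and the test factory; the essential move is wrapping a writer so every write is immediately followed by a flush, making flush-time behavior observable deterministically.

```java
import java.util.ArrayList;
import java.util.List;

public class FlushingWriterSketch {
    interface Writer {
        void addDocument(String doc);

        void flush();
    }

    // Records the call sequence so a test can assert on it.
    static class RecordingWriter implements Writer {
        final List<String> events = new ArrayList<>();

        public void addDocument(String doc) { events.add("add:" + doc); }

        public void flush() { events.add("flush"); }
    }

    // Decorator: same interface, but forces a flush after each write. This is
    // test-only behavior; production writers flush on their own schedule.
    static class FlushingWriter implements Writer {
        final Writer delegate;

        FlushingWriter(Writer delegate) { this.delegate = delegate; }

        public void addDocument(String doc) {
            delegate.addDocument(doc);
            delegate.flush();
        }

        public void flush() { delegate.flush(); }
    }

    public static void main(String[] args) {
        RecordingWriter inner = new RecordingWriter();
        Writer w = new FlushingWriter(inner);
        w.addDocument("d1");
        w.addDocument("d2");
        System.out.println(inner.events); // [add:d1, flush, add:d2, flush]
    }
}
```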

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (3)

16-17: LGTM!

Import additions for CountDownLatch and AtomicBoolean are appropriate for the new concurrency test.


34-34: LGTM!

The addDocuments calls are consistently updated to use the new three-argument signature with operation.docs().size() as the size parameter.

Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263


122-130: LGTM!

The softUpdateDocuments calls are consistently updated to include the operation.docs().size() parameter in the correct position.

Also applies to: 227-235, 268-276

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)

134-136: LGTM!

The size parameter is added to match the DocumentIndexWriter interface. It's intentionally unused here since LuceneIndexWriter delegates directly to Lucene's IndexWriter, which handles its own pending document accounting internally.


144-154: LGTM!

The size parameter addition aligns with the interface update. The parameter is intentionally unused in this implementation since Lucene's IndexWriter.softUpdateDocuments handles document accounting internally.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (3)

40-40: LGTM!

Import for Supplier is appropriate for the new FlushingIndexWriterFactory usage in tragic exception tests.


183-183: LGTM!

The addDocuments calls are consistently updated to use the new three-argument signature.

Also applies to: 231-231, 311-311, 334-334, 374-374


840-860: LGTM!

The softUpdateDocuments calls are correctly updated with the size parameter in the right position.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (9)

131-144: LGTM!

Good documentation explaining the purpose of childWriterPendingNumDocs and acknowledging that temporary overshooting during refresh is acceptable since undershooting would be problematic.
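The accounting contract can be sketched with a shared counter: child writers increment on every write, and the refresh path subtracts only after the hand-off to the parent succeeds. Overshooting briefly during refresh is tolerated; undershooting never happens because the subtraction is deferred. Names are illustrative, not the actual fields.

```java
import java.util.concurrent.atomic.AtomicLong;

public class PendingDocsSketch {
    final AtomicLong childWriterPendingNumDocs = new AtomicLong();
    long parentPendingNumDocs = 0;

    void onChildWrite(int size) {
        childWriterPendingNumDocs.addAndGet(size);
    }

    // Models the refresh path: addIndexes has succeeded, so the parent writer
    // now accounts for these docs and the child-side counter is decremented.
    void onRefreshHandOff(long docsMovedToParent) {
        parentPendingNumDocs += docsMovedToParent;
        childWriterPendingNumDocs.addAndGet(-docsMovedToParent);
    }

    // Mirrors getPendingNumDocs(): child-side count plus the parent's count.
    long getPendingNumDocs() {
        return childWriterPendingNumDocs.get() + parentPendingNumDocs;
    }

    public static void main(String[] args) {
        PendingDocsSketch s = new PendingDocsSketch();
        s.onChildWrite(3);
        s.onChildWrite(2);
        System.out.println(s.getPendingNumDocs()); // 5
        s.onRefreshHandOff(5);
        System.out.println(s.getPendingNumDocs()); // 5 (now owned by the parent)
    }
}
```

Between the parent increment and the child decrement, the total reads one document high, which is the acceptable overshoot the documentation describes.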


348-360: LGTM!

Good fix to check lookup.isClosed() after acquiring the lock. This prevents returning a closed lookup to callers and properly releases the lock before returning null.


549-559: LGTM!

The pending document accounting during refresh correctly:

  1. Accumulates pending docs from old child writers before they're closed
  2. Subtracts this count after addIndexes succeeds, since the parent writer now accounts for these docs

719-742: LGTM!

The getFlushingBytesUtil correctly handles AlreadyClosedException by only re-throwing when the writer has a tragic exception. This prevents spurious failures during normal map rotation while still propagating actual tragic failures.
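The pattern is worth isolating: a closed child writer is expected during normal map rotation, so AlreadyClosedException is swallowed unless the writer actually recorded a tragic failure. The sketch below simulates AlreadyClosedException with a local stand-in to stay dependency-free; the interface is illustrative, not the real child-writer API.

```java
public class TragicAwareMetricsSketch {
    static class AlreadyClosedException extends RuntimeException {}

    interface ChildWriter {
        long getFlushingBytes();        // throws AlreadyClosedException once closed

        Throwable getTragicException(); // non-null only after a tragic failure
    }

    static long sumFlushingBytes(Iterable<ChildWriter> writers) {
        long total = 0;
        for (ChildWriter w : writers) {
            try {
                total += w.getFlushingBytes();
            } catch (AlreadyClosedException e) {
                if (w.getTragicException() != null) {
                    throw e; // real failure: surface it to the caller
                }
                // otherwise: closed by normal rotation, skip quietly
            }
        }
        return total;
    }

    public static void main(String[] args) {
        ChildWriter healthy = new ChildWriter() {
            public long getFlushingBytes() { return 128; }

            public Throwable getTragicException() { return null; }
        };
        ChildWriter rotatedOut = new ChildWriter() {
            public long getFlushingBytes() { throw new AlreadyClosedException(); }

            public Throwable getTragicException() { return null; }
        };
        System.out.println(sumFlushingBytes(java.util.List.of(healthy, rotatedOut))); // 128
    }
}
```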


745-748: LGTM!

The getPendingNumDocs correctly combines the child writer pending count with the parent writer's pending count.


770-784: LGTM!

The getTragicException method properly checks both current and old maps before falling back to the accumulating writer, ensuring tragic exceptions from any child writer are detected.


853-880: LGTM!

The rollback method correctly:

  1. Checks isOpen() before attempting rollback on child writers
  2. Catches AlreadyClosedException to handle race conditions gracefully
  3. Processes both current and old maps
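The rollback sequence above reduces to a check-then-act with a tolerated race, sketched here with stand-in types: attempt rollback only on open writers, and absorb AlreadyClosedException from a writer that closed between the check and the call.

```java
import java.util.List;

public class RollbackSketch {
    static class AlreadyClosedException extends RuntimeException {}

    interface ChildWriter {
        boolean isOpen();

        void rollback();
    }

    static int rollbackAll(List<ChildWriter> currentMap, List<ChildWriter> oldMap) {
        int rolledBack = 0;
        // Process both maps, mirroring the current/old map traversal.
        for (List<ChildWriter> map : List.of(currentMap, oldMap)) {
            for (ChildWriter w : map) {
                if (w.isOpen()) {
                    try {
                        w.rollback();
                        rolledBack++;
                    } catch (AlreadyClosedException e) {
                        // Closed concurrently between isOpen() and rollback(); benign.
                    }
                }
            }
        }
        return rolledBack;
    }

    public static void main(String[] args) {
        ChildWriter open = new ChildWriter() {
            public boolean isOpen() { return true; }

            public void rollback() {}
        };
        ChildWriter closed = new ChildWriter() {
            public boolean isOpen() { return false; }

            public void rollback() { throw new AlreadyClosedException(); }
        };
        System.out.println(rollbackAll(List.of(open, closed), List.of())); // 1
    }
}
```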

927-942: LGTM!

The addDocuments method correctly increments childWriterPendingNumDocs by the size parameter after successfully adding documents to the child writer.


961-986: LGTM!

The softUpdateDocuments method correctly uses the size parameter to increment childWriterPendingNumDocs and properly records the delete entry for version tracking.

Comment on lines +154 to +204
public void testDeleteWithDocumentInOldChildWriter() throws IOException, InterruptedException {
final String id = "test";
CompositeIndexWriter compositeIndexWriter = new CompositeIndexWriter(
config(),
createWriter(),
newSoftDeletesPolicy(),
softDeletesField,
indexWriterFactory
);

Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
}

CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
CountDownLatch latch = new CountDownLatch(1);
AtomicBoolean run = new AtomicBoolean(true);
Thread refresher = new Thread(() -> {
latch.countDown();
try {
compositeIndexWriter.beforeRefresh();
} catch (Exception ignored) {}
});

refresher.start();
try {
latch.await();
compositeIndexWriter.deleteDocument(
operation.uid(),
false,
newDeleteTombstoneDoc(id),
1,
2,
primaryTerm.get(),
softDeletesField
);
} finally {
IOUtils.closeWhileHandlingException(lock.getMapReadLock());
run.set(false);
refresher.join();
compositeIndexWriter.afterRefresh(true);
compositeIndexWriter.beforeRefresh();
compositeIndexWriter.afterRefresh(true);
try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) {
assertEquals(0, directoryReader.numDocs());
}

IOUtils.closeWhileHandlingException(compositeIndexWriter);
}
}
⚠️ Potential issue | 🟡 Minor

Unused AtomicBoolean run variable.

The run variable is declared and set to false in the finally block but is never read. This appears to be leftover from a pattern where the refresher thread would check run.get() in a loop.

Also, the test lacks proper exception handling: if compositeIndexWriter.deleteDocument throws, the finally block still attempts cleanup operations that may fail or behave unexpectedly.

Suggested fix
-        AtomicBoolean run = new AtomicBoolean(true);
         Thread refresher = new Thread(() -> {
             latch.countDown();
             try {
                 compositeIndexWriter.beforeRefresh();
             } catch (Exception ignored) {}
         });

         refresher.start();
         try {
             latch.await();
             compositeIndexWriter.deleteDocument(
                 operation.uid(),
                 false,
                 newDeleteTombstoneDoc(id),
                 1,
                 2,
                 primaryTerm.get(),
                 softDeletesField
             );
         } finally {
             IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-            run.set(false);
             refresher.join();
🤖 Prompt for AI Agents
In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
around lines 154 - 204, Remove the unused AtomicBoolean run and simplify the
refresher thread to only countDown the latch and call
compositeIndexWriter.beforeRefresh(), then terminate; replace the current
refresher lambda with one that does not depend on run. Wrap the call to
compositeIndexWriter.deleteDocument(...) in a try/catch that rethrows the
exception after performing cleanup so the test still fails but deterministic
cleanup runs. In the finally block always release lock.getMapReadLock() and join
the refresher thread before invoking
compositeIndexWriter.afterRefresh()/beforeRefresh() sequences and
IOUtils.closeWhileHandlingException(compositeIndexWriter); ensure you reference
the existing symbols: run (remove), refresher (modify), latch,
compositeIndexWriter.deleteDocument, lock.getMapReadLock(),
compositeIndexWriter.beforeRefresh/afterRefresh, and
IOUtils.closeWhileHandlingException(compositeIndexWriter).

@github-actions

github-actions bot commented Jan 9, 2026

✅ Gradle check result for cbc3338: SUCCESS


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: Incomplete removal of LookupMapLockAcquisitionException infrastructure.

The removal of retry logic from onComplete method is incomplete. While the handling was removed from TransportShardBulkAction, the exception's setting infrastructure remains active and orphaned:

  • INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION setting is still defined and registered in IndexSettings.java:519 and IndexScopedSettings.java:181
  • The setting is actively retrieved in IndexSettings.java:1149
  • Documentation comment in IndexSettings.java:515 still references the removed exception

Remove or update these remaining references to complete the cleanup:

  • Remove INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION setting definition and registration, or
  • Update documentation and ensure the setting is not vestigial
  • Update tests that may depend on this setting
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)

489-494: Ensure indexWriterFactory gets closed in tearDown() now that it can be Closeable.

FlushingIndexWriterFactory tracks directories and requires close(), but the base class currently never closes indexWriterFactory, which risks leaking file handles / temp dirs and causing flaky tests (especially on Windows).

Proposed fix (close factory safely in base tearDown)
 @Override
 @After
 public void tearDown() throws Exception {
-    super.tearDown();
-    IOUtils.close(store, () -> terminate(threadPool));
+    try {
+        IOUtils.close(
+            () -> {
+                final IndexWriterFactory factory = indexWriterFactory;
+                if (factory instanceof Closeable) {
+                    ((Closeable) factory).close();
+                }
+            },
+            store,
+            () -> terminate(threadPool)
+        );
+    } finally {
+        super.tearDown();
+    }
 }

Also applies to: 509-564

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)

202-206: Add explicit null validation for associatedCriteria before CriteriaBasedCodec instantiation.

When isContextAwareEnabled() returns true, associatedCriteria is passed to the CriteriaBasedCodec constructor without validation. Although CriteriaBasedCodec handles null gracefully (checking at line 52 of its segmentInfoFormat() method), the comments in this code (lines 198-201) state that criteria is "determined on a per-document basis and is only available within the InternalEngine," implying it should never be null in this path.

Adding an explicit null check or assertion before instantiation would enforce this precondition and improve code clarity, ensuring that if context-aware is enabled, the required criteria is always present.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

1059-1073: Line 1072 incorrectly increments childWriterPendingNumDocs for parent writer operations.

The deleteInLucene method operates on accumulatingIndexWriter (parent), but the code at line 1072 increments childWriterPendingNumDocs. The explicit comment at lines 1038-1040 states "only increment this when addDeleteEntry for child writers are called," indicating this counter should not be incremented for parent writer operations. Since parent pending docs are tracked separately via accumulatingIndexWriter.getPendingNumDocs(), this increment is either erroneous or introduces unintended double-counting. Remove the increment at line 1072, or clarify the intent if it is deliberate.

🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 494-501: The loop in CompositeIndexWriter that repeatedly calls
current.mapReadLock.tryAcquire() can spin forever if only closed maps are
produced; change it to bound the wait by either using a timed tryAcquire (e.g.,
tryAcquire(timeout, TimeUnit)) or by recording startTime and breaking after a
configurable timeout, and between attempts perform a short sleep or
Thread.yield() to avoid busy-spin; also check a shutdown/closed flag on
CompositeIndexWriter (or similar lifecycle indicator) and throw a clear
exception if shutdown is in progress so the caller can abort instead of looping
indefinitely.
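The bounded-wait fix suggested above can be sketched with a java.util.concurrent.Semaphore and an AtomicBoolean standing in for the lookup's read lock and lifecycle flag: each iteration uses a timed acquire instead of a busy-spin, the whole loop is bounded by a deadline, and a shutdown check lets the caller abort cleanly.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class BoundedAcquireSketch {
    static boolean acquireWithDeadline(Semaphore readLock, AtomicBoolean closed, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (System.nanoTime() < deadline) {
                if (closed.get()) {
                    // Shutdown in progress: fail fast instead of looping.
                    throw new IllegalStateException("writer is shutting down");
                }
                // Timed attempt bounds each iteration instead of busy-spinning.
                if (readLock.tryAcquire(10, TimeUnit.MILLISECONDS)) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false; // deadline expired: caller decides whether to retry or fail
    }

    public static void main(String[] args) {
        Semaphore lock = new Semaphore(1);
        AtomicBoolean closed = new AtomicBoolean(false);
        System.out.println(acquireWithDeadline(lock, closed, 100)); // true
        System.out.println(acquireWithDeadline(lock, closed, 50));  // false (permit already held)
    }
}
```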
🧹 Nitpick comments (5)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)

694-696: The filtering logic is correct, but consider whether the performance optimization is necessary.

The current implementation filters on every call to return only CompositeDataCubeFieldType instances. While the filtering is intentional and aligns with all production usage (all 9 callers expect or cast to CompositeDataCubeFieldType), the stream-filter-collect operation creates a new Set each time.

However, this is likely not a critical issue since:

  • The method is called only 9 times in production code, mostly during initialization
  • It's not in hot loops or performance-sensitive paths
  • The filtering overhead is minimal for small sets

Note that the filtering intentionally excludes ContextAwareGroupingFieldType instances from the result. This is correct because buildCompositeFieldLookup() (line 561) uses the unfiltered compositeMappedFieldTypes field to collect all field names, while callers of getCompositeFieldTypes() expect only DataCube field types. Verify that this behavioral change (narrowing isCompositeIndexPresent() to check only for DataCube types) aligns with the intended behavior.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

243-257: Avoid confusion from config(Store store) parameter shadowing the field store.

This is fine functionally, but the parameter name makes call sites harder to read/scan in a base test. Consider renaming the parameter (e.g., engineStore) and/or making the overload protected if it’s only intended for subclasses.


509-564: Harden directory tracking to avoid double-close (and make close() idempotent-ish).

directories is a plain List, so the same Directory can be added multiple times (e.g., if supplier returns the same instance), and IOUtils.close(directories) may double-close and throw during cleanup. Consider de-dup + snapshot/clear on close.

Proposed fix (de-dup + snapshot/clear)
 protected static class FlushingIndexWriterFactory extends NativeLuceneIndexWriterFactory implements Closeable {
 
     private final Supplier<Directory> failingWriteDirectorySupplier;
     private final List<Directory> directories;
     private final AtomicBoolean useFailingDirectorySupplier;
 
     FlushingIndexWriterFactory(Supplier<Directory> failingWriteDirectorySupplier, AtomicBoolean useFailingDirectorySupplier) {
         this.failingWriteDirectorySupplier = failingWriteDirectorySupplier;
         this.directories = new ArrayList<>();
         this.useFailingDirectorySupplier = useFailingDirectorySupplier;
     }
@@
     @Override
     public void close() throws IOException {
-        IOUtils.close(directories);
+        // snapshot + clear prevents re-close on repeated close() calls and reduces risk of concurrent modification
+        final List<Directory> toClose = new ArrayList<>(directories);
+        directories.clear();
+        IOUtils.close(toClose.stream().distinct().toList());
     }
 }
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)

134-136: Consider adding a clarifying comment for the unused size parameter.

The size parameter is added to match the DocumentIndexWriter interface but is not used in this implementation. A brief comment explaining that this parameter is used by other implementations (e.g., CompositeIndexWriter) for pending document tracking would improve maintainability.

📝 Suggested comment addition
 @Override
 public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException {
+    // size parameter unused here; used by CompositeIndexWriter for pending doc tracking
     return indexWriter.addDocuments(docs);
 }

144-154: Consider adding a clarifying comment for the unused size parameter.

Similar to addDocuments, the size parameter is added to match the interface but is not used in this implementation. A brief comment would improve code clarity.

📝 Suggested comment addition
 public void softUpdateDocuments(
     Term uid,
     Iterable<ParseContext.Document> docs,
     long version,
     long seqNo,
     long primaryTerm,
     int size,
     Field... softDeletesField
 ) throws IOException {
+    // size parameter unused here; used by CompositeIndexWriter for pending doc tracking
     indexWriter.softUpdateDocuments(uid, docs, softDeletesField);
 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cbc3338 and fd34feb.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
🚧 Files skipped from review as they are similar to previous changes (3)
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
🔇 Additional comments (30)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)

87-87: LGTM!

The import is correctly added to support the filtering logic in getCompositeFieldTypes().

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)

198-201: LGTM: Clear explanatory comments.

The inline comments effectively explain the rationale for initializing the codec at this location rather than in CodecService, which will help future maintainers understand this design decision.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)

55-55: LGTM! API signature updates are consistent.

The addition of the int size parameter to both addDocuments and softUpdateDocuments enables size-aware pending document tracking at the implementation level.

Also applies to: 65-65

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (6)

183-183: LGTM! Test updates correctly pass document count.

All test invocations have been properly updated to pass operation.docs().size() as the size parameter, maintaining consistency with the new API signatures.

Also applies to: 231-231, 334-334, 374-374, 420-420, 462-462, 472-472, 505-505, 511-511, 552-552, 588-588, 624-624, 676-676, 712-712, 748-748, 800-800, 836-836, 871-871


404-427: LGTM! Basic tragic exception validation.

This test verifies that under normal conditions (no failures), getTragicException() returns null as expected.


523-557: LGTM! Comprehensive tragic exception test coverage.

These tests properly validate that:

  • AlreadyClosedException is thrown when accessing ramBytesUsed, getFlushingBytes, or getTragicException after a tragic failure in child writers
  • Tragic exceptions are properly detected in both current and old writer maps
  • Thread synchronization with CountDownLatch ensures proper test sequencing

The use of FlushingIndexWriterFactory with failing directories effectively simulates tragic failures.

Also applies to: 559-608, 647-681, 683-732, 771-805, 807-856
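The latch-based sequencing these tests rely on can be distilled into a few lines: the writer thread blocks until the test has armed the failure condition, making the interleaving deterministic instead of timing-dependent. This is a generic sketch, not the actual test code.

```java
import java.util.concurrent.CountDownLatch;

public class LatchSequencingSketch {
    static String run() {
        CountDownLatch failureArmed = new CountDownLatch(1);
        StringBuilder order = new StringBuilder();

        Thread writer = new Thread(() -> {
            try {
                failureArmed.await();            // block until the test arms the failure
                order.append("write-after-arm;");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        order.append("arm;");                    // e.g. flip useFailingDirectorySupplier
        failureArmed.countDown();                // release the writer
        try {
            writer.join();                       // join() makes the write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // arm;write-after-arm;
    }
}
```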


610-645: LGTM! Tests verify metric collection with old writers.

These tests confirm that ramBytesUsed and getFlushingBytes correctly aggregate metrics from writers in the old map during refresh transitions.

Also applies to: 734-769


911-947: LGTM! Rollback test with old writer.

This test validates that rollback succeeds even when a writer exists in the old map during a concurrent refresh, ensuring proper cleanup of all writer states.


977-985: LGTM! softUpdateDocuments calls updated correctly.

The test calls to softUpdateDocuments now properly include operation.docs().size() as the size parameter.

Also applies to: 989-997

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (13)

131-144: LGTM! Clear documentation for pending document tracking.

The detailed documentation explains the purpose and behavior of childWriterPendingNumDocs, including the acceptable temporary overshooting during refresh.


549-560: LGTM! Proper pending document accounting during refresh.

The refresh logic correctly:

  • Accumulates pending docs from old child writers before closing them
  • Subtracts the accumulated count from the global counter after adding indexes to the parent

This ensures the pending doc count remains accurate across the refresh transition.


606-609: LGTM! Test utility method for acquiring read lock.

Package-private accessor enables test scenarios that require holding a read lock on the current map, as demonstrated in the new test cases.


706-742: LGTM! Robust flushing bytes calculation with tragic exception handling.

The refactored getFlushingBytes() properly:

  • Iterates through both current and old writer maps
  • Catches AlreadyClosedException and only rethrows if a tragic exception exists
  • Aggregates flushing bytes from child writers and the parent

770-784: LGTM! Comprehensive tragic exception detection.

The method now checks for tragic exceptions in both current and old child writers before checking the parent, ensuring any child writer failures are properly detected.


796-833: LGTM! Robust RAM bytes calculation with tragic exception handling.

Similar to getFlushingBytes, this properly handles AlreadyClosedException and aggregates RAM usage across all writers.


853-870: LGTM! Proper rollback of all child writers.

The updated rollback ensures all child writers (both current and old) are properly rolled back before rolling back the parent writer, preventing resource leaks.


917-931: LGTM! Size-aware document addition with proper accounting.

The method correctly:

  • Delegates to the underlying IndexWriter's addDocuments
  • Increments childWriterPendingNumDocs by the provided size
  • Returns the sequence number

935-948: LGTM! Single document addition increments counter.

Properly increments the pending doc counter by 1 for single document additions.


950-976: LGTM! Size-aware soft update with proper accounting.

The soft update correctly increments childWriterPendingNumDocs by the provided size after updating documents.


979-1003: LGTM! Single document soft update increments counter.

Properly increments the pending doc counter by 1 for single document soft updates.


1020-1057: LGTM! Delete operation increments counter for child writers.

The delete logic correctly increments childWriterPendingNumDocs only when delete entries are added to child writers (current or old), not for every delete operation.


352-356: The TOCTOU concern is not applicable to the current implementation.

The read lock acquired in tryAcquire() (line 349) is held throughout the caller's usage of the returned DisposableIndexWriter. Since CriteriaBasedIndexWriterLookup.closed is only set to true within close() (line 309), which requires a write lock on the underlying mapLock, and write locks are only obtained during engine closure (per design comments at lines 497-499), the lookup cannot transition to closed while a read lock is held. The call site (lines 494-501) properly handles null returns via retry loop, and all consumer sites acquire and hold the lock via try-with-resources (e.g., lines 924-930).

Note: The timeout variant tryAcquire(TimeValue timeout) (line 366) lacks the lookup.isClosed() check present in the no-arg variant, creating an inconsistency—though this variant appears unused in the codebase.
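The inconsistency noted above can be illustrated with a minimal read-lock wrapper in which both acquire variants share the same closed check. All names here (`LookupReadLock`, `releaseIfClosed`) are illustrative, not the actual OpenSearch classes; the sketch only shows the shape of the suggested fix under the assumption that `closed` is flipped under the write lock.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch: both tryAcquire variants perform the same isClosed()
// check after taking the read lock, so neither can hand out a lock on a
// closed lookup.
class LookupReadLock {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile boolean closed = false;

    void close() {
        lock.writeLock().lock();
        try {
            closed = true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Returns true only if the read lock was acquired on a non-closed lookup.
    boolean tryAcquire() {
        if (lock.readLock().tryLock() == false) {
            return false;
        }
        return releaseIfClosed();
    }

    boolean tryAcquire(long timeout, TimeUnit unit) throws InterruptedException {
        if (lock.readLock().tryLock(timeout, unit) == false) {
            return false;
        }
        return releaseIfClosed(); // same closed check as the no-arg variant
    }

    private boolean releaseIfClosed() {
        if (closed) {
            lock.readLock().unlock();
            return false;
        }
        return true;
    }

    void release() {
        lock.readLock().unlock();
    }
}
```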

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (3)

34-34: LGTM! Test calls updated with document count.

All addDocuments invocations correctly pass operation.docs().size() as the third parameter.

Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263


122-130: LGTM! softUpdateDocuments calls updated correctly.

All softUpdateDocuments invocations now include operation.docs().size() as the size parameter.

Also applies to: 227-235, 268-276


154-204: LGTM! Comprehensive test for delete with old child writer.

This test validates the delete operation when:

  1. A document exists in a child writer
  2. A concurrent refresh moves the writer to the old map
  3. The delete is processed while the writer is in the old state

The test properly uses:

  • acquireNewReadLock() to hold the current map during refresh
  • CountDownLatch for thread coordination
  • Proper resource cleanup with IOUtils.closeWhileHandlingException
server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (2)

239-245: LGTM! Proper usage of size-aware addDocuments API.

The multi-document path correctly passes docs.size() to the new API, while single-document operations continue using the original addDocument method.


247-260: LGTM! Proper usage of size-aware softUpdateDocuments API.

The update logic correctly:

  • Passes docs.size() for multi-document updates
  • Uses softUpdateDocument (without size) for single documents
server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)

1240-1247: LGTM: Size parameter correctly added for multi-document tracking.

The change correctly passes docs.size() to enable pending document tracking. The single-document optimization path is preserved, and the counter increment correctly handles both cases.


1249-1258: LGTM: Stale document handling correctly updated.

The size parameter is correctly propagated for multi-document stale operations. The soft-delete field is properly added to all documents before indexing.


1369-1390: LGTM: Update document path correctly modified.

The size parameter is properly added to the softUpdateDocuments call for multi-document updates. The single-document optimization is preserved, and the counter correctly uses docs.size().

Comment on lines +494 to 501
while (current == null || current.isClosed()) {
// This function acquires a first read lock on a map which does not have any write lock present. Current keeps
// on getting rotated during refresh, so there will be one current on which read lock can be obtained.
// Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was
// never applied on this map as write lock gets only during closing time. We are doing this instead of acquire,
// because acquire can also apply a read lock in case refresh completed and map is closed.
current = this.current.mapReadLock.tryAcquire();
}


⚠️ Potential issue | 🟠 Major

Risk: Potential infinite loop waiting for non-closed map.

The while loop continues until it finds a non-closed current map. If the map rotation consistently produces closed maps (e.g., during shutdown or repeated failures), this could spin indefinitely without yielding or timeout.

🔄 Consider adding timeout or yield
 boolean success = false;
 CriteriaBasedIndexWriterLookup current = null;
+int attempts = 0;
+final int MAX_ATTEMPTS = 1000;
 try {
-    while (current == null || current.isClosed()) {
+    while ((current == null || current.isClosed()) && attempts < MAX_ATTEMPTS) {
         // This function acquires a first read lock on a map which does not have any write lock present. Current keeps
         // on getting rotated during refresh, so there will be one current on which read lock can be obtained.
         // Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was
         // never applied on this map as write lock gets only during closing time. We are doing this instead of acquire,
         // because acquire can also apply a read lock in case refresh completed and map is closed.
         current = this.current.mapReadLock.tryAcquire();
+        if (current == null || current.isClosed()) {
+            attempts++;
+            Thread.yield(); // Allow other threads to progress
+        }
     }
+    
+    if (current == null || current.isClosed()) {
+        throw new IllegalStateException("Unable to acquire non-closed lookup after " + MAX_ATTEMPTS + " attempts");
+    }

     DisposableIndexWriter writer = current.computeIndexWriterIfAbsentForCriteria(criteria, indexWriterSupplier);

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 494 - 501, The loop in CompositeIndexWriter that repeatedly calls
current.mapReadLock.tryAcquire() can spin forever if only closed maps are
produced; change it to bound the wait by either using a timed tryAcquire (e.g.,
tryAcquire(timeout, TimeUnit)) or by recording startTime and breaking after a
configurable timeout, and between attempts perform a short sleep or
Thread.yield() to avoid busy-spin; also check a shutdown/closed flag on
CompositeIndexWriter (or similar lifecycle indicator) and throw a clear
exception if shutdown is in progress so the caller can abort instead of looping
indefinitely.

@github-actions
github-actions bot commented Jan 9, 2026

❌ Gradle check result for fd34feb: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

1059-1073: Remove the pending docs increment from deleteInLucene, which incorrectly counts parent writer operations.

The deleteInLucene method operates on accumulatingIndexWriter (the parent writer) but increments childWriterPendingNumDocs at line 1072. This causes double-counting because:

  1. Parent writer operations are tracked internally by accumulatingIndexWriter.getPendingNumDocs()
  2. getPendingNumDocs() returns the sum: childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs()
  3. Incrementing childWriterPendingNumDocs for parent writer operations adds them a second time

All other methods in the class (addDocuments, addDocument, softUpdateDocuments, softUpdateDocument) operate on child writers and increment childWriterPendingNumDocs. The comments at lines 1039 and 1050 explicitly state "only increment this when addDeleteEntry for child writers are called." The deleteInLucene method breaks this pattern by incrementing for parent writer operations. Remove line 1072.
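The double-counting can be made concrete with a toy model. The class and method names below are invented for illustration; only the aggregation formula (child counter plus the parent writer's own pending count) follows the review's description.

```java
// Hypothetical illustration of the double-count: the total is the child
// counter plus the parent's internally tracked pending count, so an operation
// on the parent writer must not also bump the child counter.
class PendingDocsBugDemo {
    long childWriterPendingNumDocs = 0;
    long parentPendingNumDocs = 0;   // stands in for accumulatingIndexWriter.getPendingNumDocs()

    long getPendingNumDocs() {
        return childWriterPendingNumDocs + parentPendingNumDocs;
    }

    void deleteInLuceneBuggy() {
        parentPendingNumDocs += 1;        // parent writer already tracks this
        childWriterPendingNumDocs += 1;   // BUG: the same operation counted again
    }

    void deleteInLuceneFixed() {
        parentPendingNumDocs += 1;        // only the parent's own counter moves
    }
}
```

One parent-writer delete should contribute 1 to the total; the buggy path contributes 2.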

🤖 Fix all issues with AI agents
In @CHANGELOG.md:
- Line 34: The changelog entry "Fix indexing regression and bug fixes for
grouping criteria.
([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" uses an
inconsistent PR reference format; update that entry to include the hash symbol
so the link reads
"([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))", i.e.,
replace "([20145](" with "([#20145](" for the PR reference to match other
entries.

In
@server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java:
- Around lines 153-168: The test method
testContextAwareFieldMapperWithDerivedSource leaks the XContentBuilder and never
finishes the JSON object; create the builder with
XContentFactory.jsonBuilder().startObject() inside a try-with-resources (or
close it explicitly), pass it to mapper.deriveSource(...), and call endObject()
before the builder is closed, ensuring the XContentBuilder is released on all
paths when invoking mapper.canDeriveSource() and mapper.deriveSource(...).
🧹 Nitpick comments (11)
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)

188-189: Documentation terminology inconsistency.

The javadoc comments refer to "Context Aware Segment field" but the class is named ContextAwareGroupingFieldMapper. Consider updating the documentation to use consistent terminology (e.g., "Context Aware Grouping field") for clarity.

Also applies to: 195-196

server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

162-167: Consider using more idiomatic JUnit test pattern.

The try-catch with fail() pattern is not the most idiomatic approach. Since the test method already declares throws IOException, you can either:

  1. Let exceptions propagate naturally (remove try-catch entirely), or
  2. Use assertDoesNotThrow() if you specifically want to assert that no exception occurs.

Additionally, the test could benefit from positive assertions about the behavior, not just the absence of exceptions.

♻️ Alternative implementation
     public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
         ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
         ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
             "context_aware_grouping",
             fieldType,
             new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
         );
         LeafReader leafReader = mock(LeafReader.class);
 
-        try {
-            mapper.canDeriveSource();
-            mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-        } catch (Exception e) {
-            fail(e.getMessage());
-        }
+        // Since these are no-op methods, simply verify they don't throw exceptions
+        mapper.canDeriveSource();
+        try (XContentBuilder builder = XContentFactory.jsonBuilder().startObject()) {
+            mapper.deriveSource(builder, leafReader, 0);
+            builder.endObject();
+        }
+        // Test passes if no exception is thrown
     }
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)

693-697: Consider caching the filtered result for better performance.

The method streams and filters compositeMappedFieldTypes on every invocation. If called frequently during indexing operations, this could impact performance.

♻️ Optimization: Cache the filtered set

Modify the class to maintain a separate cached field for CompositeDataCubeFieldType instances:

Add a new field after line 230:

private volatile Set<CompositeMappedFieldType> compositeMappedFieldTypes;
private volatile Set<CompositeMappedFieldType> compositeDataCubeFieldTypes;

Update the internalMerge method at line 552 to populate both:

// initialize composite fields post merge
this.compositeMappedFieldTypes = getCompositeFieldTypesFromMapper();
this.compositeDataCubeFieldTypes = compositeMappedFieldTypes.stream()
    .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
    .collect(Collectors.toSet());
buildCompositeFieldLookup();

Then simplify the method:

 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
-    return compositeMappedFieldTypes.stream()
-        .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
-        .collect(Collectors.toSet());
+    return compositeDataCubeFieldTypes;
 }

693-697: Consider renaming or documenting the narrowed return type.

The method name getCompositeFieldTypes() suggests it returns all composite field types, but it now returns only CompositeDataCubeFieldType instances. This could lead to confusion for developers using this API.

📝 Suggested improvements

Option 1: Rename the method (breaking change):

-public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
+public Set<CompositeMappedFieldType> getCompositeDataCubeFieldTypes() {
     return compositeMappedFieldTypes.stream()
         .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
         .collect(Collectors.toSet());
 }

Option 2: Add JavaDoc (non-breaking):

+/**
+ * Returns the set of composite field types that are also CompositeDataCubeFieldType instances.
+ * This is used to identify fields that are part of data cube composite indexes.
+ * 
+ * @return Set of CompositeMappedFieldType instances that implement CompositeDataCubeFieldType
+ */
 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
     return compositeMappedFieldTypes.stream()
         .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
         .collect(Collectors.toSet());
 }
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)

183-183: API signature updates are consistent and correct.

All calls to addDocuments() have been consistently updated to include operation.docs().size() as the third parameter, aligning with the new size-aware API.

For a minor optimization in test code, consider caching operation.docs().size() to avoid repeated calls, though this is not critical for tests:

 Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
-compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
+int docCount = operation.docs().size();
+compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), docCount);

Also applies to: 231-231, 334-334, 374-374, 420-420, 462-462, 472-472, 505-505, 511-511, 552-552, 588-588, 624-624, 676-676, 712-712, 748-748, 800-800, 836-836, 871-871, 925-925, 976-976


523-556: Significant code duplication across tragic-exception tests.

The tragic-exception test methods share nearly identical FilterDirectory setup code that throws OutOfMemoryError. This duplication increases maintenance burden and makes the tests harder to update consistently.

♻️ Consider extracting common test infrastructure

Create a helper method to reduce duplication:

private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}

private CompositeIndexWriter createCompositeWriterWithFailingFactory(FlushingIndexWriterFactory factory) throws IOException {
    CompositeIndexWriter writer = new CompositeIndexWriter(
        config(),
        createWriter(),
        newSoftDeletesPolicy(),
        softDeletesField,
        factory
    );
    writer.getConfig().setMaxBufferedDocs(2);
    return writer;
}

Then simplify tests:

public void testRAMBytesUsedWithTragicExceptionOnCurrent() throws Exception {
    FlushingIndexWriterFactory factory = new FlushingIndexWriterFactory(createFailingDirectorySupplier(), new AtomicBoolean(true));
    CompositeIndexWriter compositeIndexWriter = createCompositeWriterWithFailingFactory(factory);

    Engine.Index operation = indexForDoc(createParsedDoc("-1", null, DEFAULT_CRITERIA));
    try {
        compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
    } catch (Error ignored) {}

    assertThrows(AlreadyClosedException.class, compositeIndexWriter::ramBytesUsed);
    IOUtils.closeWhileHandlingException(compositeIndexWriter, factory);
}

Also applies to: 559-607, 647-681, 683-732, 771-805, 807-856


404-427: Consider clarifying test name.

The test name testGetTragicExceptionWithException is potentially confusing since no exception occurs in this test—it verifies the happy path where getTragicException() returns null. Consider renaming to testGetTragicExceptionWithoutException or testGetTragicExceptionNormalOperation for clarity.


129-139: Empty catch block may hide test failures.

The empty catch block at line 137 could suppress genuine errors during concurrent indexing operations, making test failures harder to diagnose.

Improve error handling

Consider tracking or logging exceptions to identify issues:

 Thread computeThread = new Thread(() -> {
     while (stopped.get() == false) {
         try {
             CompositeIndexWriter.LiveIndexWriterDeletesMap currentMap = mapRef.get();
             currentMap.computeIndexWriterIfAbsentForCriteria("test-criteria", supplier, new ShardId("foo", "_na_", 1));
             computeCount.incrementAndGet();
             indexedDocs.release();
         } catch (Exception e) {
-
+            // Expected during map rotation, but log unexpected exceptions
+            if (!(e instanceof IllegalStateException)) {
+                throw new AssertionError("Unexpected exception during compute", e);
+            }
         }
     }
 });

192-199: Empty catch block may hide test failures.

The empty catch block at line 197 could suppress IOException during refresh operations in the concurrent test, potentially masking issues.

Consider at minimum logging the exception or tracking that it occurred, especially for debugging test failures.

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)

133-136: Clarify what size means (and optionally validate it) since it’s unused here.
Right now size is accepted but ignored; if it’s doc-count, consider renaming the local param to docCount and guarding against negatives to catch caller bugs early.

Proposed tweak
-    public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException {
+    public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int docCount) throws IOException {
+        if (docCount < 0) {
+            throw new IllegalArgumentException("docCount must be >= 0");
+        }
         return indexWriter.addDocuments(docs);
     }

143-154: Same comment for softUpdateDocuments(..., int size, ...): rename to docCount (and optionally validate).
This wrapper ignores version/seqNo/primaryTerm/docCount and delegates to Lucene; that’s fine, but the unused size param is easy to misread as “bytes” rather than “doc count”.

Proposed tweak
     public void softUpdateDocuments(
         Term uid,
         Iterable<ParseContext.Document> docs,
         long version,
         long seqNo,
         long primaryTerm,
-        int size,
+        int docCount,
         Field... softDeletesField
     ) throws IOException {
+        if (docCount < 0) {
+            throw new IllegalArgumentException("docCount must be >= 0");
+        }
         indexWriter.softUpdateDocuments(uid, docs, softDeletesField);
     }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fd34feb and e9991cc.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (2)
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
🧰 Additional context used
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Mend Security Check
🔇 Additional comments (27)
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (2)

11-12: LGTM! Imports support the new method signatures.

The added imports for LeafReader and XContentBuilder are correctly used in the new deriveSource method signature.


187-199: The @Override annotations are valid. The methods canDeriveSource() and deriveSource(XContentBuilder, LeafReader, int) are declared in the Mapper base class (lines 309 and 319), so the overrides are correct.

Consider updating the javadoc to use consistent terminology: change "Context Aware Segment" to "Context Aware Grouping" to match the class name and field type.

server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)

11-11: LGTM! Imports support the new test method.

The added imports for LeafReader, XContentFactory, and IOException are correctly used in the new test method.

Also applies to: 13-13, 20-20

server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)

87-87: LGTM!

The import is correctly added to support the streaming operation in getCompositeFieldTypes().


693-697: This filtering is intentional and does not introduce a breaking change.

The method now correctly filters to return only CompositeDataCubeFieldType instances, excluding ContextAwareGroupingFieldType. Evidence confirms this is the intended behavior:

  • ContextAwareGroupingFieldType is a separate "grouping criteria" field type with zero usages in the star tree or composite index modules
  • All production callers of getCompositeFieldTypes() immediately cast the result to CompositeDataCubeFieldType or expect only that type
  • The change aligns with the PR's stated goal of "bug fixes for grouping criteria"—this separation correctly prevents grouping fields from being processed as data cube fields

No verification of callers is needed; this filtering is a fix rather than a breaking change.

Likely an incorrect or invalid review comment.

server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: The retry logic shown handles version conflicts, not LookupMapLockAcquisitionException.

The isConflictException() method (line 755-757) checks only for VersionConflictEngineException. The retry logic at lines 724-730 correctly retries on version conflicts for update operations. There is no LookupMapLockAcquisitionException class in the codebase, and there is no isLookupMapLockAcquisitionException() method—the claims about their removal in the original review are incorrect. The only reference to this exception is a stale comment in IndexSettings.java describing a retry setting. The code itself is correct.

Likely an incorrect or invalid review comment.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)

55-55: LGTM: Clean API extension for document count tracking.

The addition of the size parameter to addDocuments and softUpdateDocuments enables efficient document count propagation through the indexing stack without requiring callers to recompute sizes.

Also applies to: 65-65
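The size-aware API shape this comment approves can be sketched as a minimal interface. The names `SizeAwareDocumentWriter` and `CountingWriter` are hypothetical stand-ins for illustration; the point is only that the caller supplies the already-known document count so the writer need not re-count an Iterable.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of a size-aware write API: the caller passes the
// document count it already knows, and the writer uses it for accounting.
interface SizeAwareDocumentWriter {
    long addDocuments(Iterable<String> docs, String uid, int size) throws IOException;
}

class CountingWriter implements SizeAwareDocumentWriter {
    long pendingDocs = 0;

    @Override
    public long addDocuments(Iterable<String> docs, String uid, int size) {
        // trust the caller-supplied size instead of iterating to count
        pendingDocs += size;
        return pendingDocs;
    }
}
```

A caller would pass `docs.size()` alongside the documents, mirroring the pattern in the updated engine code.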

server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (1)

239-245: LGTM: Correct propagation of document counts.

The changes correctly pass docs.size() to the updated addDocuments and softUpdateDocuments methods for multi-document operations, while single-document paths remain unchanged.

Also applies to: 247-260

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (9)

131-144: LGTM: Well-documented pending document counter.

The childWriterPendingNumDocs field and its comprehensive documentation clearly explain the tracking mechanism and acknowledge the acceptable temporary overshoot during refresh transitions.


352-356: LGTM: Defensive check prevents use-after-close.

The isClosed() check after acquiring the lock properly handles the race where a lookup is closed between lock acquisition and use, correctly releasing the lock before returning null.


545-563: LGTM: Correct pending document accounting during refresh.

The logic properly tracks pending documents from old child writers before closing them, then adjusts the global counter after addIndexes to prevent double-counting once documents are transferred to the parent writer.


705-742: LGTM: Robust exception handling for flushing bytes calculation.

The refactored getFlushingBytesUtil correctly handles AlreadyClosedException by only rethrowing when a tragic exception exists, preventing spurious failures while ensuring real corruption issues are surfaced.


744-748: LGTM: Correct aggregation of pending documents.


770-784: LGTM: Comprehensive tragic exception checking.

The method correctly checks for tragic exceptions across current child writers, old child writers, and the accumulating parent writer, returning the first exception encountered.


786-833: LGTM: Consistent exception handling for RAM usage calculation.

The refactoring mirrors getFlushingBytesUtil with the same robust exception handling pattern, maintaining consistency across resource tracking methods.


853-870: LGTM: Comprehensive rollback prevents resource leaks.

The updated rollback logic correctly rolls back all child writers (current and old) before the parent writer, ensuring complete cleanup as documented in the comments.


494-501: The loop termination guarantees are sound as written.

The while loop correctly handles map rotation during refresh. The tryAcquire() method returns null when the read lock cannot be acquired, and it releases the lock and returns null when the lookup is already closed. Since liveIndexWriterDeletesMap is volatile and rotates during refresh, creating a new current map that is not closed, the loop will eventually acquire a read lock on a stable, non-closed lookup and terminate.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

243-257: LGTM: Convenient test configuration overload.


509-564: LGTM: Well-implemented test utility for flush behavior.

The FlushingIndexWriterFactory provides a clean test harness for verifying flush-related behavior by wrapping IndexWriter operations. The directory tracking and proper cleanup via IOUtils.close ensure no resource leaks in tests.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

34-34: LGTM: Test calls updated for new API signatures.

All test methods correctly pass operation.docs().size() to the updated addDocuments and softUpdateDocuments methods, maintaining existing test semantics.

Also applies to: 76-76, 114-114, 122-130, 219-219, 227-235, 263-263, 268-276


154-204: LGTM: Good coverage of concurrent delete scenario.

The new test effectively exercises the race condition where a delete operation targets an old child writer during refresh, using proper thread synchronization and safe cleanup. The test verifies correct eventual consistency after refresh completes.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)

40-40: LGTM - necessary import for new test infrastructure.

The Supplier import is appropriately added to support the new FlushingIndexWriterFactory pattern used in tragic-exception tests.


977-997: softUpdateDocuments() signature updates are correct.

The calls to softUpdateDocuments() have been properly updated to include the document size parameter, maintaining consistency with the size-aware API pattern.


537-557: Resource management is adequate.

The tragic-exception tests properly use IOUtils.closeWhileHandlingException() to ensure cleanup of CompositeIndexWriter and FlushingIndexWriterFactory instances, even when exceptions occur during test execution.

Also applies to: 573-608, 661-681, 697-732, 785-805, 821-856


591-607: Verify thread coordination logic in refresh tests.

These tests hold a write lock while spawning a refresher thread that calls beforeRefresh(), which likely requires the same lock. The test pattern may not achieve the intended scenario of testing "writer on old map" if the refresher thread is blocked waiting for the lock held by the main thread.

Consider verifying whether beforeRefresh() successfully transitions the writer to the old map or if it remains blocked. You might add logging or assertions to confirm the refresher thread's state:

AtomicBoolean refreshStarted = new AtomicBoolean(false);
Thread refresher = new Thread(() -> {
    latch.countDown();
    refreshStarted.set(true);
    try {
        compositeIndexWriter.beforeRefresh();
    } catch (Exception ignored) {}
});

refresher.start();
latch.await();
Thread.sleep(50); // Give refresher time to attempt beforeRefresh
// Add assertion to verify intended state was reached

Also applies to: 627-644, 715-731, 751-768, 839-855, 928-946


309-312: LGTM - formatting improvements enhance readability.

The multi-line formatting of expectThrows() calls improves code readability while maintaining functional correctness.

Also applies to: 884-887

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)

168-183: The soft-delete marker invariant is correctly maintained. When deleteDocument() is called with isStaleOperation=true, the doc parameter always contains the soft-delete field. Both production code paths (InternalEngine and IngestionEngine) explicitly call doc.add(softDeletesField) before invoking deleteDocument(), and all test cases follow the same pattern. Therefore, addDocument(doc) always adds a soft-deleted document, preventing any risk of accidentally creating a searchable live document.

- Fix SearchPhaseExecutionException to properly initCause ([#20320](https://github.com/opensearch-project/OpenSearch/pull/20320))
- Fix `cluster.remote.<cluster_alias>.server_name` setting not populating SNI ([#20321](https://github.com/opensearch-project/OpenSearch/pull/20321))
- Fix X-Opaque-Id header propagation (along with other response headers) for streaming Reactor Netty 4 transport ([#20371](https://github.com/opensearch-project/OpenSearch/pull/20371))
- Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
Contributor


⚠️ Potential issue | 🟡 Minor

Inconsistent PR reference format in changelog entry.

The PR reference at line 34 is missing the # symbol that is consistently used in other entries in the same section. Update to match the existing format: ([#20145](...) instead of ([20145](...).

🔧 Proposed fix
- Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
+ Fix indexing regression and bug fixes for grouping criteria. ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
- Fix indexing regression and bug fixes for grouping criteria. ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
🤖 Prompt for AI Agents
In @CHANGELOG.md at line 34, The changelog entry "Fix indexing regression and
bug fixes for grouping criteria.
([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" uses an
inconsistent PR reference format; update that entry to include the hash symbol
so the link reads
"([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))", i.e.,
replace "([20145](" with "([#20145](" for the PR reference to match other
entries.

Comment on lines +153 to +168
public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
"context_aware_grouping",
fieldType,
new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
);
LeafReader leafReader = mock(LeafReader.class);

try {
mapper.canDeriveSource();
mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
} catch (Exception e) {
fail(e.getMessage());
}
}
Contributor


⚠️ Potential issue | 🟡 Minor

Fix resource leak and complete XContentBuilder lifecycle.

The test has two issues:

  1. Resource leak: The XContentBuilder created on Line 164 is never closed. Although this is a test, it's still a best practice to properly close resources.
  2. Incomplete builder: The builder is started with startObject() but never closed with endObject().
🔧 Proposed fix
     public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
         ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
         ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
             "context_aware_grouping",
             fieldType,
             new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
         );
         LeafReader leafReader = mock(LeafReader.class);
 
-        try {
+        try (XContentBuilder builder = XContentFactory.jsonBuilder().startObject()) {
             mapper.canDeriveSource();
-            mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-        } catch (Exception e) {
-            fail(e.getMessage());
+            mapper.deriveSource(builder, leafReader, 0);
+            builder.endObject();
         }
     }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
    ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
    ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
        "context_aware_grouping",
        fieldType,
        new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
    );
    LeafReader leafReader = mock(LeafReader.class);

    try {
        mapper.canDeriveSource();
        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
    } catch (Exception e) {
        fail(e.getMessage());
    }
}
public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
    ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
    ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
        "context_aware_grouping",
        fieldType,
        new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
    );
    LeafReader leafReader = mock(LeafReader.class);

    try (XContentBuilder builder = XContentFactory.jsonBuilder().startObject()) {
        mapper.canDeriveSource();
        mapper.deriveSource(builder, leafReader, 0);
        builder.endObject();
    }
}
🤖 Prompt for AI Agents
In
@server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
around lines 153 - 168, The test method
testContextAwareFieldMapperWithDerivedSource leaks the XContentBuilder and never
finishes the JSON object; change creation of the builder from
XContentFactory.jsonBuilder().startObject() to use a try-with-resources (or
explicitly close) and call endObject() before passing it to
mapper.deriveSource(...), ensuring the XContentBuilder is closed in all paths so
the resource is not leaked when invoking mapper.canDeriveSource() and
mapper.deriveSource(...).

@github-actions
Contributor

github-actions bot commented Jan 9, 2026

❌ Gradle check result for e9991cc: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)

717-753: Clean removal of LookupMapLockAcquisitionException handling, but update stale documentation.

The removal from the onComplete method is complete and correct. The retry logic for version conflicts (lines 724-729) and response processing flow properly. The isConflictException() method correctly handles only VersionConflictEngineException.

However, a stale reference remains: server/src/main/java/org/opensearch/index/IndexSettings.java:515 contains a comment mentioning LookupMapLockAcquisitionException. Update this comment to remove the reference to the now-deleted exception.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (3)

486-513: Leaked map read-lock in writer lookup path (can deadlock refresh).

LiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteria(...) acquires this.current.mapReadLock.tryAcquire() in a loop but never releases it on the success path (Lines 494-505). That will prevent beforeRefresh() from acquiring the corresponding write lock (Line 532), potentially hanging refresh forever. The same applies to getAssociatedIndexWriterForCriteria(...) callers like addDocuments/addDocument/softUpdate* which don’t close that lock.

Proposed fix (always release the read lock)
         DisposableIndexWriter computeIndexWriterIfAbsentForCriteria(
             String criteria,
             CheckedBiFunction<String, CriteriaBasedIndexWriterLookup, DisposableIndexWriter, IOException> indexWriterSupplier,
             ShardId shardId
         ) {
-            boolean success = false;
             CriteriaBasedIndexWriterLookup current = null;
             try {
                 while (current == null || current.isClosed()) {
                     current = this.current.mapReadLock.tryAcquire();
                 }
-
-                DisposableIndexWriter writer = current.computeIndexWriterIfAbsentForCriteria(criteria, indexWriterSupplier);
-                success = true;
-                return writer;
+                return current.computeIndexWriterIfAbsentForCriteria(criteria, indexWriterSupplier);
             } finally {
-                if (success == false && current != null) {
-                    assert current.mapReadLock.isHeldByCurrentThread() == true;
-                    current.mapReadLock.close();
-                }
+                if (current != null) {
+                    assert current.mapReadLock.isHeldByCurrentThread();
+                    current.mapReadLock.close();
+                }
             }
         }

Also applies to: 916-932, 934-948, 950-1003


646-666: Leaked map read-lock on successful getIndexWriterForIdFromLookup return.

getIndexWriterForIdFromLookup(...) acquires indexWriterLookup.mapReadLock (Line 650-651) and only releases it when returning null. When it returns a DisposableIndexWriter, the lock stays held (Line 654-657), and subsequent code acquires the lock again and only releases once, leaving a permanent hold.

Proposed fix (always release before returning)
     DisposableIndexWriter getIndexWriterForIdFromLookup(BytesRef uid, CriteriaBasedIndexWriterLookup indexWriterLookup) {
-        boolean isCriteriaNotNull = false;
         try {
             indexWriterLookup.mapReadLock.acquire();
             String criteria = indexWriterLookup.getCriteriaForDoc(uid);
             if (criteria != null) {
                 DisposableIndexWriter disposableIndexWriter = indexWriterLookup.getIndexWriterForCriteria(criteria);
                 if (disposableIndexWriter != null) {
-                    isCriteriaNotNull = true;
                     return disposableIndexWriter;
                 }
             }

             return null;
         } finally {
-            if (isCriteriaNotNull == false) {
-                indexWriterLookup.mapReadLock.close();
-            }
+            indexWriterLookup.mapReadLock.close();
         }
     }

545-563: Reorder to read pending docs before closing the writer.

Calling getPendingNumDocs() after close() will throw AlreadyClosedException because Lucene's IndexWriter enforces the closed state. This will cause the pending-doc reconciliation to fail. Move line 553 before line 552 to capture the pending document count before closing.

Proposed fix
         for (CompositeIndexWriter.DisposableIndexWriter childDisposableWriter : markForRefreshIndexWritersMap.values()) {
             directoryToCombine.add(childDisposableWriter.getIndexWriter().getDirectory());
-            childDisposableWriter.getIndexWriter().close();
             pendingNumDocsByOldChildWriter.addAndGet(childDisposableWriter.getIndexWriter().getPendingNumDocs());
+            childDisposableWriter.getIndexWriter().close();
         }
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

114-165: Concurrency tests likely too weak / swallow failures.

  • testConcurrentIndexingDuringRefresh: run.set(false) is called immediately after starting threads (Line 202-204), so the loops may not execute meaningfully.
  • testConcurrentComputeIndexWriterWithMapRotation: empty catch (Exception e) {} in the compute thread hides the actual failure mode (Line 135-138).

Consider adding a minimum-duration/iteration barrier and failing on unexpected exceptions.

Also applies to: 166-209
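The suggested hardening can be sketched like this (names are hypothetical, not the test's actual identifiers): the worker runs for a minimum number of iterations instead of being stopped immediately, and unexpected exceptions are recorded and rethrown rather than swallowed.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

public class MinIterationSketch {
    static int runWorker(int minIterations) {
        AtomicInteger iterations = new AtomicInteger();
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            try {
                while (iterations.incrementAndGet() < minIterations) {
                    // ... exercise the compute/rotation path here ...
                }
            } catch (Throwable t) {
                failure.compareAndSet(null, t); // never ignore silently
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        if (failure.get() != null) {
            throw new AssertionError("worker failed", failure.get());
        }
        return iterations.get();
    }
}
```

This guarantees the loop body actually executed a known number of times, and any exception from the worker fails the test instead of vanishing in an empty catch block.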

🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 131-145: The deleteInLucene method incorrectly increments
childWriterPendingNumDocs when it performs deletions on the parent
accumulatingIndexWriter; remove that increment (or guard it) so
childWriterPendingNumDocs is only modified for child-level writer operations.
Specifically, in deleteInLucene avoid calling
childWriterPendingNumDocs.increment (or only call it when the operation targets
a child IndexWriter rather than accumulatingIndexWriter) so getPendingNumDocs()
no longer double-counts parent operations; update any accompanying comments to
reflect that childWriterPendingNumDocs tracks only child-level modifications.
🧹 Nitpick comments (6)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)

509-564: Consider thread-safety for the directories list.

The directories list is accessed without synchronization in createWriter and close. If createWriter can be called from multiple threads during test execution, this could lead to race conditions.

🔒 Proposed fix to use thread-safe collection
 protected static class FlushingIndexWriterFactory extends NativeLuceneIndexWriterFactory implements Closeable {
 
     private final Supplier<Directory> failingWriteDirectorySupplier;
-    private final List<Directory> directories;
+    private final List<Directory> directories = Collections.synchronizedList(new ArrayList<>());
     private final AtomicBoolean useFailingDirectorySupplier;
 
     FlushingIndexWriterFactory(Supplier<Directory> failingWriteDirectorySupplier, AtomicBoolean useFailingDirectorySupplier) {
         this.failingWriteDirectorySupplier = failingWriteDirectorySupplier;
-        this.directories = new ArrayList<>();
         this.useFailingDirectorySupplier = useFailingDirectorySupplier;
     }

Alternatively, if test execution guarantees single-threaded access, the current implementation is acceptable.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

154-204: Consider improving exception handling and removing unused variable.

The new concurrent delete test has a couple of minor issues:

  1. Line 176: The beforeRefresh exception is silently ignored, which could mask real failures during test execution.
  2. Line 171: The AtomicBoolean run variable is created but never used in the test logic.
♻️ Proposed improvements
         CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
         CountDownLatch latch = new CountDownLatch(1);
-        AtomicBoolean run = new AtomicBoolean(true);
         Thread refresher = new Thread(() -> {
             latch.countDown();
             try {
                 compositeIndexWriter.beforeRefresh();
-            } catch (Exception ignored) {}
+            } catch (Exception e) {
+                fail("beforeRefresh failed: " + e.getMessage());
+            }
         });

Alternatively, if silently catching the exception is intentional for this test scenario, consider adding a comment explaining why.


156-162: Inconsistent resource management pattern.

Unlike other tests in this class (e.g., lines 24-60), this test initializes compositeIndexWriter outside the try-finally block. If an exception occurs between initialization (line 156) and entering the finally block (line 191), the writer won't be properly closed.

♻️ Align with the pattern used in other tests
     public void testDeleteWithDocumentInOldChildWriter() throws IOException, InterruptedException {
         final String id = "test";
-        CompositeIndexWriter compositeIndexWriter = new CompositeIndexWriter(
-            config(),
-            createWriter(),
-            newSoftDeletesPolicy(),
-            softDeletesField,
-            indexWriterFactory
-        );
-
-        Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
-        try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
-            compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
-        }
-
-        CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
-        CountDownLatch latch = new CountDownLatch(1);
-        AtomicBoolean run = new AtomicBoolean(true);
-        Thread refresher = new Thread(() -> {
-            latch.countDown();
-            try {
-                compositeIndexWriter.beforeRefresh();
-            } catch (Exception ignored) {}
-        });
-
-        refresher.start();
+        CompositeIndexWriter compositeIndexWriter = null;
         try {
-            latch.await();
-            compositeIndexWriter.deleteDocument(
-                operation.uid(),
-                false,
-                newDeleteTombstoneDoc(id),
-                1,
-                2,
-                primaryTerm.get(),
-                softDeletesField
+            compositeIndexWriter = new CompositeIndexWriter(
+                config(),
+                createWriter(),
+                newSoftDeletesPolicy(),
+                softDeletesField,
+                indexWriterFactory
             );
+
+            Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
+            try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
+                compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
+            }
+
+            CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
+            CountDownLatch latch = new CountDownLatch(1);
+            Thread refresher = new Thread(() -> {
+                latch.countDown();
+                try {
+                    compositeIndexWriter.beforeRefresh();
+                } catch (Exception ignored) {}
+            });
+
+            refresher.start();
+            try {
+                latch.await();
+                compositeIndexWriter.deleteDocument(
+                    operation.uid(),
+                    false,
+                    newDeleteTombstoneDoc(id),
+                    1,
+                    2,
+                    primaryTerm.get(),
+                    softDeletesField
+                );
+            } finally {
+                IOUtils.closeWhileHandlingException(lock.getMapReadLock());
+                refresher.join();
+                compositeIndexWriter.afterRefresh(true);
+                compositeIndexWriter.beforeRefresh();
+                compositeIndexWriter.afterRefresh(true);
+                try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) {
+                    assertEquals(0, directoryReader.numDocs());
+                }
+            }
         } finally {
-            IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-            run.set(false);
-            refresher.join();
-            compositeIndexWriter.afterRefresh(true);
-            compositeIndexWriter.beforeRefresh();
-            compositeIndexWriter.afterRefresh(true);
-            try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) {
-                assertEquals(0, directoryReader.numDocs());
+            if (compositeIndexWriter != null) {
+                IOUtils.closeWhileHandlingException(compositeIndexWriter);
             }
-
-            IOUtils.closeWhileHandlingException(compositeIndexWriter);
         }
     }

Also applies to: 202-202

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)

316-375: tryAcquire(TimeValue) should mirror the “closed lookup” guard.

CriteriaBasedWriterLock.tryAcquire() now checks lookup.isClosed() and releases (Lines 352-355), but tryAcquire(TimeValue) does not (Lines 366-374). That’s a potential inconsistency under rotation/close timing.


853-870: Rollback should be best-effort across writers to avoid partial leaks.

As written, a single child rollback exception can prevent rolling back remaining writers + parent. Consider collecting/suppressing exceptions so you always attempt to rollback everything.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

404-427: Good coverage for tragic-exception/closed-writer behavior, but consider tightening assertions.

A lot of the new tests intentionally ignore Error (e.g., catch (Error ignored) {}) and then assert closed/ACE behavior later; that’s fine, but where possible, assert that a tragic exception was actually recorded (to reduce false positives).

Also applies to: 523-857, 911-947

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e9991cc and 4f31742.

📒 Files selected for processing (19)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
  • server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
  • server/src/main/java/org/opensearch/index/engine/InternalEngine.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
  • server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
  • server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
  • server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
  • server/src/main/java/org/opensearch/OpenSearchServerException.java
  • server/src/test/java/org/opensearch/ExceptionSerializationTests.java
  • server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
  • server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
  • test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
🚧 Files skipped from review as they are similar to previous changes (4)
  • CHANGELOG.md
  • server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
  • server/src/main/java/org/opensearch/index/mapper/MapperService.java
  • server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
server/src/main/java/org/opensearch/script/ContextAwareGroupingScript.java (1)
  • ContextAwareGroupingScript (22-44)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
  • IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
  • GitHub Check: gradle-check
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: Analyze (java)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: Mend Security Check
🔇 Additional comments (17)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (2)

11-11: LGTM: Imports are appropriate.

The new imports support the derived source testing in the new test method.

Also applies to: 13-13, 20-20


153-168: The test correctly validates no-op override methods and does not require the suggested assertions.

The canDeriveSource() method returns void (not a value to assert), and deriveSource() is an intentional no-op override that does nothing with the builder. These methods override parent class methods to prevent them from throwing UnsupportedOperationException, as documented in the javadoc comments. The test appropriately verifies that both methods can be called without throwing exceptions, which is the correct behavior for no-op overrides.

The XContentBuilder closure is the only minor style point, but it's immaterial since the builder is unused by the no-op method.

Likely an incorrect or invalid review comment.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)

17-17: LGTM!

The new imports support the FlushingIndexWriterFactory implementation and are all appropriately used.

Also applies to: 81-81, 89-89


243-257: LGTM!

The convenience overload correctly delegates to the existing config method with appropriate preset parameters.

server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (2)

239-245: LGTM!

The multi-document path correctly propagates the document count via the new docs.size() parameter, while the single-document path remains unchanged as expected.


247-260: LGTM!

The document count is correctly passed to softUpdateDocuments before the varargs parameter. Single-document updates appropriately use the existing API without the size parameter.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (2)

55-55: LGTM!

The new size parameter enables explicit document count tracking for multi-document operations, supporting improved RAM accounting and pending document management.


59-67: LGTM!

The size parameter is correctly positioned before the varargs softDeletesField parameter, maintaining Java syntax requirements while enabling consistent document count tracking across update operations.

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)

133-136: Note: size parameter unused in delegation.

The size parameter is accepted per the DocumentIndexWriter interface but not passed to the underlying Lucene IndexWriter. Since standard Lucene writers handle their own document counting, this is acceptable. The parameter is utilized by other implementations like CompositeIndexWriter for custom accounting.


143-154: Note: size parameter unused in delegation.

Consistent with addDocuments, the size parameter is accepted for interface compliance but not propagated to Lucene's native softUpdateDocuments. This is appropriate for this wrapper implementation.
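A hedged sketch of why an unused `size` parameter is acceptable in this wrapper (names below are illustrative): the wrapper accepts `size` only for interface compliance, while the delegate derives its own count, as a plain Lucene `IndexWriter` would.

```java
import java.util.List;

public class DelegationDemo {
    // Hypothetical size-aware interface, as in the PR's intent.
    interface DocWriter {
        void addDocuments(List<String> docs, int size);
    }

    // Stand-in for a delegate that counts documents itself.
    static class SelfCountingDelegate {
        long added = 0;

        void addDocuments(List<String> docs) {
            added += docs.size(); // delegate derives the count on its own
        }
    }

    // Wrapper analogous to the LuceneIndexWriter note above.
    static class PassThroughWriter implements DocWriter {
        final SelfCountingDelegate delegate = new SelfCountingDelegate();

        @Override
        public void addDocuments(List<String> docs, int size) {
            // 'size' intentionally unused: the delegate keeps its own count
            delegate.addDocuments(docs);
        }
    }

    public static void main(String[] args) {
        PassThroughWriter w = new PassThroughWriter();
        w.addDocuments(List.of("a", "b"), 2);
        System.out.println("delegate counted " + w.delegate.added);
    }
}
```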

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

16-17: LGTM!

The new imports support the concurrent delete test scenario added at line 154.


34-34: LGTM!

All test call sites correctly updated to use the new size-aware API signatures, consistently passing operation.docs().size() as the document count parameter.

Also applies to: 76-76, 114-114, 122-130, 219-219, 227-235, 263-263, 268-276

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)

1240-1247: Pass-through docCount looks correct for multi-doc appends.

indexWriter.addDocuments(docs, uid, docs.size()) aligns with the actual number of Lucene docs being added and matches numDocAppends.inc(docs.size()).


1249-1258: Pass-through docCount looks correct for stale multi-doc appends.

Same as normal append: docs.size() is the right count to forward when all docs are added (even with soft-deletes field injected).


1369-1390: Pass-through docCount looks correct for multi-doc soft updates.

Forwarding docs.size() to softUpdateDocuments(...) is consistent with how numDocUpdates is incremented and with the intent of per-op doc accounting.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)

704-743: Defensive AlreadyClosedException handling for RAM/flush stats is sensible.

The “ignore ACE unless tragic” approach in getFlushingBytesUtil(...) and ramBytesUsedUtil(...) matches the test intent and avoids spurious failures during benign refresh/close races.

Also applies to: 769-784, 795-834
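The pattern can be sketched as follows. This is a self-contained model, not the actual OpenSearch code: `AlreadyClosedException` is a stand-in class for Lucene's, and `StatsSource` is a hypothetical interface standing in for the writer.

```java
public class TragicAwareStatsDemo {
    // Stand-in for Lucene's AlreadyClosedException, kept self-contained.
    static class AlreadyClosedException extends RuntimeException {}

    // Hypothetical view of a writer exposing stats and a tragic-event flag.
    interface StatsSource {
        long ramBytesUsed();            // may throw AlreadyClosedException
        Throwable getTragicException(); // non-null if the writer died tragically
    }

    // "Ignore ACE unless tragic": a benign close race yields 0 bytes,
    // but a tragic failure is propagated to the caller.
    static long safeRamBytesUsed(StatsSource writer) {
        try {
            return writer.ramBytesUsed();
        } catch (AlreadyClosedException ace) {
            if (writer.getTragicException() != null) {
                throw ace; // writer failed for real; surface it
            }
            return 0L; // benign refresh/close race; report nothing
        }
    }

    public static void main(String[] args) {
        StatsSource benignlyClosed = new StatsSource() {
            public long ramBytesUsed() { throw new AlreadyClosedException(); }
            public Throwable getTragicException() { return null; }
        };
        System.out.println("ram=" + safeRamBytesUsed(benignlyClosed));
    }
}
```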

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)

166-207: Test updates for new size parameter look consistent.

Most call sites pass operation.docs().size() which is the right value for multi-doc operations, and the updated softUpdateDocuments(..., size, ...) calls match the new signature.

Also applies to: 210-270, 318-343, 358-402, 429-490, 492-520, 858-910, 965-1009

Comment on lines +131 to 145
/**
* pendingNumDocs is used to track pendingNumDocs for child level IndexWriters. Since pendingNumDocs is incremented
* (by one) only in DocumentsWriterPerThread#reserveOneDoc for any index or update operation, we keep incrementing
* pendingNumDocs by one for each of these operations. We increment this value whenever we call following functions
* on childWriter:
* - softUpdateDocument
* - softUpdateDocuments
* - addDocuments
* - addDocument
*
* This value may temporarily overshoot during refresh due to double counting a few documents in both the old child
* IndexWriters and the parent, which should be OK as undershooting pendingNumDocs can be problematic.
*/
private final AtomicLong childWriterPendingNumDocs = new AtomicLong();


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

find . -name "CompositeIndexWriter.java" -type f

Repository: opensearch-project/OpenSearch

Length of output: 149


🏁 Script executed:

wc -l ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 154


🏁 Script executed:

sed -n '131,145p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 866


🏁 Script executed:

sed -n '740,755p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 472


🏁 Script executed:

sed -n '1059,1080p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 892


🏁 Script executed:

sed -n '1000,1060p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 3346


🏁 Script executed:

rg 'childWriterPendingNumDocs.incrementAndGet\(\)' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java -n

Repository: opensearch-project/OpenSearch

Length of output: 400


🏁 Script executed:

sed -n '940,950p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 521


🏁 Script executed:

sed -n '991,1001p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 1021


🏁 Script executed:

sed -n '1034,1054p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 1483


🏁 Script executed:

sed -n '900,950p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 2145


🏁 Script executed:

sed -n '800,850p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 2378


🏁 Script executed:

rg 'addDeleteEntryToWriter' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java -B 5 -A 10

Repository: opensearch-project/OpenSearch

Length of output: 3395


🌐 Web query:

Lucene IndexWriter getPendingNumDocs() method semantics Lucene 9 documentation

💡 Result:

IndexWriter.getPendingNumDocs() returns the current total number of documents in the index including documents that are in the process of being added (i.e., reserved but not yet fully committed/visible). The API is marked experimental and its semantics may change across releases. [1][2]

Sources:

  • Lucene IndexWriter javadoc (core docs). [1]
  • Lucene core API index listing (getPendingNumDocs). [2]

🏁 Script executed:

sed -n '1055,1075p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java

Repository: opensearch-project/OpenSearch

Length of output: 769


childWriterPendingNumDocs incorrectly increments for parent writer operations in deleteInLucene.

deleteInLucene(...) modifies accumulatingIndexWriter (the parent) but still increments childWriterPendingNumDocs (line 1072). Since getPendingNumDocs() returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs() (lines 747-748), the operations on the parent writer get counted twice: once in accumulatingIndexWriter.getPendingNumDocs() and again in childWriterPendingNumDocs. Per the javadoc (lines 131-145), childWriterPendingNumDocs should only track child-level operations, not parent operations. This causes inflated pending doc counts and can trigger false document limit rejections.
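A minimal model of the miscount (hypothetical counters, not the real class): the parent writer already reserves the document in its own pending count, so bumping the child-level counter for the same operation makes the composite total report the document twice.

```java
import java.util.concurrent.atomic.AtomicLong;

public class DoubleCountDemo {
    // Simplified stand-in for the composite writer's accounting.
    static class CompositeCounts {
        final AtomicLong childWriterPendingNumDocs = new AtomicLong();
        final AtomicLong parentPendingNumDocs = new AtomicLong();

        // Models the buggy path: the delete goes to the parent writer,
        // which reserves the doc itself, yet the child counter is bumped too.
        void deleteInLuceneBuggy() {
            parentPendingNumDocs.incrementAndGet();      // parent reserves the doc
            childWriterPendingNumDocs.incrementAndGet(); // bug: counted again
        }

        // Mirrors getPendingNumDocs(): child counter + parent's own count.
        long getPendingNumDocs() {
            return childWriterPendingNumDocs.get() + parentPendingNumDocs.get();
        }
    }

    public static void main(String[] args) {
        CompositeCounts c = new CompositeCounts();
        c.deleteInLuceneBuggy();
        // One real document, but the composite total reports two pending docs.
        System.out.println("pending=" + c.getPendingNumDocs());
    }
}
```

Dropping the child-counter increment on the parent path (as in the proposed fix below the diff) restores a one-to-one count.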

Proposed fix
     private void deleteInLucene(
         Term uid,
         boolean isStaleOperation,
         IndexWriter currentWriter,
         Iterable<? extends IndexableField> doc,
         Field... softDeletesField
     ) throws IOException {
         if (isStaleOperation) {
             currentWriter.addDocument(doc);
         } else {
             currentWriter.softUpdateDocument(uid, doc, softDeletesField);
         }
-
-        childWriterPendingNumDocs.incrementAndGet();
     }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 131 - 145, The deleteInLucene method incorrectly increments
childWriterPendingNumDocs when it performs deletions on the parent
accumulatingIndexWriter; remove that increment (or guard it) so
childWriterPendingNumDocs is only modified for child-level writer operations.
Specifically, in deleteInLucene avoid calling
childWriterPendingNumDocs.incrementAndGet() (or only call it when the operation targets
a child IndexWriter rather than accumulatingIndexWriter) so getPendingNumDocs()
no longer double-counts parent operations; update any accompanying comments to
reflect that childWriterPendingNumDocs tracks only child-level modifications.

@github-actions

github-actions bot commented Jan 9, 2026

✅ Gradle check result for 4f31742: SUCCESS
