Fixing indexing regression and bug fixes for grouping criteria #20145
Conversation
📝 Walkthrough

Removed a custom lookup-lock exception and its retry handling; added size-aware multi-document write APIs and child-writer pending-doc accounting; hardened RAM/flush/tragic-exception handling and codec selection; narrowed composite-field-type filtering; updated and extended tests; changelog text updated.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant TransportShardBulkAction
    participant CompositeIndexWriter
    participant CriteriaBasedIndexWriterLookup
    participant ChildIndexWriter

    Client->>TransportShardBulkAction: submit bulk request
    TransportShardBulkAction->>CompositeIndexWriter: index document(s)
    CompositeIndexWriter->>CriteriaBasedIndexWriterLookup: try to acquire read lock / lookup map
    alt lookup map unavailable
        CriteriaBasedIndexWriterLookup-->>CompositeIndexWriter: null/closed
        CompositeIndexWriter->>CompositeIndexWriter: handle closed map (no lookup-lock exception retry)
    else map acquired
        CriteriaBasedIndexWriterLookup-->>CompositeIndexWriter: LiveIndexWriterDeletesMap
        CompositeIndexWriter->>ChildIndexWriter: addDocument(s)/softUpdateDocuments(with size)
        ChildIndexWriter-->>CompositeIndexWriter: success or tragic exception
        CompositeIndexWriter->>CompositeIndexWriter: update childWriterPendingNumDocs and metrics
    end
    CompositeIndexWriter-->>TransportShardBulkAction: response
    TransportShardBulkAction-->>Client: bulk response
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 2
🧹 Nitpick comments (2)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
215-215: Reduced encapsulation of `mapReadLock`.

The visibility of `mapReadLock` has been changed from `private final` to package-private, allowing direct access from other classes in the same package. This field controls critical concurrency behavior, and exposing it directly increases the risk of misuse. Consider:

- Keeping the field `private` and exposing only necessary operations through methods (e.g., `tryAcquireLock()`; see the sketch below).
- If package-private access is required for the retry logic, adding clear documentation about proper usage patterns and thread-safety requirements.
- Restricting access through a package-private accessor method rather than exposing the field directly.
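A minimal sketch of the accessor-based alternative, using a plain JDK read-write lock as a stand-in for the PR's custom lock type; the class and method names here are hypothetical, not taken from the codebase:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

final class LookupLockEncapsulationSketch {
    // Kept private: callers can only go through the methods below.
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    /** Package-private entry point for the retry logic; the field itself stays hidden. */
    boolean tryAcquireReadLock() {
        return rwLock.readLock().tryLock();
    }

    /** Matching release so callers never touch the lock field directly. */
    void releaseReadLock() {
        rwLock.readLock().unlock();
    }
}
```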
498-498: Simplify boolean comparisons.

The condition uses explicit `== false` and `== true` comparisons, which are redundant in Java. Apply this diff:

```diff
-if (success == false && current != null && current.mapReadLock.isHeldByCurrentThread() == true) {
+if (!success && current != null && current.mapReadLock.isHeldByCurrentThread()) {
     current.mapReadLock.close();
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- CHANGELOG.md (1 hunks)
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1 hunks)
- server/src/main/java/org/opensearch/index/IndexSettings.java (1 hunks)
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (10 hunks)
- server/src/main/java/org/opensearch/index/mapper/MapperService.java (2 hunks)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java (0 hunks)
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (4 hunks)
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1 hunks)
💤 Files with no reviewable changes (1)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: gradle-check
🔇 Additional comments (13)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)
87-87: LGTM! Import correctly added to support the stream operations in `getCompositeFieldTypes()`.
693-697: Verify the behavior change scope and call frequency.

The filtering to return only `StarTreeMapper.StarTreeFieldType` instances represents a narrowed scope from returning all composite field types. Confirm this change is intentional and whether any callers expect other `CompositeMappedFieldType` implementations. Additionally, verify the call frequency of this method; if invoked on hot paths, consider caching the filtered result to avoid repeated stream collection operations.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
106-106: LGTM! The test constant is appropriately set to a lower value (20) than the production default (100) for faster test execution while still being within the valid range (5-500).
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
44-46: LGTM! Mockito imports correctly added to support the new verification test.
71-77: LGTM! Method call correctly updated to include the `MAX_NUMBER_OF_RETRIES` parameter, aligning with the new bounded retry API.
141-146: LGTM! Method call correctly updated with retry parameter.
197-202: LGTM! Method call correctly updated with retry parameter.
208-227: Test validates bounded retry semantics correctly.

The test properly verifies:

- `LookupMapLockAcquisitionException` is thrown after exhausting retries
- `tryAcquire()` is called exactly `MAX_NUMBER_OF_RETRIES` times

One consideration: the mock setup directly assigns to `map.current` and `map.current.mapReadLock`, which accesses package-private fields. This works for testing but creates tight coupling to internal implementation details.

server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
724-753: Retry logic moved to lower layer - verify exception handling.

The `LookupMapLockAcquisitionException` retry logic has been removed from bulk action handling and moved to `CompositeIndexWriter` with bounded retries. This architectural approach places retry logic closer to where the exception originates. Ensure that when `LookupMapLockAcquisitionException` propagates up after max retries are exhausted, it is properly handled and doesn't cause unexpected bulk operation failures.

server/src/main/java/org/opensearch/index/IndexSettings.java (1)
499-506: Significant default value change - verify upgrade impact.

The default retry count increased to 100 with a maximum of 500. Since this is a dynamic setting, existing indices will apply the new default upon upgrade. Consider whether this change should be documented in release notes for operators who have tuned their clusters based on previous defaults.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (3)
691-693: LGTM: Metrics gathering refactoring.

The refactoring from stream-based iteration to explicit for-loops improves code clarity and performance for these simple aggregation operations. The logic is correct in all cases, with proper handling of both current and old maps where necessary, and appropriate locking in `ramBytesUsed()`.

Also applies to: 702-704, 731-742, 758-770, 796-806
210-210: Verify removal of `final` modifier is intentional.

The `final` modifier has been removed from `CriteriaBasedIndexWriterLookup`, `CriteriaBasedWriterLock`, and `LiveIndexWriterDeletesMap`. This allows subclassing of these internal implementation classes. Confirm whether:

- Subclassing is required for test mocking/stubbing.
- If so, consider restricting visibility to test scope or using sealed classes.
- If intentional for production extensibility, document extension points and invariants.

Also applies to: 301-301, 406-406
678-679: Verify retry configuration defaults and bounds.

The `maxRetryOnLookupMapAcquisitionException` setting controls retry behavior for lookup map acquisition. Without access to the codebase, I cannot confirm the specific default values, upper/lower bounds, or whether they align with industry best practices for lock acquisition retry mechanisms. Ensure the following:

- Default retry count is reasonable (typically 3-10 attempts for in-thread locks per best practices)
- Upper bounds prevent excessive retries that could cause performance degradation
- Configuration is documented and tunable per workload expectations
- Retry strategy includes backoff/jitter to avoid retry storms, not just blind spinning (see the sketch below)
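A self-contained sketch of the jittered-backoff pattern the last bullet asks for; the class and method names are hypothetical, and the caller would supply the real lock's try-acquire operation:

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

final class JitteredBackoff {
    /**
     * Retries tryAcquire up to maxRetries times, sleeping a random interval
     * drawn from an exponentially growing, capped window so contending
     * threads do not wake and retry in lockstep.
     */
    static boolean acquire(BooleanSupplier tryAcquire, int maxRetries) throws InterruptedException {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (tryAcquire.getAsBoolean()) {
                return true;
            }
            // Cap the shift so large attempt counts cannot overflow the long.
            long windowMillis = Math.min(1L << Math.min(attempt, 7), 100L);
            Thread.sleep(ThreadLocalRandom.current().nextLong(windowMillis + 1)); // jitter in [0, window]
        }
        return false;
    }
}
```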
CHANGELOG.md (Outdated)

- Fix node bootstrap error when enable stream transport and remote cluster state ([#19948](https://github.com/opensearch-project/OpenSearch/pull/19948))
- Fix deletion failure/error of unused index template; case when an index template matches a data stream but has a lower priority. ([#20102](https://github.com/opensearch-project/OpenSearch/pull/20102))
- Fix toBuilder method in EngineConfig to include mergedSegmentTransferTracker([20105](https://github.com/opensearch-project/OpenSearch/pull/20105))
- Fix indexing regression and bug fixes for grouping criteria([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
Minor formatting inconsistency.
The PR link format is inconsistent with other entries in the changelog. Missing # before the PR number.
Apply this diff to fix the formatting:
```diff
-- Fix indexing regression and bug fixes for grouping criteria([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
+- Fix indexing regression and bug fixes for grouping criteria ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

- Fix indexing regression and bug fixes for grouping criteria ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
🤖 Prompt for AI Agents
CHANGELOG.md around line 94: the PR link text is missing the '#' before the PR number,
which makes it inconsistent with other entries; update the entry by inserting '#'
before the PR number in the link text (i.e. change
([20145](https://github.com/opensearch-project/OpenSearch/pull/20145)) to
([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145)))
and ensure the rest of the entry formatting matches surrounding lines.
Actionable comments posted: 0
♻️ Duplicate comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
475-489: Add exponential backoff to retry loop.

The retry loop performs a busy-wait by calling `tryAcquire()` repeatedly without any delay between attempts. This pattern wastes CPU cycles under contention when the lock is frequently unavailable. As suggested in the previous review, add exponential backoff between retry attempts with `Thread.sleep(Math.min(1L << counter, 100L))` and handle `InterruptedException` by restoring the interrupt flag and propagating `LookupMapLockAcquisitionException`.
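A self-contained sketch of that suggestion; the exception class here is a stand-in for the PR's `LookupMapLockAcquisitionException`, the JDK `ReentrantLock` substitutes for the custom lock type, and the shift cap is an added safeguard not in the review's formula:

```java
import java.util.concurrent.locks.ReentrantLock;

final class LookupLockRetrySketch {
    /** Hypothetical stand-in for the engine's LookupMapLockAcquisitionException. */
    static final class LookupMapLockAcquisitionException extends RuntimeException {
        LookupMapLockAcquisitionException(String message) { super(message); }
        LookupMapLockAcquisitionException(String message, Throwable cause) { super(message, cause); }
    }

    /**
     * Bounded acquisition using the backoff formula from the review,
     * sleep(min(1 << counter, 100)) ms, instead of spinning on tryLock().
     * The shift is capped so large retry counts cannot overflow the long.
     */
    static void acquireWithBackoff(ReentrantLock mapReadLock, int maxRetries) {
        for (int counter = 0; counter < maxRetries; counter++) {
            if (mapReadLock.tryLock()) {
                return;
            }
            try {
                Thread.sleep(Math.min(1L << Math.min(counter, 7), 100L));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the interrupt flag
                throw new LookupMapLockAcquisitionException("interrupted while acquiring lookup map lock", e);
            }
        }
        throw new LookupMapLockAcquisitionException("failed to acquire lookup map lock after " + maxRetries + " retries");
    }
}
```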
🧹 Nitpick comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
483-483: Simplify boolean comparisons.

The expressions `current.isClosed() == true` (line 483) and `current.mapReadLock.isHeldByCurrentThread() == true` (line 498) contain redundant boolean comparisons. Apply this diff to simplify:

```diff
-        if (current != null && current.isClosed() == true) {
+        if (current != null && current.isClosed()) {

-        if (success == false && current != null && current.mapReadLock.isHeldByCurrentThread() == true) {
+        if (!success && current != null && current.mapReadLock.isHeldByCurrentThread()) {
```

Also applies to: 498-498
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- CHANGELOG.md (1 hunks)
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1 hunks)
- server/src/main/java/org/opensearch/index/IndexSettings.java (1 hunks)
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (10 hunks)
- server/src/main/java/org/opensearch/index/mapper/MapperService.java (2 hunks)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java (0 hunks)
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (4 hunks)
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1 hunks)
💤 Files with no reviewable changes (1)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (5)
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- CHANGELOG.md
- server/src/main/java/org/opensearch/index/IndexSettings.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
🔇 Additional comments (6)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-731: Behavior change: retries now only on version conflicts in `onComplete`.

The retry path in `onComplete` now exclusively handles `VersionConflictEngineException` (via `isConflictException`) bounded by `retryOnConflict`, with the previous `LookupMapLockAcquisitionException`-specific retry branch removed. That seems consistent with moving lock-acquisition retry logic down into `CompositeIndexWriter` and keeps this layer from depending on engine-internal exception types. Please double-check that:

- `LookupMapLockAcquisitionException` is fully retried/consumed within `CompositeIndexWriter` (or below), and
- it does not bubble back up to this point expecting a retry here anymore,

so we don't inadvertently change failure behavior for callers that previously relied on the bulk action to transparently retry on transient lookup-map lock contention.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (5)
210-210: Visibility changes support the new retry mechanism.

The removal of `final` modifiers from inner classes and the change of `mapReadLock` to package-private access are necessary to support the refactored retry logic. The package-private access on `mapReadLock` (line 215) enables `LiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteria` to call `tryAcquire()` directly at line 482.

Also applies to: 215-215, 301-301, 406-406, 408-408
466-471: Proper parameter propagation for configurable retry limit.

The addition of the `maxRetryOnLookupMapAcquisitionException` parameter enables configurable retry behavior, and the value is correctly sourced from `engineConfig.getIndexSettings()`.

Also applies to: 675-680
691-693: Simplified iteration improves readability.

The refactoring from stream-based iteration to direct iteration over `criteriaBasedIndexWriterMap.values()` is cleaner and avoids unnecessary intermediate operations.

Also applies to: 702-704
731-750: Proper tragic exception checking across all writers.

The iteration through both current and old writer maps to detect tragic exceptions is thorough and correctly checks if writers are closed before accessing their tragic exception state.
753-774: Correct synchronization and state checks.

The `ramBytesUsed()` method properly acquires write locks before iterating, and the `rollback()` method correctly checks if writers are open before attempting rollback operations.

Also applies to: 794-811
❌ Gradle check result for 4b247a0: null. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Actionable comments posted: 1
🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
44-46: Retry wiring in tests looks correct; consider simplifying the mocking for readability.
- Updating all `computeIndexWriterIfAbsentForCriteria` invocations to pass `MAX_NUMBER_OF_RETRIES` keeps the tests consistent with the new API and the retry semantics; this looks correct.
- `testMaxRetryCountWhenWriteLockDuringIndexing` correctly verifies that `tryAcquire()` is invoked exactly `MAX_NUMBER_OF_RETRIES` times when the lock is never obtained, and the `LookupMapLockAcquisitionException` is thrown as expected.

As a minor test ergonomics tweak, you could stub and verify directly on `writerLock` instead of going through `map.current.mapReadLock` in the `when(...)` and `verify(...)` calls (see the sketch below). That would make the test a bit less coupled to the internal layout of `LiveIndexWriterDeletesMap` and `CriteriaBasedIndexWriterLookup` while preserving the behavior being asserted.

Also applies to: 72-77, 141-146, 197-202, 208-227
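A hedged sketch of stubbing the lock mock directly; the `WriterLock` interface is a hypothetical stand-in for the PR's `CriteriaBasedWriterLock`, and the driving loop stands in for the code under test:

```java
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

public class WriterLockMockSketch {
    /** Hypothetical stand-in for the PR's CriteriaBasedWriterLock. */
    interface WriterLock {
        boolean tryAcquire();
    }

    static final int MAX_NUMBER_OF_RETRIES = 20;

    public static void main(String[] args) {
        WriterLock writerLock = mock(WriterLock.class);
        when(writerLock.tryAcquire()).thenReturn(false); // lock is never obtainable

        // The code under test would drive the bounded retry loop against the mock ...
        for (int i = 0; i < MAX_NUMBER_OF_RETRIES; i++) {
            writerLock.tryAcquire();
        }

        // ... and the test asserts the bounded-retry contract on the mock directly,
        // without reaching through map.current.mapReadLock.
        verify(writerLock, times(MAX_NUMBER_OF_RETRIES)).tryAcquire();
    }
}
```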
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
210-215: Visibility and mutability changes for nested types are acceptable but could use an explicit "for testing" annotation.

Making `CriteriaBasedIndexWriterLookup` and `CriteriaBasedWriterLock` more visible, and relaxing `mapReadLock` and `LiveIndexWriterDeletesMap.current` from `final`, is understandable to support the new tests that need to mock and override these internals. To keep the public surface area tidy and signal intent, consider adding an explicit `@opensearch.internal` (or similar) Javadoc tag or comment on these nested types/fields indicating that they are exposed primarily for testing (see the sketch below). That helps discourage external production code from depending on them and makes future refactors easier.

Also applies to: 301-301, 406-412
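A sketch of what such a marker could look like; the `@opensearch.internal` tag is the one named above, while the class shape and field type here are assumptions for illustration:

```java
/**
 * Lookup handle over the live criteria-based writer map.
 *
 * @opensearch.internal Non-final and package-private members are exposed
 * primarily so engine tests can mock/override them; production code outside
 * this package should not depend on this type.
 */
class CriteriaBasedIndexWriterLookup {
    /** Package-private for tests; prefer accessor methods in production code. */
    Object mapReadLock; // placeholder type, for illustration only
}
```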
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- CHANGELOG.md (1 hunks)
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1 hunks)
- server/src/main/java/org/opensearch/index/IndexSettings.java (1 hunks)
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (10 hunks)
- server/src/main/java/org/opensearch/index/mapper/MapperService.java (2 hunks)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java (0 hunks)
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (4 hunks)
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1 hunks)
💤 Files with no reviewable changes (1)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (4)
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/IndexSettings.java
- CHANGELOG.md
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: detect-breaking-change
🔇 Additional comments (3)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)
87-87: LGTM! The import is necessary for the `Collectors.toSet()` operation used in the `getCompositeFieldTypes()` method below.
694-696: Verify initialization safety, caller expectations, and method naming alignment.

The method now filters to return only `StarTreeMapper.StarTreeFieldType` instances from `compositeMappedFieldTypes`:

- Potential NPE risk: Verify that `getCompositeFieldTypes()` is never called before `internalMerge()` initializes `compositeMappedFieldTypes` at line 552. If called during early initialization phases, `.stream()` could fail on a null reference.
- Semantic narrowing: Confirm whether the method name `getCompositeFieldTypes()` still accurately reflects its behavior. If other composite field type implementations exist or may be added, consider renaming to `getStarTreeFieldTypes()` or updating documentation to clarify the filtering behavior.
- Performance: If `getCompositeFieldTypes()` is called frequently in hot paths, consider caching the filtered result to avoid recreating the set on each invocation.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
691-705: Iteration refactors over current/old writer maps look correct and improve clarity.

- `getFlushingBytes()` and `getPendingNumDocs()` now iterate directly over `liveIndexWriterDeletesMap.current.criteriaBasedIndexWriterMap.values()`, summing per-child metrics before adding the accumulating writer's values. This preserves behavior and is straightforward.
- `getTragicException()` now checks both `current` and `old` child writers for a tragic exception before falling back to the accumulating writer, which ensures group-specific failures are surfaced.
- `ramBytesUsed()` and `rollback()` explicitly iterate over both `current` and `old` writers, and the use of `mapWriteLock.acquire()` around the `ramBytesUsed()` traversals is appropriate for a consistent snapshot.

Overall, these loops are clear and consistent with the data structures being used; no issues from a correctness or concurrency standpoint.

Also applies to: 731-742, 757-772, 796-805
❌ Gradle check result for dbcae67: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 494-501: The unbounded spin in CompositeIndexWriter that
repeatedly calls current.mapReadLock.tryAcquire() can starve threads; change the
loop to a bounded retry with exponential backoff and/or a timeout and respond to
thread interrupt/shutdown: e.g., attempt tryAcquire() in a loop with a small
Thread.sleep/backoff between attempts, track elapsed time and break with a clear
exception or return if a configured timeout is exceeded, and check
Thread.currentThread().isInterrupted() (and any local shutdown flag) to stop
retrying promptly; update associated callers to handle the new timeout/exception
behavior accordingly.
- Around line 1081-1083: The deleteInLucene method currently always increments
childWriterPendingNumDocs (childWriterPendingNumDocs.incrementAndGet()) even
when currentWriter is the parent accumulatingIndexWriter, causing
double-counting with accumulatingIndexWriter.getPendingNumDocs(); modify
deleteInLucene so the increment is only performed when currentWriter represents
a child writer (check currentWriter identity/type against
accumulatingIndexWriter or a child-writer flag) or remove the increment here and
move it to the caller that only updates childWriterPendingNumDocs for child
writers; update any related logic that relies on childWriterPendingNumDocs to
ensure counts remain consistent with getPendingNumDocs().
🧹 Nitpick comments (5)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
153-168: Improve resource management and exception handling.

The test correctly validates the no-op behavior, but has a couple of issues:

- Resource leak: The `XContentBuilder` created at line 164 is never closed, and the object is started but not closed.
- Overly broad exception handling: Catching `Exception` (line 165) instead of specific exceptions reduces test clarity.

♻️ Suggested refactor

```diff
 public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
     ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
     ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
         "context_aware_grouping",
         fieldType,
         new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
     );
     LeafReader leafReader = mock(LeafReader.class);
-    try {
+    try (var builder = XContentFactory.jsonBuilder().startObject()) {
         mapper.canDeriveSource();
-        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-    } catch (Exception e) {
-        fail(e.getMessage());
+        mapper.deriveSource(builder, leafReader, 0);
+        builder.endObject();
     }
 }
```

Note: With this change, the method will naturally propagate `IOException` (already declared in the method signature), making the test clearer.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
606-609: Test-only method exposing internal lock.

While exposing `acquireNewReadLock()` for testing purposes works, consider whether the new test scenario could be validated through public APIs instead. If not, ensure this method is clearly marked as test-only and not used in production code.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
154-204: Unused variable in concurrent test.

The `AtomicBoolean run` on line 171 is declared but never read or used in the test. Consider removing it to improve code clarity.

🧹 Cleanup

```diff
 CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
 CountDownLatch latch = new CountDownLatch(1);
-AtomicBoolean run = new AtomicBoolean(true);
 Thread refresher = new Thread(() -> {
     latch.countDown();
     try {
@@ -191,7 +190,6 @@
 } finally {
     IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-    run.set(false);
     refresher.join();
```
134-136: Unused size parameter in wrapper method.

The `size` parameter is intentionally unused, as `LuceneIndexWriter` delegates directly to Lucene's `IndexWriter`, which tracks pending docs internally. Consider adding a comment or annotation to clarify this is expected for the wrapper pattern.

📝 Documentation improvement

```diff
+    /**
+     * Wrapper function for IndexWriter.addDocuments.
+     *
+     * @param size The number of documents being added (unused in this wrapper, retained for API consistency)
+     */
     @Override
     public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException {
         return indexWriter.addDocuments(docs);
     }
```
144-154: Consistent unused size parameter pattern.

Like `addDocuments`, the `size` parameter in `softUpdateDocuments` is unused in this wrapper. The same documentation improvement suggested for `addDocuments` applies here for consistency.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
✅ Files skipped from review due to trivial changes (1)
- CHANGELOG.md
🚧 Files skipped from review as they are similar to previous changes (9)
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
- IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, macos-15)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (12)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
11-11: LGTM! The new imports are necessary for the test method and are used appropriately.
Also applies to: 13-13, 20-20
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (9)
131-144: Well-documented field for pending document accounting.

The detailed comment clearly explains the purpose and behavior of `childWriterPendingNumDocs`. The note about temporary overshoot during refresh is helpful.
352-356: Good defensive check for closed lookup.

This prevents operations on closed maps by checking `isClosed()` after acquiring the lock and returning null if closed. This is a solid safety mechanism.
549-560: Correct pending document accounting during refresh.

The logic properly accumulates pending docs from old child writers and subtracts them from `childWriterPendingNumDocs` after merging. This prevents double-counting and aligns with the strategy described in the field documentation.
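A toy illustration of the subtract-after-merge accounting pattern described above; the field name mirrors the one in the review, but the surrounding flow is simplified and assumed:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

final class PendingDocsAccountingSketch {
    // Tracks docs pending in child writers only; the parent writer keeps its own count.
    private final AtomicLong childWriterPendingNumDocs = new AtomicLong();

    void onIndexedIntoChild(int size) {
        childWriterPendingNumDocs.addAndGet(size);
    }

    /** On refresh: after old child writers are merged into the parent, release their share. */
    void onRefreshMerged(List<Long> pendingPerOldChildWriter) {
        long merged = pendingPerOldChildWriter.stream().mapToLong(Long::longValue).sum();
        // Subtracting only after the merge means the combined count may briefly
        // overshoot, but it never undershoots.
        childWriterPendingNumDocs.addAndGet(-merged);
    }

    long childPending() {
        return childWriterPendingNumDocs.get();
    }
}
```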
719-742: Robust exception handling for closed writers.

The pattern of catching `AlreadyClosedException` and checking for tragic exceptions is excellent. It distinguishes between normal closure (ignore) and catastrophic failure (rethrow), ensuring critical errors aren't silently swallowed during metric collection.
810-833: Consistent exception handling pattern.

The `ramBytesUsedUtil` method follows the same defensive pattern as `getFlushingBytesUtil`, providing consistent and robust behavior across metric collection methods.
853-880: Comprehensive rollback ensuring no resource leaks.

The rollback implementation properly cleans up both current and old child writers, catching and ignoring `AlreadyClosedException` as expected. The comment correctly notes this prevents file leaks despite appearing redundant.
927-942: Size-aware document accounting in addDocuments.

The implementation correctly increments `childWriterPendingNumDocs` by `size` after successful indexing, providing accurate accounting for multi-document operations.
960-986: Consistent size-aware accounting for updates.

The `softUpdateDocuments` method correctly increments `childWriterPendingNumDocs` by `size` and maintains delete entry tracking, consistent with the add operations pattern.
1030-1067: Correct delete accounting across writer generations.

The implementation properly handles deletes in both current and old child writers, incrementing `childWriterPendingNumDocs` only when delete entries are actually added. The conditional logic prevents overcounting stale operations.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)
34-34: Consistent test updates for new addDocuments signature.

All test methods correctly pass `operation.docs().size()` as the third parameter to `addDocuments`, matching the new size-aware API.

Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263
122-130: Correct parameter ordering for softUpdateDocuments.

The test correctly passes `operation.docs().size()` as the 6th parameter (after `primaryTerm`), matching the updated signature.
❌ Gradle check result for 3d9eacc: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: Remove obsolete lock acquisition retry setting and related dead code from IndexSettings.

The removal of `LookupMapLockAcquisitionException` retry handling from the transport layer is correct: lock acquisition is now handled resiliently in `CompositeIndexWriter` with proper locking primitives. However, the refactoring is incomplete:

- The `INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION` setting is still defined but unused
- The field `maxRetryOnLookupMapAcquisitionException` is never read
- The getter is defined but never called
- The setter listener is registered but the value is never used
- Javadoc at lines 515-520 in `IndexSettings.java` still references the removed exception

Remove the setting definition, field, getter, setter, listener registration, and update the javadoc.
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 1081-1083: The increment of childWriterPendingNumDocs in
deleteInLucene is incorrect because deleteInLucene operates on currentWriter
(passed as accumulatingIndexWriter from deleteDocument) and Lucene's
accumulatingIndexWriter.getPendingNumDocs() is already included in
getPendingNumDocs() (see getPendingNumDocs usage around line 747), so remove the
childWriterPendingNumDocs.incrementAndGet() call from deleteInLucene to avoid
double-counting pending docs; ensure any remaining bookkeeping relies solely on
accumulatingIndexWriter.getPendingNumDocs() and tests for
deleteDocument/deleteInLucene reflect no net double increment.
In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 194-199: The Javadoc for the deriveSource override uses the term
"Context Aware Segment" which is inconsistent with this mapper's field name;
update the comment above the deriveSource( XContentBuilder builder, LeafReader
leafReader, int docId ) method in ContextAwareGroupingFieldMapper to refer to
"Context Aware Grouping" (or `context_aware_grouping`) instead, so the
terminology matches the mapper name and the field being omitted from generation.
In
@server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java:
- Around line 153-168: The test leaves the XContentBuilder open and wraps calls
in an unnecessary try-catch; replace the current manual
XContentFactory.jsonBuilder() usage with a try-with-resources block that
constructs and closes the XContentBuilder, build the JSON object inside it
(startObject()/endObject()) and pass the builder's content to deriveSource, and
remove the surrounding try-catch since the test method already declares throws
IOException so any exception will fail the test; update references to
ContextAwareGroupingFieldMapper.canDeriveSource() and deriveSource(...)
accordingly.
🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
154-204: New concurrency test covers important edge case.

The test exercises the scenario where a delete occurs while refresh is rotating maps, ensuring proper handling of documents in the old child writer. However, the `run` variable (line 171) is set but never read - it appears to be leftover from a previous iteration.

Remove unused variable

```diff
     CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
     CountDownLatch latch = new CountDownLatch(1);
-    AtomicBoolean run = new AtomicBoolean(true);
     Thread refresher = new Thread(() -> {
         latch.countDown();
         try {
             compositeIndexWriter.beforeRefresh();
         } catch (Exception ignored) {}
     });
     refresher.start();
     try {
         latch.await();
         compositeIndexWriter.deleteDocument(
             operation.uid(),
             false,
             newDeleteTombstoneDoc(id),
             1,
             2,
             primaryTerm.get(),
             softDeletesField
         );
     } finally {
         IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-        run.set(false);
         refresher.join();
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (6)
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- CHANGELOG.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
- IOUtils (58-317)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
- IOUtils (58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: detect-breaking-change
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (18)
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (2)
11-12: LGTM! The imports are necessary for the new `deriveSource` method signature.
187-192: Clarify Javadoc terminology to match the mapper's purpose.

The Javadoc refers to "Context Aware Segment field" but the mapper is for context-aware grouping. Update the comment to use consistent terminology that aligns with the class name. The `@Override` annotation is already present and correctly applied.

📝 Suggested Javadoc fix

```diff
 /**
- * Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
+ * Context-aware grouping field is not a part of an ingested document, so omitting it from
  * validation.
  */
```

Likely an incorrect or invalid review comment.
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
11-11: LGTM! The imports are necessary for the new test method `testContextAwareFieldMapperWithDerivedSource`.

Also applies to: 13-13, 20-20
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (6)
131-144: Good addition of pending docs tracking with clear documentation.

The documentation clearly explains the purpose and the potential temporary overshoot during refresh. This is a reasonable trade-off since undershooting could be problematic for memory pressure calculations.
352-357: Potential race condition: check-then-act on `isClosed()`.

After acquiring the lock and checking `lookup.isClosed()`, another thread could theoretically be in the process of closing. However, since the write lock is required for closing (via `mapWriteLock`), and we hold the read lock here, this should be safe. The pattern correctly closes the lock before returning null.
719-742: Consistent exception handling pattern for tragic exceptions.

The logic correctly swallows `AlreadyClosedException` when there's no tragic exception (normal close), but re-throws it when a tragic exception exists. This is the correct behavior for distinguishing between normal lifecycle and error conditions.
810-833: Same exception handling pattern as getFlushingBytesUtil - consistent and correct.
927-941: Size-aware document counting looks correct.

The increment by `size` after adding documents aligns with the number of documents actually added to the child writer.
549-559: The atomic subtraction is intentional; no floor validation needed.

The code comment at lines 141-143 explicitly documents that overshooting `childWriterPendingNumDocs` is acceptable because undershooting "can be problematic." The developers have made a deliberate design choice to allow temporary accounting imprecision rather than risk undershooting. The absence of assertions preventing negative values confirms this is intentional. Adding a floor of 0 would contradict the documented design.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: Good addition of overloaded config method for test flexibility.

Allows tests to provide their own Store instance while maintaining the same default configuration.
509-564: Well-designed test utility for controlled flush behavior.

The `FlushingIndexWriterFactory` enables deterministic flush behavior in tests. A few observations:

- The factory correctly tracks directories for cleanup via `close()`.
- The `useFailingDirectorySupplier` flag allows toggling between normal and failing directories.

One minor note: the `directories` list is not thread-safe (ArrayList), but since test usage is typically single-threaded during setup, this should be acceptable.
34-34: Test calls correctly updated to use new size-aware API.

The `addDocuments(..., operation.docs().size())` pattern correctly passes document count.

Also applies to: 76-76, 114-114
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
183-183: Test calls correctly updated with size parameter.

Consistent with the API changes across the codebase.
Also applies to: 231-231
498-532: Good test coverage for tragic exception on current writer.

The test correctly verifies that `AlreadyClosedException` is thrown when accessing `ramBytesUsed()` after a tragic exception occurs on a current writer.
534-583: Test for tragic exception on old writer has proper synchronization.

The test uses `CountDownLatch` and `ReleasableLock` to properly coordinate between the main thread and the refresh thread, ensuring the tragic exception occurs in the old writer context.
672-757: Comprehensive tragic exception retrieval tests.

Both tests (`testTragicExceptionGetWithTragicExceptionOnCurrent` and `testTragicExceptionGetWithTragicExceptionOnOld`) verify that `getTragicException()` correctly returns a non-null value after a tragic exception occurs, covering both current and old writer scenarios.
840-860: softUpdateDocuments calls correctly include size parameter.

The pattern `operation.docs().size()` is consistently used to pass document count.

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
198-206: Codec placement and documentation are correct, with proper backward compatibility handling.

The inline comment accurately explains why the codec is constructed here rather than in CodecService: the associatedCriteria is per-document and only available during IndexWriter initialization in InternalEngine. The context-aware check correctly gates the use of CriteriaBasedCodec, falling back to the base codec when disabled, ensuring backward compatibility for non-context-aware indexes. The CriteriaBasedCodec properly handles null criteria through null checks in its delegate writers.
```java
        childWriterPendingNumDocs.incrementAndGet();
    }
```
Incorrect placement: incrementing pending docs for accumulating writer operations.
deleteInLucene operates on currentWriter which is passed as accumulatingIndexWriter from deleteDocument. The accumulatingIndexWriter already has its own getPendingNumDocs() tracking in Lucene's IndexWriter. Incrementing childWriterPendingNumDocs here causes double-counting since getPendingNumDocs() (line 747) already includes accumulatingIndexWriter.getPendingNumDocs().
Suggested fix: remove the increment in deleteInLucene

```diff
 private void deleteInLucene(
     Term uid,
     boolean isStaleOperation,
     IndexWriter currentWriter,
     Iterable<? extends IndexableField> doc,
     Field... softDeletesField
 ) throws IOException {
     if (isStaleOperation) {
         currentWriter.addDocument(doc);
     } else {
         currentWriter.softUpdateDocument(uid, doc, softDeletesField);
     }
-
-    childWriterPendingNumDocs.incrementAndGet();
 }
```
}🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 1081 - 1083, The increment of childWriterPendingNumDocs in
deleteInLucene is incorrect because deleteInLucene operates on currentWriter
(passed as accumulatingIndexWriter from deleteDocument) and Lucene's
accumulatingIndexWriter.getPendingNumDocs() is already included in
getPendingNumDocs() (see getPendingNumDocs usage around line 747), so remove the
childWriterPendingNumDocs.incrementAndGet() call from deleteInLucene to avoid
double-counting pending docs; ensure any remaining bookkeeping relies solely on
accumulatingIndexWriter.getPendingNumDocs() and tests for
deleteDocument/deleteInLucene reflect no net double increment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-730: Remove orphaned setting and variable; cleanup is incomplete.

The removal of `LookupMapLockAcquisitionException` retry handling from `TransportShardBulkAction` is incomplete. The following orphaned code still exists in `IndexSettings.java` and should be removed:

- `INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION` setting definition (line 519) and its stale JavaDoc comment (line 515)
- `maxRetryOnLookupMapAcquisitionException` field (line 933) and its initialization (line 1149)
- `setMaxRetryOnLookupMapAcquisitionException()` and `getMaxRetryOnLookupMapAcquisitionException()` methods (lines 2114-2120); the getter is never called
- Registration of the setting in `IndexScopedSettings.java` (line 181)

These should be removed as part of the refactoring to avoid leaving dead code.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
545-560: Move `getPendingNumDocs()` call before `close()` to prevent `AlreadyClosedException`.

In `refreshDocumentsForParentDirectory`, the current code calls `getPendingNumDocs()` after closing the IndexWriter:

```java
childDisposableWriter.getIndexWriter().close();
pendingNumDocsByOldChildWriter.addAndGet(childDisposableWriter.getIndexWriter().getPendingNumDocs());
```

This will throw `AlreadyClosedException` at runtime. Lucene's `IndexWriter.getPendingNumDocs()` calls `ensureOpen(true)`, which fails once the writer is closed. Reorder to capture the pending doc count before closing:

Suggested fix

```diff
 for (CompositeIndexWriter.DisposableIndexWriter childDisposableWriter : markForRefreshIndexWritersMap.values()) {
     final IndexWriter childWriter = childDisposableWriter.getIndexWriter();
     directoryToCombine.add(childWriter.getDirectory());
+    pendingNumDocsByOldChildWriter.addAndGet(childWriter.getPendingNumDocs());
     childWriter.close();
-    pendingNumDocsByOldChildWriter.addAndGet(childWriter.getPendingNumDocs());
 }
```

Optionally, add an assertion to catch underflow:

```diff
-childWriterPendingNumDocs.addAndGet(-pendingNumDocsByOldChildWriter.get());
+final long newValue = childWriterPendingNumDocs.addAndGet(-pendingNumDocsByOldChildWriter.get());
+assert newValue >= 0 : "childWriterPendingNumDocs underflow: " + newValue;
```
1030-1066: Double-counting parent deletes in `deleteInLucene` inflates child pending-docs and can cause early maxDocs failures.

In `deleteDocument` you optionally add partial deletes to current/old child writers and increment `childWriterPendingNumDocs` (lines 1051, 1061). Then you always delegate to `deleteInLucene(uid, isStaleOperation, accumulatingIndexWriter, doc, softDeletesField)` with the parent writer. Inside `deleteInLucene`, you unconditionally increment `childWriterPendingNumDocs`:

```java
if (isStaleOperation) {
    currentWriter.addDocument(doc);
} else {
    currentWriter.softUpdateDocument(uid, doc, softDeletesField);
}
childWriterPendingNumDocs.incrementAndGet();
```

But `currentWriter` here is always the parent `accumulatingIndexWriter` when called from `deleteDocument`. This means:

- Each parent delete increments Lucene's own pending docs via the parent writer, and
- Increments `childWriterPendingNumDocs`, which is intended to track only child-writer contributions (as confirmed by the comment "only increment this when addDeleteEntry for child writers are called")

As a result, `CompositeIndexWriter.getPendingNumDocs()` (which returns `childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs()`) systematically overcounts deletes. This inflated count is used by `InternalEngine.tryAcquireInFlightDocs` to enforce the maxDocs guard, causing the shard to reject operations prematurely even though Lucene has not reached `IndexWriter.MAX_DOCS`. Fix by incrementing `childWriterPendingNumDocs` only for child writers, not the parent:

Proposed fix

```diff
 private void deleteInLucene(
     Term uid,
     boolean isStaleOperation,
     IndexWriter currentWriter,
     Iterable<? extends IndexableField> doc,
     Field... softDeletesField
 ) throws IOException {
     if (isStaleOperation) {
         currentWriter.addDocument(doc);
     } else {
         currentWriter.softUpdateDocument(uid, doc, softDeletesField);
     }
-    childWriterPendingNumDocs.incrementAndGet();
+    // Only child writers are tracked via childWriterPendingNumDocs; the parent writer
+    // is already accounted for by IndexWriter#getPendingNumDocs().
+    if (currentWriter != accumulatingIndexWriter) {
+        childWriterPendingNumDocs.incrementAndGet();
+    }
 }
```
🤖 Fix all issues with AI agents
In @CHANGELOG.md:
- Line 33: Update the PR link text to include the missing "#" for consistency: change the occurrence "([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" in the CHANGELOG entry to "([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" so it matches other entries like "[#20055]" and "[#20284]".
🧹 Nitpick comments (6)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)
694-696: Filtering logic is correct and caller-compatible with the narrowed return type.

All callers in the codebase expect CompositeDataCubeFieldType instances and safely handle the filtered result. The method correctly excludes ContextAwareGroupingFieldType, which is not used by any caller of getCompositeFieldTypes() or isCompositeIndexPresent().

Optional: Consider caching the filtered result for performance.

The filtering operation executes on every call. If this method is invoked frequently, consider caching the filtered set alongside compositeMappedFieldTypes to avoid repeated stream operations.

♻️ Potential optimization to cache filtered results

Add a cached field for the filtered set:

```diff
 private volatile Set<CompositeMappedFieldType> compositeMappedFieldTypes;
+private volatile Set<CompositeMappedFieldType> compositeDataCubeFieldTypes;
 private volatile Set<String> fieldsPartOfCompositeMappings;
```

Update the initialization in internalMerge (around line 552):

```diff
 // initialize composite fields post merge
 this.compositeMappedFieldTypes = getCompositeFieldTypesFromMapper();
+this.compositeDataCubeFieldTypes = compositeMappedFieldTypes.stream()
+    .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
+    .collect(Collectors.toSet());
 buildCompositeFieldLookup();
```

Simplify the method:

```diff
 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
-    return compositeMappedFieldTypes.stream()
-        .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
-        .collect(Collectors.toSet());
+    return compositeDataCubeFieldTypes;
 }
```

server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
153-168: Consider simplifying the exception handling.

The test correctly verifies that the derived-source methods can be invoked without throwing. However, the explicit try-catch with fail(e.getMessage()) is unnecessary: simply letting any exception propagate will fail the test automatically with better diagnostics.

♻️ Simplify exception handling

```diff
-    public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
+    public void testContextAwareFieldMapperWithDerivedSource() throws Exception {
         ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
         ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
             "context_aware_grouping",
             fieldType,
             new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
         );
         LeafReader leafReader = mock(LeafReader.class);
-
-        try {
-            mapper.canDeriveSource();
-            mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-        } catch (Exception e) {
-            fail(e.getMessage());
-        }
+
+        mapper.canDeriveSource();
+        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
     }
```

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)
55-67: Clarify and enforce size contract on multi-doc APIs

size is critical for pending-doc accounting; if callers pass an incorrect or non-positive value, getPendingNumDocs() and tryAcquireInFlightDocs() will misbehave. Consider documenting that size must equal the number of documents in docs and be > 0, and add assertions in implementations (e.g., CompositeIndexWriter, LuceneIndexWriter) to enforce this.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
154-204: Concurrency test for deletes on old child writer looks sound (minor nit)

The new testDeleteWithDocumentInOldChildWriter exercises delete behavior while a read lock is held on the current map and beforeRefresh() runs in another thread, which is valuable coverage for the new locking/rotation logic.

Small nit: AtomicBoolean run is never read in the refresher thread; you can drop it to simplify the test.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
853-880: Child-writer rollback added, but consider resetting child pending-doc counter

The enhanced rollback() now:

- Rolls back all open child writers in the current and old maps (ignoring AlreadyClosedException), then
- Rolls back the parent writer and marks the composite as closed.

This ensures child directories are properly closed and reduces leak risk. One minor missing piece is that childWriterPendingNumDocs is not reset here; since ensureOpen() guards most public methods, this is likely harmless, but for clarity and future safety you could reset it (e.g., childWriterPendingNumDocs.set(0)) when the composite transitions to closed, as in the sketch below.
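A minimal sketch of that reset, assuming rollback() already iterates both writer maps; rollbackChildWriters and the map names are illustrative, not the PR's actual identifiers:

```java
@Override
public void rollback() throws IOException {
    try {
        rollbackChildWriters(currentWriters); // illustrative helper over the current map
        rollbackChildWriters(oldWriters);     // illustrative helper over the old map
        accumulatingIndexWriter.rollback();   // parent writer last
    } finally {
        // Once the composite is closed, no stale child count should survive:
        // getPendingNumDocs() must never report docs from rolled-back writers.
        childWriterPendingNumDocs.set(0);
    }
}
```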
606-610: acquireNewReadLock correctly exposed for tests but should stay non-public

acquireNewReadLock() returns a CriteriaBasedIndexWriterLookup with the read lock already held, for use in tests like testDeleteWithDocumentInOldChildWriter. The semantics are clear in tests, but this method should remain package-private (as it is now) to avoid accidental misuse in production code: holding a read lock without a matching close() can block refresh rotations. See the usage sketch below.
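For reference, a sketch of the balanced usage shape in tests, assuming the handle returned by getMapReadLock() is Closeable (the IOUtils.closeWhileHandlingException(lock.getMapReadLock()) calls in the new tests suggest it is):

```java
// Acquire the lookup with its read lock already held, then guarantee release
// even if the body throws, so a pending refresh rotation is never blocked.
CompositeIndexWriter.CriteriaBasedIndexWriterLookup lookup = compositeIndexWriter.acquireNewReadLock();
try (java.io.Closeable readLock = lookup.getMapReadLock()) {
    // ... exercise delete/update paths while the old map is pinned ...
}
```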
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
🚧 Files skipped from review as they are similar to previous changes (1)
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: Analyze (java)
- GitHub Check: detect-breaking-change
- GitHub Check: Mend Security Check
🔇 Additional comments (19)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)
87-87: LGTM! The import is necessary for the Collectors.toSet() call in the modified getCompositeFieldTypes() method.

server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
198-206: LGTM! Clear codec selection logic with helpful documentation. The inline comment effectively explains the architectural constraint that necessitates codec creation at this point. The conditional logic correctly selects between context-aware and standard codec configurations based on the index settings.
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)
187-199: LGTM! Well-documented no-op overrides. The Javadoc clearly explains the rationale for the no-op implementations: the context-aware grouping field is metadata that doesn't participate in document ingestion or derived-source workflows.
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: LGTM! Convenient config overload. This method simplifies test setup by providing sensible defaults for the full config(...) method, reducing boilerplate in test cases.
509-564: LGTM! Well-designed test utility for flush and tragic-exception scenarios. The FlushingIndexWriterFactory properly extends NativeLuceneIndexWriterFactory and implements Closeable, ensuring all tracked directories are cleaned up via IOUtils.close(). The wrapped IndexWriter correctly flushes after each write operation to simulate specific test conditions.

server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)

134-136: Size parameter is intentionally unused in LuceneIndexWriter; this is by design. Both addDocuments (line 134) and softUpdateDocuments (line 150) in LuceneIndexWriter declare the int size parameter but don't use it. This is intentional: LuceneIndexWriter is a simple wrapper that delegates directly to Lucene's IndexWriter, which doesn't require pending document count tracking. In contrast, CompositeIndexWriter uses the size parameter to update childWriterPendingNumDocs because it manages multiple child IndexWriter instances and needs to track pending documents across them for coordination and synchronization. The parameter is part of the DocumentIndexWriter interface contract, so all implementations must accept it, but usage varies by architectural need.

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)
1240-1247: Multi-document append path correctly migrated to size-aware API

Using indexWriter.addDocuments(docs, uid, docs.size()) only when docs.size() > 1 keeps the single-doc path intact and aligns reservedDocs and numDocAppends.inc(docs.size()) with the actual number of docs written. A condensed sketch follows.
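A condensed sketch of that dispatch; the exact single-doc addDocument overload is an assumption here, and only the multi-doc branch carries the new size argument:

```java
if (docs.size() > 1) {
    // Multi-doc case: pass the count so size-aware writers can track pending docs.
    indexWriter.addDocuments(docs, uid, docs.size());
} else {
    // Single-doc case keeps the size-free API (overload assumed for illustration).
    indexWriter.addDocument(docs.get(0), uid);
}
numDocAppends.inc(docs.size());
```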
1249-1258: Stale-doc append path correctly passes doc count

addStaleDocs now calls addDocuments(docs, uid, docs.size()) for multi-doc stale ops, keeping accounting (numDocAppends.inc(docs.size())) consistent with writes.
1369-1390: Update path correctly passes docs.size() into softUpdateDocuments

The multi-doc update path now uses softUpdateDocuments(uid, docs, version, seqNo, primaryTerm, docs.size(), softDeletesField), which is consistent with how reservedDocs and numDocUpdates.inc(docs.size()) are computed, and provides the correct size to DocumentIndexWriter for pending-doc tracking.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
34-35: Tests correctly updated for size-aware write APIs

All test invocations of addDocuments and softUpdateDocuments now pass operation.docs().size() as the size argument, in line with the new DocumentIndexWriter contract and how production code calls these APIs.

Also applies to: 76-77, 114-130, 219-235, 263-276
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)
176-185: Append tests aligned with size-aware CompositeIndexWriter API

All updated test paths now pass operation.docs().size() into addDocuments (and softUpdateDocuments where applicable), which matches the production usage and ensures coverage of the new size parameter.

Also applies to: 224-244, 333-335, 373-375, 437-448, 479-487, 771-773

498-757: New tragic-exception tests provide good coverage (with acceptable test-only OOME usage)

The new tests around ramBytesUsed, getFlushingBytes, and getTragicException under simulated OutOfMemoryError on current and old child writers exercise the new behavior in CompositeIndexWriter (tragic-exception detection and propagation via AlreadyClosedException). Catching OutOfMemoryError is fine here since it's fully constrained to a synthetic Directory in test scope.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (7)
486-512: New computeIndexWriterIfAbsentForCriteria loop correctly avoids closed lookups

The updated LiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteria uses mapReadLock.tryAcquire() in a loop and skips lookups that report isClosed(), which prevents routing new writes into maps that have been rotated to old and closed during refresh. The finally block also only closes the read lock on failure, allowing callers (via CompositeIndexWriter.computeIndexWriterIfAbsentForCriteria) to manage the balanced unlock via their own try-with on getMapReadLock(). This looks consistent with the lock protocol in the rest of the class.
744-748: Pending-docs aggregation correctly incorporates child and parent writers (subject to child-only increments)

getPendingNumDocs() now returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs(), which is exactly what InternalEngine.tryAcquireInFlightDocs needs: an upper bound on documents accounted in Lucene plus those not yet visible to IndexWriter in child writers. As long as increments to childWriterPendingNumDocs are limited to child-writer operations (see separate comment on deleteInLucene), this aggregation is sound.
926-957: addDocuments/addDocument: size-based increments look correct

In both addDocuments and addDocument:

- You determine criteria, pick the appropriate child DisposableIndexWriter, and protect it with the map read lock plus keyed UID lock.
- After writing (addDocuments/addDocument), you increment childWriterPendingNumDocs by size or 1 and return the sequence number.

This matches the documented semantics for childWriterPendingNumDocs and ensures child contributions to getPendingNumDocs() track the number of in-flight doc writes.
960-985: softUpdateDocuments increments child count correctly and records delete entry

softUpdateDocuments now consumes the new int size parameter and:

- Writes via IndexWriter.softUpdateDocuments(uid, docs, softDeletesField).
- Increments childWriterPendingNumDocs by size.
- Records the delete entry in the lookup for later parent consolidation.

This is consistent with the way addDocuments uses size and with how deletePreviousVersionsForUpdatedDocuments later applies these delete entries to the parent writer.
704-742: getFlushingBytesUtil correctly implements tragic-exception semantics

The getFlushingBytes() implementation delegates to getFlushingBytesUtil(...), which:

- Sums getFlushingBytes() across current and old child writers.
- Ignores AlreadyClosedException for non-tragic closures (normal close/rotation).
- Rethrows on AlreadyClosedException if the child writer has a non-null getTragicException(), surfacing unrecoverable errors.
- Adds the parent accumulatingIndexWriter.getFlushingBytes() at the end.

This implementation follows Lucene's recommended patterns: AlreadyClosedException indicates the writer is closed, and getTragicException() is the correct way to distinguish between a normal closure and a fatal, unrecoverable error (e.g., disk full during flush). The code appropriately ignores transient closures while propagating tragic failures. A reduced sketch of the pattern follows.
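Reduced to its core, the pattern looks like this (a sketch; the real method also walks the old map before adding the parent's bytes):

```java
long total = 0;
for (DisposableIndexWriter child : currentChildWriters()) { // illustrative accessor
    try {
        total += child.getIndexWriter().getFlushingBytes();
    } catch (AlreadyClosedException ace) {
        // A writer closed by normal rotation is simply skipped; a writer closed
        // by a tragic event must surface the failure to the engine.
        if (child.getIndexWriter().getTragicException() != null) {
            throw ace;
        }
    }
}
total += accumulatingIndexWriter.getFlushingBytes();
```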
770-784: Tragic-exception detection now covers both child and parent writers

getTragicException() scans all current and old child DisposableIndexWriters and returns the first non-null tragic exception before falling back to the parent accumulatingIndexWriter. This ensures InternalEngine.failOnTragicEvent can correctly detect when a group-specific child writer encounters a fatal error.
795-833: ramBytesUsedUtil contains unnecessary exception handling for ramBytesUsed()

The code catches AlreadyClosedException from disposableIndexWriter.getIndexWriter().ramBytesUsed() calls, but Lucene's IndexWriter.ramBytesUsed() is a non-failing diagnostic method documented to never throw AlreadyClosedException. Only methods using ensureOpen() throw this exception. The try-catch block around ramBytesUsed() calls is therefore unnecessary and suggests a misunderstanding of the IndexWriter API contract. Remove these catch blocks or replace them with appropriate handling if other failure modes are genuinely expected.

Likely an incorrect or invalid review comment.
❌ Gradle check result for 887bdb6: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
1069-1083: Fix double-counting bug in pending docs accounting.

The deleteInLucene method increments childWriterPendingNumDocs (line 1082) after operating on currentWriter, which is the accumulating parent writer (passed as accumulatingIndexWriter at line 1065). However, childWriterPendingNumDocs is documented (lines 131-143) as tracking child-level IndexWriter pending docs only.

Since getPendingNumDocs() (line 747) returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs(), documents added to the accumulating writer are already counted via accumulatingIndexWriter.getPendingNumDocs(). Incrementing childWriterPendingNumDocs for parent writer operations causes double-counting, leading to inflated pending doc counts that may trigger unnecessary flushes.

🐛 Proposed fix

Remove the increment since the accumulating writer's pending docs are already tracked separately:

```diff
 private void deleteInLucene(
     Term uid,
     boolean isStaleOperation,
     IndexWriter currentWriter,
     Iterable<? extends IndexableField> doc,
     Field... softDeletesField
 ) throws IOException {
     if (isStaleOperation) {
         currentWriter.addDocument(doc);
     } else {
         currentWriter.softUpdateDocument(uid, doc, softDeletesField);
     }
-
-    childWriterPendingNumDocs.incrementAndGet();
 }
```
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 494-501: The loop in CompositeIndexWriter that repeatedly calls this.current.mapReadLock.tryAcquire() on variable current can spin indefinitely; add a safeguard by implementing a max retry count and/or timeout inside the while loop that checks current == null || current.isClosed(), incrementing a retry counter or tracking elapsed time each iteration, and if exceeded, stop retrying and throw a clear exception (or return a failure) so callers can handle shutdown; ensure you reference the same current and mapReadLock.tryAcquire() usage and preserve the existing semantics when acquisition succeeds. A bounded-retry sketch follows this list.
In @server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 194-199: Update the Javadoc on the deriveSource method to use the correct terminology: replace "Context Aware Segment field" with "context aware grouping field" so it matches the class/field type (ContextAwareGroupingFieldMapper) and the Javadoc at line 175; leave the no-op implementation of deriveSource(XContentBuilder builder, LeafReader leafReader, int docId) unchanged.
- Around line 187-192: The Javadoc uses inconsistent terminology ("Context Aware Segment field") while the actual field type is context_aware_grouping; update the Javadoc on the canDeriveSource() override in ContextAwareGroupingFieldMapper to say "Context aware grouping field" (or similar consistent phrasing) so it matches the field type name context_aware_grouping and the class ContextAwareGroupingFieldMapper.
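One possible shape for the bounded-retry safeguard suggested above, built around the loop from computeIndexWriterIfAbsentForCriteria; the retry limit, spin-wait, and exception choice are illustrative, not prescribed:

```java
int attempts = 0;
CriteriaBasedIndexWriterLookup current = this.current.mapReadLock.tryAcquire();
while (current == null || current.isClosed()) {
    if (++attempts > MAX_LOOKUP_ACQUIRE_RETRIES) { // illustrative constant
        throw new org.apache.lucene.store.AlreadyClosedException(
            "could not acquire a live lookup map; the writer may be closing"
        );
    }
    Thread.onSpinWait(); // ease CPU pressure while refresh rotates the map
    current = this.current.mapReadLock.tryAcquire();
}
```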
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (6)
- CHANGELOG.md
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (27)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: Retry semantics are sound; verify broader impact of LookupMapLockAcquisitionException removal

The retry logic at lines 724-730 is correct: resetForExecutionForRetry() properly increments the retry counter (line 236 of BulkPrimaryExecutionContext.java) and resets state to INITIAL (line 233), so there is no risk of infinite retry loops or violating the assert context.isInitial() invariant.

However, the removal of LookupMapLockAcquisitionException handling means transient engine lock acquisition failures will now fail fast rather than triggering a retry. Confirm this aligns with the intended behavior and that no other mechanism was expected to handle such transient issues for the indexing regression fix.
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)
11-12: LGTM: Required imports for new method signatures. The imports are necessary for the deriveSource() method parameters.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (2)

55-55: LGTM: Clean API extension for size-aware document additions. The addition of the size parameter allows callers to communicate the document count, enabling better resource tracking and accounting in implementations.

59-67: LGTM: Consistent size-aware API for soft updates. The size parameter is correctly positioned before the varargs softDeletesField parameter, following Java conventions. This maintains consistency with the addDocuments signature change.

server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (2)

239-245: LGTM: Correct size parameter for multi-document indexing. The docs.size() parameter is appropriately passed to addDocuments in the multi-document path, while the single-document path correctly continues to use addDocument without the size parameter.

247-260: LGTM: Correct size parameter for soft updates. The docs.size() parameter is correctly passed to softUpdateDocuments in the multi-document path, with proper parameter ordering before softDeletesField. The single-document path appropriately uses softUpdateDocument without the size parameter.

server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)
1240-1247: LGTM: Correct size-aware document additions and counter accounting. The implementation correctly:

- Passes docs.size() to addDocuments for multi-document operations
- Increments numDocAppends by docs.size() to accurately track all documents added (both single and multi-document cases)

1249-1258: LGTM: Correct size parameter for stale document additions. The docs.size() parameter is correctly passed to addDocuments for stale operations. Note that stale operations correctly do not increment numDocAppends since they represent out-of-order operations that shouldn't be counted as new appends.

1369-1390: LGTM: Correct size-aware soft updates and counter accounting. The implementation correctly:

- Passes docs.size() to softUpdateDocuments with proper parameter ordering
- Increments numDocUpdates by docs.size() for accurate tracking
- Maintains the append-only index constraint check
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)
21-289: LGTM: Test methods correctly updated for new API signatures. All test methods have been properly updated to pass the docs.size() parameter to addDocuments and softUpdateDocuments calls. The parameter ordering in softUpdateDocuments (size before softDeletesField) is correct throughout.
154-204: LGTM: Well-designed concurrency test for delete during refresh. This test exercises a specific race condition scenario where a document is deleted while a refresh is in progress and the delete operation obtains a lock on the old child writer. The test correctly:

- Uses acquireNewReadLock() to simulate holding a lock on the old writer
- Synchronizes with CountDownLatch to coordinate the refresh thread
- Properly joins threads and performs cleanup with IOUtils.closeWhileHandlingException
- Verifies the final state (document count is 0)

The test provides good coverage for the concurrent delete scenario during writer rotation.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (2)
48-890: LGTM: Existing test methods correctly updated for new API signatures. All existing test methods have been systematically updated to pass the docs.size() parameter to addDocuments and softUpdateDocuments calls. The updates are consistent throughout and maintain the correct parameter ordering.
498-757: LGTM: Comprehensive tragic exception test coverage. The new test methods provide thorough coverage of tragic exception scenarios for both current and old writers:

- RAM bytes tests: Verify ramBytesUsed() throws AlreadyClosedException after tragic exception
- Flushing bytes tests: Verify getFlushingBytes() throws AlreadyClosedException after tragic exception
- Tragic exception getter tests: Verify getTragicException() returns non-null after tragic exception

All tests correctly:

- Use FilterDirectory with simulated OutOfMemoryError to trigger tragic exceptions
- Handle both current and old writer scenarios (the latter with proper thread synchronization)
- Clean up resources with IOUtils.closeWhileHandlingException
- Use appropriate assertions for the expected behavior

The tests ensure robust handling of catastrophic failures in the indexing path.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (14)
131-144: Well-documented pending docs tracking. The documentation clearly explains the tracking strategy and acknowledges the temporary overshooting during refresh. The approach of tracking child writer pending docs separately from the accumulating writer is sound.

352-355: Good defensive check for closed lookups. This prevents race conditions where a lookup is closed between lock acquisition and usage by releasing the lock and returning null when the lookup is already closed.

606-609: LGTM! Simple utility method for unit tests to acquire a read lock on the current map.

707-742: Robust exception handling for flushing bytes calculation. The pattern of catching AlreadyClosedException and only rethrowing when a tragic exception exists is appropriate. This allows gracefully closed writers to be skipped while ensuring serious errors (tragic exceptions) are propagated.

747-747: LGTM! Correctly combines pending docs from child writers and the accumulating parent writer.

770-784: LGTM! Comprehensive tragic exception checking across current, old, and accumulating writers ensures any critical failure is detected and reported.

796-833: LGTM! Consistent exception handling pattern with getFlushingBytesUtil. The documentation clearly explains the calculation and double-counting avoidance strategy.

853-880: LGTM! Comprehensive rollback logic ensures all child writers are closed to prevent file leaks, with appropriate handling for already-closed writers.

927-942: LGTM! The size-aware API correctly tracks pending document counts by incrementing childWriterPendingNumDocs by the actual number of documents added.

945-958: LGTM! Correctly increments childWriterPendingNumDocs for single document additions.

961-986: LGTM! Size-aware soft update correctly tracks pending documents for bulk updates.

989-1013: LGTM! Correctly increments childWriterPendingNumDocs for single document soft updates.

1030-1067: LGTM! Correctly increments childWriterPendingNumDocs when delete operations are applied to child writers. The comments clearly explain the increments only apply to child writer operations.

545-563: LGTM! The pending docs accounting during refresh correctly accumulates pending docs from old child writers and subtracts them from childWriterPendingNumDocs after merging into the parent. This prevents double-counting since merged documents become part of accumulatingIndexWriter.getPendingNumDocs().
Code under review:

```java
/**
 * Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
 * generation.
 */
@Override
public void deriveSource(XContentBuilder builder, LeafReader leafReader, int docId) throws IOException {}
```
Fix terminology inconsistency; no-op implementation is appropriate.
The no-op implementation is consistent with the field's non-ingestible nature (line 175). However, the Javadoc should use "context aware grouping field" instead of "Context Aware Segment field" to match the actual field type.
📝 Proposed Javadoc correction
```diff
 /**
- * Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
+ * Context aware grouping field is not a part of an ingested document, so omitting it from source
  * generation.
  */
 @Override
 public void deriveSource(XContentBuilder builder, LeafReader leafReader, int docId) throws IOException {}
```
In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
around lines 194 - 199, Update the Javadoc on the deriveSource method to use the
correct terminology: replace "Context Aware Segment field" with "context aware
grouping field" so it matches the class/field type
(ContextAwareGroupingFieldMapper) and the Javadoc at line 175; leave the no-op
implementation of deriveSource( XContentBuilder builder, LeafReader leafReader,
int docId ) unchanged.
|
❌ Gradle check result for b08ba39: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: Remove orphaned LookupMapLockAcquisitionException settings and documentation.

The removal of the LookupMapLockAcquisitionException exception and its retry logic is incomplete. The following artifacts remain unused:

- Setting INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION in IndexSettings.java (line 519)
- Field maxRetryOnLookupMapAcquisitionException (line 933) and its getter/setter methods (lines 2114-2119)
- Stale JavaDoc comment referencing the removed exception (line 515)
- Setting registration in IndexScopedSettings.java (line 181)

The getter is never called, and TransportShardBulkAction.java contains no references to this setting. Remove these orphaned artifacts or document why they should be retained for backward compatibility.

server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
1069-1083: Potential bug: Incrementing childWriterPendingNumDocs for parent writer operations.

The deleteInLucene() method operates on the accumulatingIndexWriter (parent), not a child writer. However, it increments childWriterPendingNumDocs at line 1082. This appears inconsistent with the field's documented purpose of tracking "pendingNumDocs for child level IndexWriters."

Should this instead rely on the parent writer's own getPendingNumDocs() tracking, or is this intentional to account for tombstone entries?

```bash
#!/bin/bash
# Verify all callers of deleteInLucene to understand the intent
ast-grep --pattern 'deleteInLucene($_, $_, $_, $_, $_)'
```
🧹 Nitpick comments (3)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
494-501: Potential busy-wait loop without backoff.

The while loop uses tryAcquire(), which returns null if the lock cannot be acquired or the map is closed. This could spin indefinitely if refresh keeps rotating maps. Consider adding a Thread.yield() or brief pause to reduce CPU consumption during contention.

♻️ Optional improvement to reduce CPU spinning

```diff
 while (current == null || current.isClosed()) {
     // This function acquires a first read lock on a map which does not have any write lock present...
     current = this.current.mapReadLock.tryAcquire();
+    if (current == null) {
+        Thread.yield();
+    }
 }
```

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
154-204: New test for delete with document in old child writer.

The test exercises a concurrent scenario where a delete occurs while refresh is transitioning the map. However:

- Line 171: AtomicBoolean run is set to false at line 193 but never read, making it dead code.
- Lines 174-176: The try block catches all exceptions silently, which could hide test failures in the refresher thread.

♻️ Remove unused variable and improve exception visibility

```diff
 CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
 CountDownLatch latch = new CountDownLatch(1);
-AtomicBoolean run = new AtomicBoolean(true);
+AtomicReference<Exception> refreshException = new AtomicReference<>();
 Thread refresher = new Thread(() -> {
     latch.countDown();
     try {
         compositeIndexWriter.beforeRefresh();
-    } catch (Exception ignored) {}
+    } catch (Exception e) {
+        refreshException.set(e);
+    }
 });
 refresher.start();
 try {
     latch.await();
     // ... delete logic
 } finally {
     IOUtils.closeWhileHandlingException(lock.getMapReadLock());
-    run.set(false);
     refresher.join();
+    if (refreshException.get() != null) {
+        throw refreshException.get();
+    }
     // ... remaining assertions
 }
```
498-757: Consider extracting the failing directory supplier to reduce duplication.

The dirSupplier lambda for creating a FilterDirectory that throws OutOfMemoryError is repeated verbatim in all six tragic exception tests. Consider extracting it to a private helper method.

♻️ Proposed refactor

```java
private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}
```

Then in each test:

```diff
-Supplier<Directory> dirSupplier = () -> new FilterDirectory(newDirectory()) {
-    ...
-};
+Supplier<Directory> dirSupplier = createFailingDirectorySupplier();
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (4)
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
🧰 Additional context used
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (33)
CHANGELOG.md (1)
33-33: LGTM! The changelog entry correctly documents the fix and follows the established format.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (5)
131-144: Well-documented childWriterPendingNumDocs field. The documentation clearly explains the purpose and the acceptable overshoot behavior during refresh. This is a good practice for tracking state in concurrent scenarios.

348-361: Improved safety: tryAcquire() now checks for closed lookup. The addition of the closed check after acquiring the lock prevents returning a closed lookup. This is correct since the read lock is held when checking isClosed(), ensuring the state is consistent during this check. A sketch of the shape follows.
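A sketch of the approved shape (lock and flag names are illustrative):

```java
CriteriaBasedIndexWriterLookup tryAcquire() {
    if (mapReadLock.tryLock() == false) {
        return null; // write-locked: the map is mid-rotation
    }
    if (isClosed()) {
        mapReadLock.unlock(); // release before returning; never hand out a closed lookup
        return null;
    }
    return this; // caller is responsible for closing the read lock
}
```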
545-563: Correct pending doc accounting during refresh. The approach properly tracks pending docs from old child writers and subtracts them from childWriterPendingNumDocs after merging into the parent writer, preventing double-counting. Using AtomicLong is fine here even though this runs single-threaded during refresh.

719-742: Robust handling of closed writers in getFlushingBytesUtil(). The logic correctly distinguishes between normal closure (ignored) and tragic exceptions (rethrown). This prevents false errors during refresh when writers are intentionally closed.

854-880: Defensive rollback handling with AlreadyClosedException. The added try-catch blocks around child writer rollback prevent failures during cleanup when writers are already closed. The isOpen() check before rollback is a good guard, though the exception handling provides additional safety.

server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)

55-67: API signature changes add size parameter. The interface changes are clean and correctly position the size parameter before the varargs Field... softDeletesField. Implementations in CompositeIndexWriter and LuceneIndexWriter have been updated accordingly.

server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: Convenience overload for config(Store store). This is a clean addition that reduces boilerplate in tests by delegating to the full config() method with default parameters.

509-564: Well-structured FlushingIndexWriterFactory for test scenarios. The factory correctly:

- Wraps IndexWriter to flush after each write operation
- Tracks directories for proper cleanup via Closeable
- Supports conditional use of a failing directory supplier via AtomicBoolean

This is useful for testing flush behavior and failure scenarios.
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)
134-154: Interface compliance: size parameter added but unused. The size parameter is correctly added to match the DocumentIndexWriter interface. Since LuceneIndexWriter delegates directly to Lucene's IndexWriter (which handles its own pending doc tracking), the size parameter is intentionally unused here. This is appropriate for the wrapper pattern.

server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)

34-34: Test updates correctly pass document size. All addDocuments() calls are properly updated to include operation.docs().size() as the size parameter.

Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263

122-130: softUpdateDocuments() calls updated with size parameter. The calls correctly position the size parameter (operation.docs().size()) before the softDeletesField vararg, matching the updated interface signature.

Also applies to: 227-235, 268-276
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)
694-696: The filtering ofgetCompositeFieldTypes()to onlyCompositeDataCubeFieldTypeinstances is intentional and correct. All external callers explicitly expect only this type: they either cast toCompositeDataCubeFieldType, checkinstanceof StarTreeFieldType, or callisEmpty(). The internalcompositeMappedFieldTypesfield still stores allCompositeMappedFieldTypeimplementations (includingContextAwareGroupingFieldType) for field lookup purposes viabuildCompositeFieldLookup(), while the public API appropriately returns only the DataCube types.server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)
1240-1247: LGTM - Size parameter correctly added for multi-document indexing path. The docs.size() parameter is correctly passed to align with the new size-aware addDocuments API. The append counter is already incremented by docs.size(), which is consistent.

1249-1258: LGTM - Consistent with the multi-document stale docs path. The size parameter is correctly added for stale document indexing, maintaining consistency with the addDocs method above.

1369-1390: LGTM - Size parameter correctly added for soft update path. The docs.size() parameter is correctly positioned before softDeletesField in the softUpdateDocuments call, and the update counter is already correctly incremented by docs.size().
40-40: LGTM - Import added for new test utilities.
183-183: LGTM - API call updated with size parameter.
231-231: LGTM - API call updated with size parameter.
309-312: LGTM - Exception test correctly updated with new API.
334-334: LGTM - API call updated with size parameter.
374-374: LGTM - API call updated with size parameter.
437-437: LGTM - API calls updated with size parameter.Also applies to: 447-447
437-437: LGTM - API calls updated with size parameter.

Also applies to: 447-447

498-532: Good test coverage for tragic exception handling on current map. The test correctly validates that ramBytesUsed() throws AlreadyClosedException when a tragic exception has occurred.

534-583: Test correctly validates behavior during map rotation with tragic exception. The lock + refresher pattern effectively simulates the state where the tragic exception is in the old map during rotation.
585-619: LGTM - Validates getFlushingBytes behavior with tragic exception.
621-670: LGTM - Validates getFlushingBytes during map rotation with tragic exception.
672-706: LGTM - Validates getTragicException returns non-null after failure.
708-757: LGTM - Validates getTragicException during map rotation.
772-772: LGTM - API call updated with size parameter.
785-788: LGTM - Exception test correctly updated with new API.
839-860: LGTM - Both addDocuments and softUpdateDocuments calls correctly updated with size parameter. The softUpdateDocuments calls correctly place operation.docs().size() before softDeletesField as per the new API signature.
❌ Gradle check result for 1880fbb: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
724-730: Address incomplete refactoring: Remove orphaned LookupMapLockAcquisitionException settings.

The exception class was successfully removed, but related configuration infrastructure remains as dead code. The setting constant INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTION, the field maxRetryOnLookupMapAcquisitionException, its getter method, and the comment at line 515 of IndexSettings.java should also be removed since they reference a non-existent exception and are never used.

Remove from server/src/main/java/org/opensearch/index/IndexSettings.java:

- Lines 514-523: Setting definition and its comment
- Line 933: Field declaration
- Line 1149: Field initialization in constructor
- Line 1318: Setting registration
- Lines 2114-2119: Setter and getter methods

Remove from server/src/main/java/org/opensearch/common/settings/IndexScopedSettings.java:

- Line 181: Setting registration
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
1069-1083: Clarify childWriterPendingNumDocs counter scope in deleteInLucene.

The documentation (lines 131-143) explicitly defines childWriterPendingNumDocs to track operations on child-level IndexWriters. However, deleteInLucene is called with accumulatingIndexWriter (the parent) at line 1065, yet unconditionally increments childWriterPendingNumDocs at line 1082 regardless of writer type. This contradicts the documented scope.

The code at line 1059 shows awareness of this distinction (comment: "only increment this when addDeleteEntry for child writers are called"), and the documentation acknowledges overcounting is acceptable (lines 141-143). Clarify whether this parent-writer increment is intentional for safety or should be scoped conditionally. Add an explanatory comment if this counter overshooting by parent operations is by design.
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java:
- Line 55: The interface parameter `size` on addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) and softUpdateDocuments(...) is undocumented and unvalidated, which can corrupt CompositeIndexWriter's pending counts; add clear javadoc to both interface methods explaining this parameter is the document count (rename to docCount only if you accept a breaking change), then in CompositeIndexWriter.addDocuments and CompositeIndexWriter.softUpdateDocuments validate that size equals the actual iterable count (throw IllegalArgumentException on mismatch), and make LuceneIndexWriter either perform the same validation or explicitly document/handle the parameter consistently so all implementations behave the same. A documentation-plus-validation sketch follows this list.
In @server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java:
- Around line 187-192: The Javadoc uses inconsistent terminology ("Context Aware Segment field") while the actual field type is context_aware_grouping; update the Javadoc on the canDeriveSource() override in ContextAwareGroupingFieldMapper to say "Context aware grouping field" (or similar consistent phrasing) so it matches the field type name context_aware_grouping and the class ContextAwareGroupingFieldMapper.
In @server/src/main/java/org/opensearch/index/mapper/MapperService.java:
In @server/src/main/java/org/opensearch/index/mapper/MapperService.java:
- Around line 694-696: getCompositeFieldTypes() can NPE because compositeMappedFieldTypes may be null before internalMerge(); update the method to return Collections.emptySet() when compositeMappedFieldTypes is null and restrict the return type/name to reflect that it only returns CompositeDataCubeFieldType instances (e.g., rename to getCompositeDataCubeFieldTypes() and change return type to Set<CompositeDataCubeFieldType>), and update callers (like isCompositeIndexPresent()) to use the new method; ensure you reference the compositeMappedFieldTypes field and internalMerge() assignment when implementing the null check and API rename.
In @server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java:
- Around line 154-204: The test declares an unused AtomicBoolean variable `run` which is set but never read; remove the unused `run` variable and its set call to simplify the test: delete the `AtomicBoolean run = new AtomicBoolean(true);` declaration and the `run.set(false);` line in the finally block, leaving the Thread `refresher` logic and surrounding calls to `compositeIndexWriter.beforeRefresh()`/`afterRefresh()` unchanged so behavior and synchronization via `latch` remain intact.
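A sketch of what the documented-and-validated size contract from the first item could look like; the Javadoc wording, the long return type, and the countDocs helper are suggestions rather than the PR's code:

```java
/**
 * Adds all documents in {@code docs} atomically.
 *
 * @param size the number of documents in {@code docs}; must be > 0 and is used
 *             by size-aware implementations for pending-doc accounting
 */
long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;
```

In the size-aware implementation, a cheap guard before the write keeps the counter honest:

```java
if (size <= 0) {
    throw new IllegalArgumentException("size must be positive but was " + size);
}
// Under assertions only, verify the caller's count (countDocs is an illustrative helper):
assert countDocs(docs) == size : "size must match the number of documents in docs";
```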
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
509-564: Consider renamingfailingDirectoryfor clarity.The
FlushingIndexWriterFactorytest utility is well-implemented. However, at line 523, the variable namefailingDirectoryis misleading—it holds either the failing directory or the regular directory depending on theuseFailingDirectorySupplierflag.♻️ Suggested variable rename for clarity
```diff
 @Override
 public IndexWriter createWriter(Directory directory, IndexWriterConfig config) throws IOException {
-    Directory failingDirectory = useFailingDirectorySupplier.get() ? failingWriteDirectorySupplier.get() : directory;
-    directories.add(failingDirectory);
-    return new IndexWriter(failingDirectory, config) {
+    Directory selectedDirectory = useFailingDirectorySupplier.get() ? failingWriteDirectorySupplier.get() : directory;
+    directories.add(selectedDirectory);
+    return new IndexWriter(selectedDirectory, config) {
```
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
494-501: Busy-wait loop without backoff could cause CPU spinning.
The `while (current == null || current.isClosed())` loop with only `tryAcquire()` may spin aggressively when the map is being rotated frequently. Consider adding `Thread.yield()` or a brief sleep to reduce CPU contention.
Proposed fix
```diff
 while (current == null || current.isClosed()) {
     // This function acquires a first read lock on a map which does not have any write lock present. Current keeps
     // on getting rotated during refresh, so there will be one current on which read lock can be obtained.
     // Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was
     // never applied on this map as write lock gets only during closing time. We are doing this instead of acquire,
     // because acquire can also apply a read lock in case refresh completed and map is closed.
     current = this.current.mapReadLock.tryAcquire();
+    if (current == null || current.isClosed()) {
+        Thread.yield();
+    }
 }
```
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
498-532: Consider extracting duplicated directory supplier to a helper method.
The failing `FilterDirectory` with `OutOfMemoryError` pattern is repeated across six test methods (lines 499-510, 535-546, 586-597, 622-633, 673-684, 709-720). This could be extracted to a helper method for maintainability.
Proposed refactor
```java
private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}
```
Then use in tests:
```diff
-Supplier<Directory> dirSupplier = () -> new FilterDirectory(newDirectory()) {
-    @Override
-    public IndexOutput createOutput(String name, IOContext context) throws IOException {
-        ...
-    }
-};
+Supplier<Directory> dirSupplier = createFailingDirectorySupplier();
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (6)
- CHANGELOG.md
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧰 Additional context used
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, macos-15)
- GitHub Check: Analyze (java)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: Mend Security Check
🔇 Additional comments (20)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
17-17: LGTM! Imports support the new test infrastructure.
The new imports are appropriate for the `FlushingIndexWriterFactory` test utility class added below.
Also applies to: 81-81, 89-89
243-257: LGTM! Convenient overload for test scenarios.
This overload simplifies creating an `EngineConfig` with a custom `Store` while using sensible defaults for other parameters. The delegation pattern is clean and appropriate for test infrastructure.
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (2)
11-12: LGTM: Required imports for new method signatures.
The imports are correctly added to support the new `deriveSource` method signature.
194-199: Verify method signature and correct terminology in Javadoc.
Two points:
Minor: The Javadoc mentions "Context Aware Segment" instead of "Context Aware Grouping". Please update for consistency with the class purpose.
Verification needed: Please confirm that the parent class `ParametrizedFieldMapper` has a matching `deriveSource` signature with a `void` return type. The no-op implementation is appropriate since `parseCreateField` (line 175) prevents these fields from being ingested, but we should verify the signature is correct.
The verification script for `canDeriveSource()` above will also check this method's signature.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (3)
16-17: LGTM! The new imports for `CountDownLatch` and `AtomicBoolean` are correctly added to support the new concurrency test.
34-34: LGTM! The `addDocuments` call signature is updated to include the size parameter, consistent with the API changes in `CompositeIndexWriter`.
122-130: LGTM! The `softUpdateDocuments` call signature is correctly updated to include the size parameter.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (7)
131-144: LGTM! Good documentation explaining the purpose and behavior of `childWriterPendingNumDocs`, including the intentional overshoot during refresh to avoid undershooting issues.
352-357: LGTM! Good defensive check to close and return null when the lookup is already closed, preventing operations on stale/closed maps.
549-559: LGTM! Proper tracking of pending docs from closed child writers to decrement `childWriterPendingNumDocs` after syncing with the parent writer.
606-609: LGTM! Package-private test hook for acquiring read locks, appropriately documented as being for unit tests.
719-742: LGTM! Good defensive handling of `AlreadyClosedException`: re-throwing only when there's a tragic exception, otherwise silently skipping closed writers.
855-875: LGTM! Proper rollback handling that ensures all child-level IndexWriters are closed to prevent file leaks, with appropriate exception handling for already-closed writers.
927-942: LGTM! The size-aware `addDocuments` correctly increments `childWriterPendingNumDocs` by the size parameter after successful indexing.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
40-40: LGTM! Import for `Supplier` added to support the new tragic exception test patterns.
183-183: LGTM! The `addDocuments` call signature is correctly updated to include the size parameter.
534-583: LGTM! Good test coverage for `ramBytesUsed` behavior when a tragic exception occurs on the old writer during refresh transition. The test correctly uses latch synchronization to ensure the writer is in the old map before assertion.
672-706: LGTM! Good test coverage verifying that `getTragicException()` returns a non-null value when a tragic exception occurs on the current writer.
839-860: LGTM! The `addDocuments` and `softUpdateDocuments` calls are correctly updated to include the size parameter in the `testHasBlocksMergeFullyDelSegments` test.
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)
87-87: LGTM! The import is correctly added to support the stream filtering operations in `getCompositeFieldTypes()`.
```diff
 void deleteUnusedFiles() throws IOException;

-long addDocuments(Iterable<ParseContext.Document> docs, Term uid) throws IOException;
+long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;
```
🧩 Analysis chain
🏁 Script executed:
cat -n server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
Repository: opensearch-project/OpenSearch
Length of output: 202
🏁 Script executed:
git ls-files | grep -i "documentindexwriter"Repository: opensearch-project/OpenSearch
Length of output: 146
🏁 Script executed:
git ls-files | grep -i "engine" | head -20Repository: opensearch-project/OpenSearch
Length of output: 1671
🏁 Script executed:
cat -n server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
Repository: opensearch-project/OpenSearch
Length of output: 3337
🏁 Script executed:
rg -l "implements DocumentIndexWriter" --type=javaRepository: opensearch-project/OpenSearch
Length of output: 219
🏁 Script executed:
rg -n "addDocuments|softUpdateDocuments" server/src/main/java/org/opensearch/index/engine/ --type=java -A 5 -B 2Repository: opensearch-project/OpenSearch
Length of output: 14758
🏁 Script executed:
grep -n "size.*mismatch\|validate.*size\|assert.*size" server/src/main/java/org/opensearch/index/engine/*.javaRepository: opensearch-project/OpenSearch
Length of output: 2051
🏁 Script executed:
rg -n "childWriterPendingNumDocs" server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java -B 3 -A 3Repository: opensearch-project/OpenSearch
Length of output: 3966
🏁 Script executed:
git log --oneline -n 20 -- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
Repository: opensearch-project/OpenSearch
Length of output: 144
🏁 Script executed:
git show HEAD:server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java 2>/dev/null | head -80
Repository: opensearch-project/OpenSearch
Length of output: 2311
🏁 Script executed:
rg "implements DocumentIndexWriter" --type=javaRepository: opensearch-project/OpenSearch
Length of output: 350
Add javadoc and validation for the size parameter in addDocuments() and softUpdateDocuments().
The size parameter is critical for CompositeIndexWriter, which uses it to track pending documents via childWriterPendingNumDocs.addAndGet(size). However, there is no documentation or validation:
- Missing javadoc: The parameter lacks documentation explaining it represents document count and must match the iterable's actual size.
- No validation: If a caller passes an incorrect size, `CompositeIndexWriter` will accumulate wrong pending doc counts, corrupting indexing state. Meanwhile, `LuceneIndexWriter` silently ignores the parameter, creating inconsistent behavior.
- Breaking API change: As a public interface, this signature change breaks external implementations.
- Ambiguous parameter name: `size` could mean document count or byte size. Rename to `docCount` for clarity.
Suggested improvements
Add javadoc:
/**
* Adds multiple documents to the index.
*
* @param docs the documents to add
* @param uid the unique identifier term
* @param size the number of documents in the iterable (must match actual count)
* @return the sequence number
* @throws IOException if an I/O error occurs
* @throws IllegalArgumentException if size doesn't match the actual document count
*/
long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;
Consider renaming for clarity:
-long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException;
+long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int docCount) throws IOException;
Add validation in CompositeIndexWriter.addDocuments() and softUpdateDocuments():
if (size != Iterables.size(docs)) {
throw new IllegalArgumentException("size mismatch: expected " + size + " but got " + Iterables.size(docs));
}
Committable suggestion skipped: line range outside the PR's diff.
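Note that the suggested check relies on Guava's `Iterables.size`; a dependency-free sketch of the same validation is below (the `validateDocCount` helper is hypothetical, not part of the PR):

```java
// Sketch only: validate that a caller-supplied size matches the iterable's actual length,
// without pulling in Guava. validateDocCount is a hypothetical helper name.
static <T> void validateDocCount(Iterable<T> docs, int size) {
    int actual = 0;
    for (T ignored : docs) {
        actual++;
    }
    if (actual != size) {
        throw new IllegalArgumentException("size mismatch: expected " + size + " but got " + actual);
    }
}
```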
🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java at
line 55, The interface parameter `size` on
addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) and
softUpdateDocuments(...) is undocumented and unvalidated which can corrupt
CompositeIndexWriter's pending counts; add clear javadoc to both interface
methods explaining this parameter is the document count (rename to docCount only
if you accept a breaking change), then in CompositeIndexWriter.addDocuments and
CompositeIndexWriter.softUpdateDocuments validate that size equals the actual
iterable count (throw IllegalArgumentException on mismatch), and make
LuceneIndexWriter either perform the same validation or explicitly
document/handle the parameter consistently so all implementations behave the
same.
```java
/**
 * Context Aware Segment field is not a part of an ingested document, so omitting it from Context Aware Segment
 * validation.
 */
@Override
public void canDeriveSource() {}
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for canDeriveSource method in parent classes
rg -n -B 2 -A 3 'canDeriveSource' --type=java | head -100
Repository: opensearch-project/OpenSearch
Length of output: 13274
🏁 Script executed:
#!/bin/bash
# Search for method definition (not calls) in parent classes
rg -n 'public\s+(void|boolean)\s+canDeriveSource\s*\(' --type=java
Repository: opensearch-project/OpenSearch
Length of output: 743
🏁 Script executed:
#!/bin/bash
# Get the Javadoc and definition context for parent Mapper.canDeriveSource
sed -n '305,315p' server/src/main/java/org/opensearch/index/mapper/Mapper.java
echo "---"
# Get the class-level Javadoc for ContextAwareGroupingFieldMapper
head -50 server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
Repository: opensearch-project/OpenSearch
Length of output: 2300
🏁 Script executed:
#!/bin/bash
# Get lines around the canDeriveSource method in ContextAwareGroupingFieldMapper
sed -n '180,200p' server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
echo "---"
# Search for "Segment" in the file
grep -n -i "segment" server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.javaRepository: opensearch-project/OpenSearch
Length of output: 890
Update Javadoc terminology to match the class name.
The Javadoc mentions "Context Aware Segment" but this class handles "Context Aware Grouping" fields (as indicated by the class name ContextAwareGroupingFieldMapper, the content type "context_aware_grouping", and the class documentation). Update the Javadoc at lines 188 and 195 to say "Context Aware Grouping" instead of "Context Aware Segment" for consistency.
🤖 Prompt for AI Agents
In
@server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
around lines 187 - 192, Update the Javadoc wording to refer to "Context Aware
Grouping" instead of "Context Aware Segment" for consistency with the class
ContextAwareGroupingFieldMapper (and its content type "context_aware_grouping");
specifically, edit the comment above the canDeriveSource() method and the other
Javadoc occurrence in this class to replace "Context Aware Segment" with
"Context Aware Grouping".
```java
return compositeMappedFieldTypes.stream()
    .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
    .collect(Collectors.toSet());
```
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the MapperService.java file structure and find the field declaration
fd -t f MapperService.java | head -1 | xargs -I {} bash -c '
echo "=== File found: {} ==="
wc -l {}'
Repository: opensearch-project/OpenSearch
Length of output: 232
🏁 Script executed:
# Check the field declaration around line 230
fd -t f MapperService.java | head -1 | xargs -I {} bash -c '
echo "=== Lines around 230 (field declaration) ==="
sed -n "220,240p" {} | cat -n
echo ""
echo "=== Lines around 552 (claimed internalMerge assignment) ==="
sed -n "545,560p" {} | cat -n'Repository: opensearch-project/OpenSearch
Length of output: 1848
🏁 Script executed:
# Find constructor and initialization
fd -t f MapperService.java | head -1 | xargs -I {} bash -c '
echo "=== Constructor search ==="
grep -n "MapperService\s*(" {} | head -5'Repository: opensearch-project/OpenSearch
Length of output: 129
🏁 Script executed:
# Find all call sites of getCompositeFieldTypes
rg -n 'getCompositeFieldTypes\s*\(' --type=java -C 3
Repository: opensearch-project/OpenSearch
Length of output: 30002
🏁 Script executed:
# Check the MapperService constructor body to see if it initializes compositeMappedFieldTypes
sed -n '234,350p' server/src/main/java/org/opensearch/index/mapper/MapperService.java | cat -n
Repository: opensearch-project/OpenSearch
Length of output: 5870
Fix null pointer exception in getCompositeFieldTypes().
The field compositeMappedFieldTypes is declared at line 230 without initialization and only assigned during internalMerge() at line 552. The method directly calls .stream() without a null check at line 694, which will throw NullPointerException if called before any merge operation (e.g., early calls from isCompositeIndexPresent() at line 690).
Additionally, the method name is misleading—it filters to return only CompositeDataCubeFieldType instances despite the return type being the broader CompositeMappedFieldType.
Proposed fix
public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
+ if (compositeMappedFieldTypes == null) {
+ return Collections.emptySet();
+ }
return compositeMappedFieldTypes.stream()
.filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
.collect(Collectors.toSet());
}
🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/mapper/MapperService.java around
lines 694 - 696, getCompositeFieldTypes() can NPE because
compositeMappedFieldTypes may be null before internalMerge(); update the method
to return Collections.emptySet() when compositeMappedFieldTypes is null and
restrict the return type/name to reflect that it only returns
CompositeDataCubeFieldType instances (e.g., rename to
getCompositeDataCubeFieldTypes() and change return type to
Set<CompositeDataCubeFieldType>), and update callers (like
isCompositeIndexPresent()) to use the new method; ensure you reference the
compositeMappedFieldTypes field and internalMerge() assignment when implementing
the null check and API rename.
❌ Gradle check result for b6df798: FAILURE
Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: Update stale comment referencing removed `LookupMapLockAcquisitionException`.
The exception class has been successfully removed from code, but a documentation comment in `server/src/main/java/org/opensearch/index/IndexSettings.java` at line 515 still references `LookupMapLockAcquisitionException`. Update or remove this outdated reference.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
1069-1083: Incrementing `childWriterPendingNumDocs` in `deleteInLucene` may cause accounting drift.
`deleteInLucene` is called with `accumulatingIndexWriter` (the parent), not a child writer. Incrementing `childWriterPendingNumDocs` here conflates parent and child accounting. The parent writer's `getPendingNumDocs()` is already queried separately in `getPendingNumDocs()` (line 747), so this increment would double-count deletes on the parent.
```bash
#!/bin/bash
# Verify how deleteInLucene is called and whether the increment is correct
rg -n -A5 'deleteInLucene\(' server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
```
🤖 Fix all issues with AI agents
In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java:
- Around line 154-204: Remove the unused AtomicBoolean run and simplify the
refresher thread to only countDown the latch and call
compositeIndexWriter.beforeRefresh(), then terminate; replace the current
refresher lambda with one that does not depend on run. Wrap the call to
compositeIndexWriter.deleteDocument(...) in a try/catch that rethrows the
exception after performing cleanup so the test still fails but deterministic
cleanup runs. In the finally block always release lock.getMapReadLock() and join
the refresher thread before invoking
compositeIndexWriter.afterRefresh()/beforeRefresh() sequences and
IOUtils.closeWhileHandlingException(compositeIndexWriter); ensure you reference
the existing symbols: run (remove), refresher (modify), latch,
compositeIndexWriter.deleteDocument, lock.getMapReadLock(),
compositeIndexWriter.beforeRefresh/afterRefresh, and
IOUtils.closeWhileHandlingException(compositeIndexWriter).
🧹 Nitpick comments (3)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
153-168: Consider improving exception handling in the test.
The test validates that the new `canDeriveSource()` and `deriveSource()` methods can be invoked without errors. However, the broad catch-all exception handler with `fail(e.getMessage())` may mask specific issues.
♻️ Suggested refinement
```diff
-public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
+public void testContextAwareFieldMapperWithDerivedSource() {
     ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
     ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
         "context_aware_grouping",
         fieldType,
         new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
     );
     LeafReader leafReader = mock(LeafReader.class);
-    try {
-        mapper.canDeriveSource();
-        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-    } catch (Exception e) {
-        fail(e.getMessage());
-    }
+    // Verify canDeriveSource returns false (no-op implementation)
+    assertFalse(mapper.canDeriveSource());
+
+    // Verify deriveSource completes without throwing (no-op implementation)
+    assertDoesNotThrow(() -> mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0));
 }
```
assertDoesNotThrowif available, or simply remove the try-catch and let any unexpected exceptions fail the test naturally.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
498-532: Consider extracting the duplicated directory supplier into a helper method.
All six new tragic exception tests create identical `FilterDirectory` implementations with the same `OutOfMemoryError` throwing behavior. This duplication could be reduced by extracting a shared helper.
Suggested refactor
Add a helper method in the test class:
```java
private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}
```
Then use it in each test:
```diff
-Supplier<Directory> dirSupplier = () -> new FilterDirectory(newDirectory()) {
-    @Override
-    public IndexOutput createOutput(String name, IOContext context) throws IOException {
-        ...
-    }
-};
+Supplier<Directory> dirSupplier = createFailingDirectorySupplier();
```
Also applies to: 534-583, 585-619, 621-670, 672-706, 708-757
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
494-501: Potential busy-wait if tryAcquire consistently returns closed lookups.
The while loop retries `tryAcquire` when the lookup is null or closed. If `tryAcquire` returns null due to closed lookups repeatedly (e.g., during rapid refresh cycles), this could spin. Consider adding a yield or brief sleep, or use the timeout variant of `tryAcquire`.
Suggested improvement
```diff
 while (current == null || current.isClosed()) {
     // This function acquires a first read lock on a map which does not have any write lock present...
     current = this.current.mapReadLock.tryAcquire();
+    if (current == null) {
+        Thread.yield();
+    }
 }
```
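If yielding is not enough, the retry can also be bounded with a deadline, in the spirit of the timeout variant mentioned above. A minimal sketch, assuming a hypothetical `tryAcquireLookup` supplier in place of the real `mapReadLock.tryAcquire()` API:

```java
import java.util.function.Supplier;

// Sketch only: bound a lock-retry loop with a deadline and a short sleep between attempts.
// tryAcquireLookup is a hypothetical stand-in for this.current.mapReadLock.tryAcquire().
final class BoundedRetry {
    static <T> T acquireWithDeadline(Supplier<T> tryAcquireLookup, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        T acquired = tryAcquireLookup.get();
        while (acquired == null) {
            if (System.currentTimeMillis() > deadline) {
                throw new IllegalStateException("timed out waiting for a readable lookup map");
            }
            Thread.sleep(1); // brief backoff instead of busy-spinning
            acquired = tryAcquireLookup.get();
        }
        return acquired;
    }
}
```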
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
🚧 Files skipped from review as they are similar to previous changes (3)
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (4)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
server/src/main/java/org/opensearch/script/ContextAwareGroupingScript.java (1)
ContextAwareGroupingScript(22-44)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
- GitHub Check: Mend Security Check
🔇 Additional comments (30)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-730: LGTM - Clean removal of obsolete exception handling.The removal of
LookupMapLockAcquisitionExceptionretry logic and its helper method is appropriate, as the exception class has been removed from the codebase entirely. The remaining retry logic for version conflicts (lines 724-729) is intact and correct.server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)
87-87: LGTM!Import correctly added to support the streaming operation in
getCompositeFieldTypes().
694-696: The filtering toCompositeDataCubeFieldTypeis intentional and correct.The getCompositeFieldTypes() method filters to only CompositeDataCubeFieldType instances by design. Every caller throughout the codebase (StarTreeQueryContext, StarDateFieldMapper, DateHistogramAggregator, etc.) expects CompositeDataCubeFieldType and immediately casts the result. This is the public API for querying DataCube-based composites, while buildCompositeFieldLookup() internally manages all CompositeMappedFieldType variants for indexing purposes. This separation is intentional, not an inconsistency.
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
11-11: LGTM! New imports support the test.The imports for
LeafReader,XContentFactory, andIOExceptionare appropriately added to support the new test methodtestContextAwareFieldMapperWithDerivedSource.Also applies to: 13-13, 20-20
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (2)
55-55: LGTM! Size-aware API addition.The
addDocumentsmethod now accepts asizeparameter to enable size-aware write accounting. This aligns with the PR's objective to track pending document counts more accurately across multi-document operations.
59-67: LGTM! Consistent size parameter placement.The
softUpdateDocumentsmethod now includes thesizeparameter positioned before the varargssoftDeletesField. This placement is correct and maintains consistency with theaddDocumentssignature change.server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
198-206: LGTM! Clear codec initialization logic with helpful documentation.The comment effectively explains why codec initialization occurs at this point (associatedCriteria binding during IndexWriter initialization). The conditional logic for selecting between
CriteriaBasedCodecand the base codec based on theisContextAwareEnabledsetting is clear and correct.server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)
1240-1247: LGTM! Correct size-aware write path for document additions.The
addDocsmethod correctly passesdocs.size()to the newaddDocumentssignature and increments thenumDocAppendscounter with the actual document count. This ensures accurate tracking of multi-document operations.
1249-1258: LGTM! Stale document handling updated correctly.The
addStaleDocsmethod consistently passesdocs.size()toaddDocumentsfor both single and multi-document paths, ensuring soft-deleted documents are tracked with size awareness.
1369-1390: LGTM! Update path correctly implements size-aware API.The
updateDocsmethod properly passesdocs.size()as the sixth parameter tosoftUpdateDocumentsand increments thenumDocUpdatescounter with the document count. The placement of the size parameter beforesoftDeletesFieldaligns with the API signature change.server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (3)
17-17: LGTM! Imports support new test utilities.The imports for
IndexableField,Closeable, andAtomicBooleanare appropriately added to support the newFlushingIndexWriterFactorytest utility class.Also applies to: 81-81, 89-89
243-257: LGTM! Convenient configuration overload.The new
config(Store store)method provides a convenient overload that delegates to the existing configuration method with default parameters, improving test readability.
509-564: LGTM! Well-designed test utility for flush verification.The
FlushingIndexWriterFactoryis a useful test utility that:
- Wraps
IndexWriterto automatically flush after write operations- Tracks directories for proper cleanup via the
Closeableinterface- Supports conditional failing-directory injection via supplier pattern
- Overrides all relevant write methods (
addDocument,addDocuments,softUpdateDocument,softUpdateDocuments)This enables tests to verify size-aware write accounting and flush behavior without manual flush calls.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (3)
16-17: LGTM!Import additions for
CountDownLatchandAtomicBooleanare appropriate for the new concurrency test.
34-34: LGTM!The
addDocumentscalls are consistently updated to use the new three-argument signature withoperation.docs().size()as the size parameter.Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263
122-130: LGTM!The
softUpdateDocumentscalls are consistently updated to include theoperation.docs().size()parameter in the correct position.Also applies to: 227-235, 268-276
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)
134-136: LGTM!The
sizeparameter is added to match theDocumentIndexWriterinterface. It's intentionally unused here sinceLuceneIndexWriterdelegates directly to Lucene'sIndexWriter, which handles its own pending document accounting internally.
144-154: LGTM!The
sizeparameter addition aligns with the interface update. The parameter is intentionally unused in this implementation since Lucene'sIndexWriter.softUpdateDocumentshandles document accounting internally.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (3)
40-40: LGTM!Import for
Supplieris appropriate for the newFlushingIndexWriterFactoryusage in tragic exception tests.
183-183: LGTM!The
addDocumentscalls are consistently updated to use the new three-argument signature.Also applies to: 231-231, 311-311, 334-334, 374-374
840-860: LGTM!The
softUpdateDocumentscalls are correctly updated with the size parameter in the right position.server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (9)
131-144: LGTM!Good documentation explaining the purpose of
childWriterPendingNumDocsand acknowledging that temporary overshooting during refresh is acceptable since undershooting would be problematic.
348-360: LGTM!Good fix to check
lookup.isClosed()after acquiring the lock. This prevents returning a closed lookup to callers and properly releases the lock before returning null.
549-559: LGTM!The pending document accounting during refresh correctly:
- Accumulates pending docs from old child writers before they're closed
- Subtracts this count after
addIndexessucceeds, since the parent writer now accounts for these docs
719-742: LGTM!The
getFlushingBytesUtilcorrectly handlesAlreadyClosedExceptionby only re-throwing when the writer has a tragic exception. This prevents spurious failures during normal map rotation while still propagating actual tragic failures.
745-748: LGTM!The
getPendingNumDocscorrectly combines the child writer pending count with the parent writer's pending count.
770-784: LGTM!The
getTragicExceptionmethod properly checks both current and old maps before falling back to the accumulating writer, ensuring tragic exceptions from any child writer are detected.
853-880: LGTM!The rollback method correctly:
- Checks
isOpen()before attempting rollback on child writers- Catches
AlreadyClosedExceptionto handle race conditions gracefully- Processes both current and old maps
927-942: LGTM!The
addDocumentsmethod correctly incrementschildWriterPendingNumDocsby the size parameter after successfully adding documents to the child writer.
961-986: LGTM!The
softUpdateDocumentsmethod correctly uses the size parameter to incrementchildWriterPendingNumDocsand properly records the delete entry for version tracking.
```java
public void testDeleteWithDocumentInOldChildWriter() throws IOException, InterruptedException {
    final String id = "test";
    CompositeIndexWriter compositeIndexWriter = new CompositeIndexWriter(
        config(),
        createWriter(),
        newSoftDeletesPolicy(),
        softDeletesField,
        indexWriterFactory
    );

    Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
    try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
        compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
    }

    CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
    CountDownLatch latch = new CountDownLatch(1);
    AtomicBoolean run = new AtomicBoolean(true);
    Thread refresher = new Thread(() -> {
        latch.countDown();
        try {
            compositeIndexWriter.beforeRefresh();
        } catch (Exception ignored) {}
    });

    refresher.start();
    try {
        latch.await();
        compositeIndexWriter.deleteDocument(
            operation.uid(),
            false,
            newDeleteTombstoneDoc(id),
            1,
            2,
            primaryTerm.get(),
            softDeletesField
        );
    } finally {
        IOUtils.closeWhileHandlingException(lock.getMapReadLock());
        run.set(false);
        refresher.join();
        compositeIndexWriter.afterRefresh(true);
        compositeIndexWriter.beforeRefresh();
        compositeIndexWriter.afterRefresh(true);
        try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) {
            assertEquals(0, directoryReader.numDocs());
        }

        IOUtils.closeWhileHandlingException(compositeIndexWriter);
    }
}
```
Unused `AtomicBoolean run` variable.
The `run` variable is declared and set to false in the finally block but is never read. This appears to be leftover from a pattern where the refresher thread would check `run.get()` in a loop.
Also, the test lacks proper exception handling: if `compositeIndexWriter.deleteDocument` throws, the finally block still attempts operations that may fail or behave unexpectedly.
Suggested fix
- AtomicBoolean run = new AtomicBoolean(true);
Thread refresher = new Thread(() -> {
latch.countDown();
try {
compositeIndexWriter.beforeRefresh();
} catch (Exception ignored) {}
});
refresher.start();
try {
latch.await();
compositeIndexWriter.deleteDocument(
operation.uid(),
false,
newDeleteTombstoneDoc(id),
1,
2,
primaryTerm.get(),
softDeletesField
);
} finally {
IOUtils.closeWhileHandlingException(lock.getMapReadLock());
- run.set(false);
refresher.join();
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```java
public void testDeleteWithDocumentInOldChildWriter() throws IOException, InterruptedException {
    final String id = "test";
    CompositeIndexWriter compositeIndexWriter = new CompositeIndexWriter(
        config(),
        createWriter(),
        newSoftDeletesPolicy(),
        softDeletesField,
        indexWriterFactory
    );
    Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
    try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) {
        compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
    }
    CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock();
    CountDownLatch latch = new CountDownLatch(1);
    Thread refresher = new Thread(() -> {
        latch.countDown();
        try {
            compositeIndexWriter.beforeRefresh();
        } catch (Exception ignored) {}
    });
    refresher.start();
    try {
        latch.await();
        compositeIndexWriter.deleteDocument(
            operation.uid(),
            false,
            newDeleteTombstoneDoc(id),
            1,
            2,
            primaryTerm.get(),
            softDeletesField
        );
    } finally {
        IOUtils.closeWhileHandlingException(lock.getMapReadLock());
        refresher.join();
        compositeIndexWriter.afterRefresh(true);
        compositeIndexWriter.beforeRefresh();
        compositeIndexWriter.afterRefresh(true);
        try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) {
            assertEquals(0, directoryReader.numDocs());
        }
        IOUtils.closeWhileHandlingException(compositeIndexWriter);
    }
}
```
🤖 Prompt for AI Agents
In
@server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
around lines 154 - 204, Remove the unused AtomicBoolean run and simplify the
refresher thread to only countDown the latch and call
compositeIndexWriter.beforeRefresh(), then terminate; replace the current
refresher lambda with one that does not depend on run. Wrap the call to
compositeIndexWriter.deleteDocument(...) in a try/catch that rethrows the
exception after performing cleanup so the test still fails but deterministic
cleanup runs. In the finally block always release lock.getMapReadLock() and join
the refresher thread before invoking
compositeIndexWriter.afterRefresh()/beforeRefresh() sequences and
IOUtils.closeWhileHandlingException(compositeIndexWriter); ensure you reference
the existing symbols: run (remove), refresher (modify), latch,
compositeIndexWriter.deleteDocument, lock.getMapReadLock(),
compositeIndexWriter.beforeRefresh/afterRefresh, and
IOUtils.closeWhileHandlingException(compositeIndexWriter).
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: Incomplete removal of LookupMapLockAcquisitionException infrastructure.The removal of retry logic from
onCompletemethod is incomplete. While the handling was removed fromTransportShardBulkAction, the exception's setting infrastructure remains active and orphaned:
INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTIONsetting is still defined and registered inIndexSettings.java:519andIndexScopedSettings.java:181- The setting is actively retrieved in
IndexSettings.java:1149- Documentation comment in
IndexSettings.java:515still references the removed exceptionRemove or update these remaining references to complete the cleanup:
- Remove
INDEX_MAX_RETRY_ON_LOOKUP_MAP_LOCK_ACQUISITION_EXCEPTIONsetting definition and registration, or- Update documentation and ensure the setting is not vestigial
- Update tests that may depend on this setting
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
489-494: EnsureindexWriterFactorygets closed intearDown()now that it can beCloseable.
FlushingIndexWriterFactorytracks directories and requiresclose(), but the base class currently never closesindexWriterFactory, which risks leaking file handles / temp dirs and causing flaky tests (especially on Windows).Proposed fix (close factory safely in base tearDown)
@Override @After public void tearDown() throws Exception { - super.tearDown(); - IOUtils.close(store, () -> terminate(threadPool)); + try { + IOUtils.close( + () -> { + final IndexWriterFactory factory = indexWriterFactory; + if (factory instanceof Closeable) { + ((Closeable) factory).close(); + } + }, + store, + () -> terminate(threadPool) + ); + } finally { + super.tearDown(); + } }Also applies to: 509-564
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
202-206: Add explicit null validation for associatedCriteria before CriteriaBasedCodec instantiation.When
isContextAwareEnabled()returns true,associatedCriteriais passed to theCriteriaBasedCodecconstructor without validation. AlthoughCriteriaBasedCodechandles null gracefully (checking at line 52 of itssegmentInfoFormat()method), the comments in this code (lines 198-201) state that criteria is "determined on a per-document basis and is only available within the InternalEngine," implying it should never be null in this path.Adding an explicit null check or assertion before instantiation would enforce this precondition and improve code clarity, ensuring that if context-aware is enabled, the required criteria is always present.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
1059-1073: Line 1072 incorrectly incrementschildWriterPendingNumDocsfor parent writer operations.The
deleteInLucenemethod operates onaccumulatingIndexWriter(parent), but the code at line 1072 incrementschildWriterPendingNumDocs. The explicit comment at lines 1038-1040 states "only increment this when addDeleteEntry for child writers are called," indicating this counter should not be incremented for parent writer operations. Since parent pending docs are tracked separately viaaccumulatingIndexWriter.getPendingNumDocs(), this increment appears to be either erroneous or represents unintended double-counting. Remove the increment at line 1072 or clarify the intent if this is deliberate.
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 494-501: The loop in CompositeIndexWriter that repeatedly calls
current.mapReadLock.tryAcquire() can spin forever if only closed maps are
produced; change it to bound the wait by either using a timed tryAcquire (e.g.,
tryAcquire(timeout, TimeUnit)) or by recording startTime and breaking after a
configurable timeout, and between attempts perform a short sleep or
Thread.yield() to avoid busy-spin; also check a shutdown/closed flag on
CompositeIndexWriter (or similar lifecycle indicator) and throw a clear
exception if shutdown is in progress so the caller can abort instead of looping
indefinitely.
🧹 Nitpick comments (5)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)
694-696: The filtering logic is correct, but consider whether the performance optimization is necessary.The current implementation filters on every call to return only
CompositeDataCubeFieldTypeinstances. While the filtering is intentional and aligns with all production usage (all 9 callers expect or cast toCompositeDataCubeFieldType), the stream-filter-collect operation creates a new Set each time.However, this is likely not a critical issue since:
- The method is called only 9 times in production code, mostly during initialization
- It's not in hot loops or performance-sensitive paths
- The filtering overhead is minimal for small sets
Note that the filtering intentionally excludes
ContextAwareGroupingFieldTypeinstances from the result. This is correct becausebuildCompositeFieldLookup()(line 561) uses the unfilteredcompositeMappedFieldTypesfield to collect all field names, while callers ofgetCompositeFieldTypes()expect only DataCube field types. Verify that this behavioral change (narrowingisCompositeIndexPresent()to check only for DataCube types) aligns with the intended behavior.server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: Avoid confusion fromconfig(Store store)parameter shadowing the fieldstore.This is fine functionally, but the parameter name makes call sites harder to read/scan in a base test. Consider renaming the parameter (e.g.,
engineStore) and/or making the overloadprotectedif it’s only intended for subclasses.
509-564: Harden directory tracking to avoid double-close (and makeclose()idempotent-ish).
directoriesis a plainList, so the sameDirectorycan be added multiple times (e.g., if supplier returns the same instance), andIOUtils.close(directories)may double-close and throw during cleanup. Consider de-dup + snapshot/clear on close.Proposed fix (de-dup + snapshot/clear)
protected static class FlushingIndexWriterFactory extends NativeLuceneIndexWriterFactory implements Closeable { private final Supplier<Directory> failingWriteDirectorySupplier; - private final List<Directory> directories; + private final List<Directory> directories; private final AtomicBoolean useFailingDirectorySupplier; FlushingIndexWriterFactory(Supplier<Directory> failingWriteDirectorySupplier, AtomicBoolean useFailingDirectorySupplier) { this.failingWriteDirectorySupplier = failingWriteDirectorySupplier; this.directories = new ArrayList<>(); this.useFailingDirectorySupplier = useFailingDirectorySupplier; } @@ @Override public void close() throws IOException { - IOUtils.close(directories); + // snapshot + clear prevents re-close on repeated close() calls and reduces risk of concurrent modification + final List<Directory> toClose = new ArrayList<>(directories); + directories.clear(); + IOUtils.close(toClose.stream().distinct().toList()); } }server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)
134-136: Consider adding a clarifying comment for the unused size parameter.The
sizeparameter is added to match theDocumentIndexWriterinterface but is not used in this implementation. A brief comment explaining that this parameter is used by other implementations (e.g.,CompositeIndexWriter) for pending document tracking would improve maintainability.📝 Suggested comment addition
@Override public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException { + // size parameter unused here; used by CompositeIndexWriter for pending doc tracking return indexWriter.addDocuments(docs); }
144-154: Consider adding a clarifying comment for the unused size parameter.Similar to
addDocuments, thesizeparameter is added to match the interface but is not used in this implementation. A brief comment would improve code clarity.📝 Suggested comment addition
public void softUpdateDocuments( Term uid, Iterable<ParseContext.Document> docs, long version, long seqNo, long primaryTerm, int size, Field... softDeletesField ) throws IOException { + // size parameter unused here; used by CompositeIndexWriter for pending doc tracking indexWriter.softUpdateDocuments(uid, docs, softDeletesField); }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
CHANGELOG.mdserver/src/main/java/org/opensearch/OpenSearchServerException.javaserver/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.javaserver/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/IngestionEngine.javaserver/src/main/java/org/opensearch/index/engine/InternalEngine.javaserver/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.javaserver/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.javaserver/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.javaserver/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.javaserver/src/main/java/org/opensearch/index/mapper/MapperService.javaserver/src/test/java/org/opensearch/ExceptionSerializationTests.javaserver/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.javaserver/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.javaserver/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.javaserver/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.javatest/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
🚧 Files skipped from review as they are similar to previous changes (3)
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: Analyze (java)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: Mend Security Check
🔇 Additional comments (30)
server/src/main/java/org/opensearch/index/mapper/MapperService.java (1)
87-87: LGTM! The import is correctly added to support the filtering logic in
`getCompositeFieldTypes()`.
server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java (1)
198-201: LGTM: Clear explanatory comments. The inline comments effectively explain the rationale for initializing the codec at this location rather than in CodecService, which will help future maintainers understand this design decision.
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)
55-55: LGTM! API signature updates are consistent. The addition of the `int size` parameter to both `addDocuments` and `softUpdateDocuments` enables size-aware pending document tracking at the implementation level.
Also applies to: 65-65
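For readers skimming the thread, a minimal, self-contained sketch of what a size-aware writer API like this looks like; the interface and class names below are illustrative stand-ins, not the actual OpenSearch types:

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative stand-in for a size-aware document writer API.
interface SizeAwareWriter {
    // 'size' carries the document count so implementations can track
    // pending docs without re-iterating the Iterable.
    long addDocuments(Iterable<String> docs, String uid, int size) throws IOException;
}

class CountingWriter implements SizeAwareWriter {
    private final AtomicLong pendingNumDocs = new AtomicLong();

    @Override
    public long addDocuments(Iterable<String> docs, String uid, int size) throws IOException {
        // ... delegate to an underlying writer here ...
        return pendingNumDocs.addAndGet(size); // size-aware accounting
    }

    public static void main(String[] args) throws IOException {
        CountingWriter writer = new CountingWriter();
        List<String> docs = List.of("doc1", "doc2");
        System.out.println(writer.addDocuments(docs, "uid-1", docs.size())); // prints 2
    }
}
```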
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (6)
183-183: LGTM! Test updates correctly pass document count. All test invocations have been properly updated to pass `operation.docs().size()` as the size parameter, maintaining consistency with the new API signatures.
Also applies to: 231-231, 334-334, 374-374, 420-420, 462-462, 472-472, 505-505, 511-511, 552-552, 588-588, 624-624, 676-676, 712-712, 748-748, 800-800, 836-836, 871-871
404-427: LGTM! Basic tragic exception validation. This test verifies that under normal conditions (no failures), `getTragicException()` returns null as expected.
523-557: LGTM! Comprehensive tragic exception test coverage. These tests properly validate that:
- `AlreadyClosedException` is thrown when accessing `ramBytesUsed`, `getFlushingBytes`, or `getTragicException` after a tragic failure in child writers
- Tragic exceptions are properly detected in both current and old writer maps
- Thread synchronization with
`CountDownLatch` ensures proper test sequencing

The use of `FlushingIndexWriterFactory` with failing directories effectively simulates tragic failures.
Also applies to: 559-608, 647-681, 683-732, 771-805, 807-856
610-645: LGTM! Tests verify metric collection with old writers. These tests confirm that `ramBytesUsed` and `getFlushingBytes` correctly aggregate metrics from writers in the old map during refresh transitions.
Also applies to: 734-769
911-947: LGTM! Rollback test with old writer. This test validates that rollback succeeds even when a writer exists in the old map during a concurrent refresh, ensuring proper cleanup of all writer states.
977-985: LGTM! softUpdateDocuments calls updated correctly. The test calls to `softUpdateDocuments` now properly include `operation.docs().size()` as the size parameter.
Also applies to: 989-997
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (13)
131-144: LGTM! Clear documentation for pending document tracking. The detailed documentation explains the purpose and behavior of `childWriterPendingNumDocs`, including the acceptable temporary overshooting during refresh.
549-560: LGTM! Proper pending document accounting during refresh. The refresh logic correctly:
- Accumulates pending docs from old child writers before closing them
- Subtracts the accumulated count from the global counter after adding indexes to the parent
This ensures the pending doc count remains accurate across the refresh transition.
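To make that reconciliation concrete, here is a rough, self-contained sketch of the accumulate-then-subtract pattern under discussion; the names (`childPending`, `drainOldWriters`) are hypothetical stand-ins for the actual fields:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative: reconcile a shared pending-doc counter during refresh.
class RefreshAccounting {
    final AtomicLong childPending = new AtomicLong(); // global counter

    long drainOldWriters(List<Long> pendingPerOldWriter) {
        long oldWriterPending = 0;
        for (long pending : pendingPerOldWriter) {
            oldWriterPending += pending; // accumulate before closing each old child writer
        }
        // After addIndexes() moves the docs into the parent writer, subtract once
        // so they are not counted by both the child counter and the parent.
        return childPending.addAndGet(-oldWriterPending);
    }

    public static void main(String[] args) {
        RefreshAccounting accounting = new RefreshAccounting();
        accounting.childPending.addAndGet(5); // five docs pending in child writers
        System.out.println(accounting.drainOldWriters(List.of(3L, 2L))); // prints 0
    }
}
```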
606-609: LGTM! Test utility method for acquiring read lock. Package-private accessor enables test scenarios that require holding a read lock on the current map, as demonstrated in the new test cases.
706-742: LGTM! Robust flushing bytes calculation with tragic exception handling. The refactored `getFlushingBytes()` properly:
- Iterates through both current and old writer maps
- Catches
- Catches `AlreadyClosedException` and only rethrows if a tragic exception exists
- Aggregates flushing bytes from child writers and the parent
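As a side note for reviewers, the catch-and-conditionally-rethrow pattern described above can be sketched as follows; this assumes Lucene on the classpath, and `getTragicException()` here is a simplified accessor on a hypothetical child-writer interface:

```java
import org.apache.lucene.store.AlreadyClosedException;

// Illustrative: only propagate AlreadyClosedException when the writer died tragically.
class FlushingBytesSketch {
    interface ChildWriter {
        long getFlushingBytes();
        Throwable getTragicException();
    }

    static long sumFlushingBytes(Iterable<ChildWriter> writers) {
        long total = 0;
        for (ChildWriter writer : writers) {
            try {
                total += writer.getFlushingBytes();
            } catch (AlreadyClosedException ace) {
                // A writer closed by a normal refresh is benign; a tragic close is not.
                if (writer.getTragicException() != null) {
                    throw ace;
                }
            }
        }
        return total;
    }
}
```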
770-784: LGTM! Comprehensive tragic exception detection. The method now checks for tragic exceptions in both current and old child writers before checking the parent, ensuring any child writer failures are properly detected.
796-833: LGTM! Robust RAM bytes calculation with tragic exception handling. Similar to `getFlushingBytes`, this properly handles `AlreadyClosedException` and aggregates RAM usage across all writers.
853-870: LGTM! Proper rollback of all child writers. The updated rollback ensures all child writers (both current and old) are properly rolled back before rolling back the parent writer, preventing resource leaks.
917-931: LGTM! Size-aware document addition with proper accounting. The method correctly:
- Delegates to the underlying IndexWriter's `addDocuments`
childWriterPendingNumDocsby the provided size- Returns the sequence number
935-948: LGTM! Single document addition increments counter.Properly increments the pending doc counter by 1 for single document additions.
935-948: LGTM! Single document addition increments counter. Properly increments the pending doc counter by 1 for single document additions.
childWriterPendingNumDocsby the provided size after updating documents.
979-1003: LGTM! Single document soft update increments counter.Properly increments the pending doc counter by 1 for single document soft updates.
979-1003: LGTM! Single document soft update increments counter. Properly increments the pending doc counter by 1 for single document soft updates.
childWriterPendingNumDocsonly when delete entries are added to child writers (current or old), not for every delete operation.
352-356: The TOCTOU concern is not applicable to the current implementation.The read lock acquired in
tryAcquire()(line 349) is held throughout the caller's usage of the returnedDisposableIndexWriter. SinceCriteriaBasedIndexWriterLookup.closedis only set to true withinclose()(line 309), which requires a write lock on the underlyingmapLock, and write locks are only obtained during engine closure (per design comments at lines 497-499), the lookup cannot transition to closed while a read lock is held. The call site (lines 494-501) properly handles null returns via retry loop, and all consumer sites acquire and hold the lock via try-with-resources (e.g., lines 924-930).Note: The timeout variant
tryAcquire(TimeValue timeout)(line 366) lacks thelookup.isClosed()check present in the no-arg variant, creating an inconsistency—though this variant appears unused in the codebase.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (3)
34-34: LGTM! Test calls updated with document count.All
addDocumentsinvocations correctly passoperation.docs().size()as the third parameter.Also applies to: 76-76, 114-114, 166-166, 219-219, 263-263
122-130: LGTM! softUpdateDocuments calls updated correctly.All
softUpdateDocumentsinvocations now includeoperation.docs().size()as the size parameter.Also applies to: 227-235, 268-276
154-204: LGTM! Comprehensive test for delete with old child writer.This test validates the delete operation when:
- A document exists in a child writer
- A concurrent refresh moves the writer to the old map
- The delete is processed while the writer is in the old state
The test properly uses:
acquireNewReadLock()to hold the current map during refreshCountDownLatchfor thread coordination- Proper resource cleanup with
IOUtils.closeWhileHandlingExceptionserver/src/main/java/org/opensearch/index/engine/IngestionEngine.java (2)
239-245: LGTM! Proper usage of size-aware addDocuments API.The multi-document path correctly passes
docs.size()to the new API, while single-document operations continue using the originaladdDocumentmethod.
239-245: LGTM! Proper usage of size-aware addDocuments API. The multi-document path correctly passes
- Passes
docs.size()for multi-document updates- Uses
softUpdateDocument(without size) for single documentsserver/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)
1240-1247: LGTM: Size parameter correctly added for multi-document tracking. The change correctly passes `docs.size()` to enable pending document tracking. The single-document optimization path is preserved, and the counter increment correctly handles both cases.
1249-1258: LGTM: Stale document handling correctly updated. The size parameter is correctly propagated for multi-document stale operations. The soft-delete field is properly added to all documents before indexing.
1369-1390: LGTM: Update document path correctly modified. The size parameter is properly added to the `softUpdateDocuments` call for multi-document updates. The single-document optimization is preserved, and the counter correctly uses `docs.size()`.
```java
while (current == null || current.isClosed()) {
    // This function acquires a first read lock on a map which does not have any write lock present. Current keeps
    // on getting rotated during refresh, so there will be one current on which read lock can be obtained.
    // Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was
    // never applied on this map as write lock gets only during closing time. We are doing this instead of acquire,
    // because acquire can also apply a read lock in case refresh completed and map is closed.
    current = this.current.mapReadLock.tryAcquire();
}
```
Risk: Potential infinite loop waiting for non-closed map.
The while loop continues until it finds a non-closed current map. If the map rotation consistently produces closed maps (e.g., during shutdown or repeated failures), this could spin indefinitely without yielding or timeout.
🔄 Consider adding timeout or yield
```diff
 boolean success = false;
 CriteriaBasedIndexWriterLookup current = null;
+int attempts = 0;
+final int MAX_ATTEMPTS = 1000;
 try {
-    while (current == null || current.isClosed()) {
+    while ((current == null || current.isClosed()) && attempts < MAX_ATTEMPTS) {
         // This function acquires a first read lock on a map which does not have any write lock present. Current keeps
         // on getting rotated during refresh, so there will be one current on which read lock can be obtained.
         // Validate that no write lock is applied on the map and the map is not closed. Idea here is write lock was
         // never applied on this map as write lock gets only during closing time. We are doing this instead of acquire,
         // because acquire can also apply a read lock in case refresh completed and map is closed.
         current = this.current.mapReadLock.tryAcquire();
+        if (current == null || current.isClosed()) {
+            attempts++;
+            Thread.yield(); // Allow other threads to progress
+        }
     }
+
+    if (current == null || current.isClosed()) {
+        throw new IllegalStateException("Unable to acquire non-closed lookup after " + MAX_ATTEMPTS + " attempts");
+    }
     DisposableIndexWriter writer = current.computeIndexWriterIfAbsentForCriteria(criteria, indexWriterSupplier);
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 494 - 501, The loop in CompositeIndexWriter that repeatedly calls
current.mapReadLock.tryAcquire() can spin forever if only closed maps are
produced; change it to bound the wait by either using a timed tryAcquire (e.g.,
tryAcquire(timeout, TimeUnit)) or by recording startTime and breaking after a
configurable timeout, and between attempts perform a short sleep or
Thread.yield() to avoid busy-spin; also check a shutdown/closed flag on
CompositeIndexWriter (or similar lifecycle indicator) and throw a clear
exception if shutdown is in progress so the caller can abort instead of looping
indefinitely.
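For what a deadline-bounded variant could look like, here is a small sketch; `tryAcquireLookup` is a hypothetical stand-in for `mapReadLock.tryAcquire()`, and the timeout handling is one option among those listed above:

```java
import java.util.function.Supplier;

// Illustrative: bound the spin with a deadline instead of looping forever.
class BoundedAcquire {
    static <T> T acquireWithin(Supplier<T> tryAcquireLookup, long timeoutMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        T lookup;
        while ((lookup = tryAcquireLookup.get()) == null) {
            if (System.nanoTime() > deadline) {
                throw new IllegalStateException("could not acquire a non-closed lookup within " + timeoutMillis + " ms");
            }
            Thread.yield(); // avoid a hot busy-spin between attempts
        }
        return lookup;
    }
}
```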
❌ Gradle check result for fd34feb: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
1059-1073: Remove the pending docs increment from deleteInLucene, which incorrectly counts parent writer operations. The
`deleteInLucene` method operates on `accumulatingIndexWriter` (the parent writer) but increments `childWriterPendingNumDocs` at line 1072. This causes double-counting because:
- Parent writer operations are tracked internally by
`accumulatingIndexWriter.getPendingNumDocs()`
- `getPendingNumDocs()` returns the sum: `childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs()`
- Incrementing
childWriterPendingNumDocsfor parent writer operations adds them a second timeAll other methods in the class (addDocuments, addDocument, softUpdateDocuments, softUpdateDocument) operate on child writers and increment
`childWriterPendingNumDocs`. The comments at lines 1039 and 1050 explicitly state "only increment this when addDeleteEntry for child writers are called." The `deleteInLucene` method breaks this pattern by incrementing for parent writer operations. Remove line 1072.
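The double-counting argument is easy to verify with a toy model; the field names below mirror the comment, but the class is a simplified stand-in rather than the actual engine code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model of the invariant: total pending = child counter + parent writer's own count.
class PendingDocsModel {
    final AtomicLong childWriterPendingNumDocs = new AtomicLong();
    long parentPendingNumDocs; // tracked internally by the parent IndexWriter

    long getPendingNumDocs() {
        return childWriterPendingNumDocs.get() + parentPendingNumDocs;
    }

    public static void main(String[] args) {
        PendingDocsModel model = new PendingDocsModel();
        model.parentPendingNumDocs = 1;                     // delete tombstone added to the parent
        model.childWriterPendingNumDocs.incrementAndGet();  // the extra increment under discussion
        System.out.println(model.getPendingNumDocs());      // prints 2: one operation counted twice
    }
}
```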
🤖 Fix all issues with AI agents
In @CHANGELOG.md:
- Line 34: The changelog entry "Fix indexing regression and bug fixes for
grouping criteria.
([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" uses an
inconsistent PR reference format; update that entry to include the hash symbol
so the link reads
"([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))", i.e.,
replace "([20145](" with "([#20145](" for the PR reference to match other
entries.
In
@server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java:
- Around line 153-168: The test method
testContextAwareFieldMapperWithDerivedSource leaks the XContentBuilder and never
finishes the JSON object; change creation of the builder from
XContentFactory.jsonBuilder().startObject() to use a try-with-resources (or
explicitly close) and call endObject() before passing it to
mapper.deriveSource(...), ensuring the XContentBuilder is closed in all paths so
the resource is not leaked when invoking mapper.canDeriveSource() and
mapper.deriveSource(...).
🧹 Nitpick comments (11)
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (1)
188-189: Documentation terminology inconsistency. The javadoc comments refer to "Context Aware Segment field" but the class is named `ContextAwareGroupingFieldMapper`. Consider updating the documentation to use consistent terminology (e.g., "Context Aware Grouping field") for clarity.
Also applies to: 195-196
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
162-167: Consider using more idiomatic JUnit test pattern. The try-catch with `fail()` pattern is not the most idiomatic approach. Since the test method already declares `throws IOException`, you can either:
- Let exceptions propagate naturally (remove try-catch entirely), or
- Use
`assertDoesNotThrow()` if you specifically want to assert that no exception occurs.

Additionally, the test could benefit from positive assertions about the behavior, not just the absence of exceptions.
♻️ Alternative implementation
```diff
 public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
     ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
     ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
         "context_aware_grouping",
         fieldType,
         new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
     );
     LeafReader leafReader = mock(LeafReader.class);
-    try {
+    try (XContentBuilder builder = XContentFactory.jsonBuilder().startObject()) {
         mapper.canDeriveSource();
-        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-    } catch (Exception e) {
-        fail(e.getMessage());
+        mapper.deriveSource(builder, leafReader, 0);
+        builder.endObject();
     }
+    // Test passes if no exception is thrown
 }
```
693-697: Consider caching the filtered result for better performance. The method streams and filters `compositeMappedFieldTypes` on every invocation. If called frequently during indexing operations, this could impact performance.
♻️ Optimization: Cache the filtered set
Modify the class to maintain a separate cached field for CompositeDataCubeFieldType instances:
Add a new field after line 230:
```java
private volatile Set<CompositeMappedFieldType> compositeMappedFieldTypes;
private volatile Set<CompositeMappedFieldType> compositeDataCubeFieldTypes;
```
internalMergemethod at line 552 to populate both:// initialize composite fields post merge this.compositeMappedFieldTypes = getCompositeFieldTypesFromMapper(); this.compositeDataCubeFieldTypes = compositeMappedFieldTypes.stream() .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType) .collect(Collectors.toSet()); buildCompositeFieldLookup();Then simplify the method:
```diff
 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
-    return compositeMappedFieldTypes.stream()
-        .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
-        .collect(Collectors.toSet());
+    return compositeDataCubeFieldTypes;
 }
```
693-697: Consider renaming or documenting the narrowed return type. The method name `getCompositeFieldTypes()` suggests it returns all composite field types, but it now returns only `CompositeDataCubeFieldType` instances. This could lead to confusion for developers using this API.
📝 Suggested improvements
Option 1: Rename the method (breaking change):
```diff
-public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
+public Set<CompositeMappedFieldType> getCompositeDataCubeFieldTypes() {
     return compositeMappedFieldTypes.stream()
         .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
         .collect(Collectors.toSet());
 }
```
```diff
+/**
+ * Returns the set of composite field types that are also CompositeDataCubeFieldType instances.
+ * This is used to identify fields that are part of data cube composite indexes.
+ *
+ * @return Set of CompositeMappedFieldType instances that implement CompositeDataCubeFieldType
+ */
 public Set<CompositeMappedFieldType> getCompositeFieldTypes() {
     return compositeMappedFieldTypes.stream()
         .filter(compositeMappedFieldType -> compositeMappedFieldType instanceof CompositeDataCubeFieldType)
         .collect(Collectors.toSet());
 }
```
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
183-183: API signature updates are consistent and correct. All calls to `addDocuments()` have been consistently updated to include `operation.docs().size()` as the third parameter, aligning with the new size-aware API.
For a minor optimization in test code, consider caching
`operation.docs().size()` to avoid repeated calls, though this is not critical for tests:
```diff
 Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA));
-compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size());
+int docCount = operation.docs().size();
+compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), docCount);
```
Also applies to: 231-231, 334-334, 374-374, 420-420, 462-462, 472-472, 505-505, 511-511, 552-552, 588-588, 624-624, 676-676, 712-712, 748-748, 800-800, 836-836, 871-871, 925-925, 976-976
523-556: Significant code duplication across tragic-exception tests. The tragic-exception test methods share nearly identical `FilterDirectory` setup code that throws `OutOfMemoryError`. This duplication increases maintenance burden and makes the tests harder to update consistently.
♻️ Consider extracting common test infrastructure
Create a helper method to reduce duplication:
```java
private Supplier<Directory> createFailingDirectorySupplier() {
    return () -> new FilterDirectory(newDirectory()) {
        @Override
        public IndexOutput createOutput(String name, IOContext context) throws IOException {
            IndexOutput out = super.createOutput(name, context);
            return new FilterIndexOutput("failing output", "test", out) {
                @Override
                public void writeBytes(byte[] b, int offset, int length) throws IOException {
                    throw new OutOfMemoryError("Simulated write failure");
                }
            };
        }
    };
}

private CompositeIndexWriter createCompositeWriterWithFailingFactory(Supplier<Directory> dirSupplier) throws IOException {
    FlushingIndexWriterFactory factory = new FlushingIndexWriterFactory(dirSupplier, new AtomicBoolean(true));
    CompositeIndexWriter writer = new CompositeIndexWriter(
        config(),
        createWriter(),
        newSoftDeletesPolicy(),
        softDeletesField,
        factory
    );
    writer.getConfig().setMaxBufferedDocs(2);
    return writer;
}
```
public void testRAMBytesUsedWithTragicExceptionOnCurrent() throws Exception { Supplier<Directory> dirSupplier = createFailingDirectorySupplier(); FlushingIndexWriterFactory factory = new FlushingIndexWriterFactory(dirSupplier, new AtomicBoolean(true)); CompositeIndexWriter compositeIndexWriter = createCompositeWriterWithFailingFactory(dirSupplier); Engine.Index operation = indexForDoc(createParsedDoc(String.valueOf("-1"), null, DEFAULT_CRITERIA)); try { compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size()); } catch (Error ignored) {} assertThrows(AlreadyClosedException.class, compositeIndexWriter::ramBytesUsed); IOUtils.closeWhileHandlingException(compositeIndexWriter, factory); }Also applies to: 559-607, 647-681, 683-732, 771-805, 807-856
404-427: Consider clarifying test name. The test name `testGetTragicExceptionWithException` is potentially confusing since no exception occurs in this test; it verifies the happy path where `getTragicException()` returns null. Consider renaming to `testGetTragicExceptionWithoutException` or `testGetTragicExceptionNormalOperation` for clarity.
129-139: Empty catch block may hide test failures. The empty catch block at line 137 could suppress genuine errors during concurrent indexing operations, making test failures harder to diagnose.
Improve error handling
Consider tracking or logging exceptions to identify issues:
```diff
 Thread computeThread = new Thread(() -> {
     while (stopped.get() == false) {
         try {
             CompositeIndexWriter.LiveIndexWriterDeletesMap currentMap = mapRef.get();
             currentMap.computeIndexWriterIfAbsentForCriteria("test-criteria", supplier, new ShardId("foo", "_na_", 1));
             computeCount.incrementAndGet();
             indexedDocs.release();
         } catch (Exception e) {
-
+            // Expected during map rotation, but log unexpected exceptions
+            if (!(e instanceof IllegalStateException)) {
+                throw new AssertionError("Unexpected exception during compute", e);
+            }
         }
     }
 });
```
192-199: Empty catch block may hide test failures. The empty catch block at line 197 could suppress `IOException` during refresh operations in the concurrent test, potentially masking issues.
Consider at minimum logging the exception or tracking that it occurred, especially for debugging test failures.
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)
133-136: Clarify what `size` means (and optionally validate it) since it's unused here.
Right now `size` is accepted but ignored; if it's a doc count, consider renaming the local param to `docCount` and guarding against negatives to catch caller bugs early.
Proposed tweak
```diff
-public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int size) throws IOException {
+public long addDocuments(Iterable<ParseContext.Document> docs, Term uid, int docCount) throws IOException {
+    if (docCount < 0) {
+        throw new IllegalArgumentException("docCount must be >= 0");
+    }
     return indexWriter.addDocuments(docs);
 }
```
143-154: Same comment for `softUpdateDocuments(..., int size, ...)`: rename to `docCount` (and optionally validate).
This wrapper ignores `version`/`seqNo`/`primaryTerm`/`docCount` and delegates to Lucene; that's fine, but the unused `size` param is easy to misread as "bytes" rather than "doc count".
Proposed tweak
```diff
 public void softUpdateDocuments(
     Term uid,
     Iterable<ParseContext.Document> docs,
     long version,
     long seqNo,
     long primaryTerm,
-    int size,
+    int docCount,
     Field... softDeletesField
 ) throws IOException {
+    if (docCount < 0) {
+        throw new IllegalArgumentException("docCount must be >= 0");
+    }
     indexWriter.softUpdateDocuments(uid, docs, softDeletesField);
 }
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
🚧 Files skipped from review as they are similar to previous changes (2)
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
🧰 Additional context used
🧬 Code graph analysis (3)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Mend Security Check
🔇 Additional comments (27)
server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java (2)
11-12: LGTM! Imports support the new method signatures. The added imports for `LeafReader` and `XContentBuilder` are correctly used in the new `deriveSource` method signature.
187-199: The @Override annotations are valid. The methods `canDeriveSource()` and `deriveSource(XContentBuilder, LeafReader, int)` are declared in the `Mapper` base class (lines 309 and 319), so the overrides are correct.
Consider updating the javadoc to use consistent terminology: change "Context Aware Segment" to "Context Aware Grouping" to match the class name and field type.
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
11-11: LGTM! Imports support the new test method. The added imports for `LeafReader`, `XContentFactory`, and `IOException` are correctly used in the new test method.
Also applies to: 13-13, 20-20
server/src/main/java/org/opensearch/index/mapper/MapperService.java (2)
87-87: LGTM! The import is correctly added to support the streaming operation in
getCompositeFieldTypes().
693-697: This filtering is intentional and does not introduce a breaking change. The method now correctly filters to return only `CompositeDataCubeFieldType` instances, excluding `ContextAwareGroupingFieldType`. Evidence confirms this is the intended behavior:
- `ContextAwareGroupingFieldType` is a separate "grouping criteria" field type with zero usages in the star tree or composite index modules
getCompositeFieldTypes()immediately cast the result toCompositeDataCubeFieldTypeor expect only that type- The change aligns with the PR's stated goal of "bug fixes for grouping criteria"—this separation correctly prevents grouping fields from being processed as data cube fields
No verification of callers is needed; this filtering is a fix rather than a breaking change.
Likely an incorrect or invalid review comment.
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: The retry logic shown handles version conflicts, notLookupMapLockAcquisitionException.The
isConflictException()method (line 755-757) checks only forVersionConflictEngineException. The retry logic at lines 724-730 correctly retries on version conflicts for update operations. There is noLookupMapLockAcquisitionExceptionclass in the codebase, and there is noisLookupMapLockAcquisitionException()method—the claims about their removal in the original review are incorrect. The only reference to this exception is a stale comment inIndexSettings.javadescribing a retry setting. The code itself is correct.Likely an incorrect or invalid review comment.
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (1)
55-55: LGTM: Clean API extension for document count tracking.The addition of the
sizeparameter toaddDocumentsandsoftUpdateDocumentsenables efficient document count propagation through the indexing stack without requiring callers to recompute sizes.Also applies to: 65-65
server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (1)
239-245: LGTM: Correct propagation of document counts.The changes correctly pass
docs.size()to the updatedaddDocumentsandsoftUpdateDocumentsmethods for multi-document operations, while single-document paths remain unchanged.Also applies to: 247-260
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (9)
131-144: LGTM: Well-documented pending document counter.The
childWriterPendingNumDocsfield and its comprehensive documentation clearly explain the tracking mechanism and acknowledge the acceptable temporary overshoot during refresh transitions.
352-356: LGTM: Defensive check prevents use-after-close.The
isClosed()check after acquiring the lock properly handles the race where a lookup is closed between lock acquisition and use, correctly releasing the lock before returning null.
545-563: LGTM: Correct pending document accounting during refresh.The logic properly tracks pending documents from old child writers before closing them, then adjusts the global counter after
addIndexesto prevent double-counting once documents are transferred to the parent writer.
705-742: LGTM: Robust exception handling for flushing bytes calculation.The refactored
getFlushingBytesUtilcorrectly handlesAlreadyClosedExceptionby only rethrowing when a tragic exception exists, preventing spurious failures while ensuring real corruption issues are surfaced.
744-748: LGTM: Correct aggregation of pending documents.
770-784: LGTM: Comprehensive tragic exception checking.The method correctly checks for tragic exceptions across current child writers, old child writers, and the accumulating parent writer, returning the first exception encountered.
786-833: LGTM: Consistent exception handling for RAM usage calculation.The refactoring mirrors
getFlushingBytesUtilwith the same robust exception handling pattern, maintaining consistency across resource tracking methods.
770-784: LGTM: Comprehensive tragic exception checking. The method correctly checks for tragic exceptions across current child writers, old child writers, and the accumulating parent writer, returning the first exception encountered.
494-501: The loop termination guarantees are sound as written.The while loop correctly handles map rotation during refresh. The
tryAcquire()method returns null when the lock cannot be acquired and returns the current lookup when locked (or closes it if already closed). SinceliveIndexWriterDeletesMapis volatile and rotates during refresh—creating a new current map that is not closed—the loop will eventually acquire a read lock on a stable, non-closed lookup and terminate.server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
243-257: LGTM: Convenient test configuration overload.
509-564: LGTM: Well-implemented test utility for flush behavior.The
FlushingIndexWriterFactoryprovides a clean test harness for verifying flush-related behavior by wrapping IndexWriter operations. The directory tracking and proper cleanup viaIOUtils.closeensure no resource leaks in tests.server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)
34-34: LGTM: Test calls updated for new API signatures.All test methods correctly pass
operation.docs().size()to the updatedaddDocumentsandsoftUpdateDocumentsmethods, maintaining existing test semantics.Also applies to: 76-76, 114-114, 122-130, 219-219, 227-235, 263-263, 268-276
154-204: LGTM: Good coverage of concurrent delete scenario.The new test effectively exercises the race condition where a delete operation targets an old child writer during refresh, using proper thread synchronization and safe cleanup. The test verifies correct eventual consistency after refresh completes.
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (5)
40-40: LGTM - necessary import for new test infrastructure.The
Supplierimport is appropriately added to support the newFlushingIndexWriterFactorypattern used in tragic-exception tests.
977-997: softUpdateDocuments() signature updates are correct.The calls to
softUpdateDocuments()have been properly updated to include the document size parameter, maintaining consistency with the size-aware API pattern.
537-557: Resource management is adequate.The tragic-exception tests properly use
IOUtils.closeWhileHandlingException()to ensure cleanup ofCompositeIndexWriterandFlushingIndexWriterFactoryinstances, even when exceptions occur during test execution.Also applies to: 573-608, 661-681, 697-732, 785-805, 821-856
591-607: Verify thread coordination logic in refresh tests.These tests hold a write lock while spawning a refresher thread that calls
beforeRefresh(), which likely requires the same lock. The test pattern may not achieve the intended scenario of testing "writer on old map" if the refresher thread is blocked waiting for the lock held by the main thread.Consider verifying whether
beforeRefresh()successfully transitions the writer to the old map or if it remains blocked. You might add logging or assertions to confirm the refresher thread's state:AtomicBoolean refreshStarted = new AtomicBoolean(false); Thread refresher = new Thread(() -> { latch.countDown(); refreshStarted.set(true); try { compositeIndexWriter.beforeRefresh(); } catch (Exception ignored) {} }); refresher.start(); latch.await(); Thread.sleep(50); // Give refresher time to attempt beforeRefresh // Add assertion to verify intended state was reachedAlso applies to: 627-644, 715-731, 751-768, 839-855, 928-946
309-312: LGTM - formatting improvements enhance readability.The multi-line formatting of
expectThrows()calls improves code readability while maintaining functional correctness.Also applies to: 884-887
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (1)
168-183: The soft-delete marker invariant is correctly maintained. WhendeleteDocument()is called withisStaleOperation=true, thedocparameter always contains the soft-delete field. Both production code paths (InternalEngine and IngestionEngine) explicitly calldoc.add(softDeletesField)before invokingdeleteDocument(), and all test cases follow the same pattern. Therefore,addDocument(doc)always adds a soft-deleted document, preventing any risk of accidentally creating a searchable live document.
- Fix SearchPhaseExecutionException to properly initCause ([#20320](https://github.com/opensearch-project/OpenSearch/pull/20320))
- Fix `cluster.remote.<cluster_alias>.server_name` setting no populating SNI ([#20321](https://github.com/opensearch-project/OpenSearch/pull/20321))
- Fix X-Opaque-Id header propagation (along with other response headers) for streaming Reactor Netty 4 transport ([#20371](https://github.com/opensearch-project/OpenSearch/pull/20371))
- Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
Inconsistent PR reference format in changelog entry.
The PR reference at line 34 is missing the # symbol that is consistently used in other entries in the same section. Update to match the existing format: ([#20145](...) instead of ([20145](...).
🔧 Proposed fix
- Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
+ Fix indexing regression and bug fixes for grouping criteria. ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145)) | |
```diff
- Fix indexing regression and bug fixes for grouping criteria. ([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
+ Fix indexing regression and bug fixes for grouping criteria. ([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))
```
In @CHANGELOG.md at line 34, The changelog entry "Fix indexing regression and
bug fixes for grouping criteria.
([20145](https://github.com/opensearch-project/OpenSearch/pull/20145))" uses an
inconsistent PR reference format; update that entry to include the hash symbol
so the link reads
"([#20145](https://github.com/opensearch-project/OpenSearch/pull/20145))", i.e.,
replace "([20145](" with "([#20145](" for the PR reference to match other
entries.
```java
public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
    ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
    ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
        "context_aware_grouping",
        fieldType,
        new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
    );
    LeafReader leafReader = mock(LeafReader.class);

    try {
        mapper.canDeriveSource();
        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
    } catch (Exception e) {
        fail(e.getMessage());
    }
}
```
Fix resource leak and complete XContentBuilder lifecycle.
The test has two issues:
- Resource leak: The
`XContentBuilder` created on Line 164 is never closed. Although this is a test, it's still a best practice to properly close resources.
- Incomplete builder: The builder is started with
`startObject()` but never closed with `endObject()`.
🔧 Proposed fix
```diff
 public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
     ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
     ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
         "context_aware_grouping",
         fieldType,
         new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
     );
     LeafReader leafReader = mock(LeafReader.class);
-    try {
+    try (XContentBuilder builder = XContentFactory.jsonBuilder().startObject()) {
         mapper.canDeriveSource();
-        mapper.deriveSource(XContentFactory.jsonBuilder().startObject(), leafReader, 0);
-    } catch (Exception e) {
-        fail(e.getMessage());
+        mapper.deriveSource(builder, leafReader, 0);
+        builder.endObject();
     }
 }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```java
public void testContextAwareFieldMapperWithDerivedSource() throws IOException {
    ContextAwareGroupingFieldType fieldType = new ContextAwareGroupingFieldType(Collections.emptyList(), null);
    ContextAwareGroupingFieldMapper mapper = new ContextAwareGroupingFieldMapper(
        "context_aware_grouping",
        fieldType,
        new ContextAwareGroupingFieldMapper.Builder("context_aware_grouping")
    );
    LeafReader leafReader = mock(LeafReader.class);
    try (XContentBuilder builder = XContentFactory.jsonBuilder().startObject()) {
        mapper.canDeriveSource();
        mapper.deriveSource(builder, leafReader, 0);
        builder.endObject();
    }
}
```
🤖 Prompt for AI Agents
In
@server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
around lines 153 - 168, The test method
testContextAwareFieldMapperWithDerivedSource leaks the XContentBuilder and never
finishes the JSON object; change creation of the builder from
XContentFactory.jsonBuilder().startObject() to use a try-with-resources (or
explicitly close) and call endObject() before passing it to
mapper.deriveSource(...), ensuring the XContentBuilder is closed in all paths so
the resource is not leaked when invoking mapper.canDeriveSource() and
mapper.deriveSource(...).
❌ Gradle check result for e9991cc: null. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Signed-off-by: RS146BIJAY <[email protected]>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java (1)
717-753: Clean removal of LookupMapLockAcquisitionException handling, but update stale documentation. The removal from the `onComplete` method is complete and correct. The retry logic for version conflicts (lines 724-729) and the response processing both flow properly. The `isConflictException()` method correctly handles only `VersionConflictEngineException`.
However, a stale reference remains:
`server/src/main/java/org/opensearch/index/IndexSettings.java:515` contains a comment mentioning `LookupMapLockAcquisitionException`. Update this comment to remove the reference to the now-deleted exception.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (3)
486-513: Leaked map read-lock in writer lookup path (can deadlock refresh).
`LiveIndexWriterDeletesMap.computeIndexWriterIfAbsentForCriteria(...)` acquires `this.current.mapReadLock.tryAcquire()` in a loop but never releases it on the success path (Lines 494-505). That will prevent `beforeRefresh()` from acquiring the corresponding write lock (Line 532), potentially hanging refresh forever. The same applies to `getAssociatedIndexWriterForCriteria(...)` callers like `addDocuments`/`addDocument`/`softUpdate*`, which don't close that lock.
Proposed fix (always release the read lock)
```diff
 DisposableIndexWriter computeIndexWriterIfAbsentForCriteria(
     String criteria,
     CheckedBiFunction<String, CriteriaBasedIndexWriterLookup, DisposableIndexWriter, IOException> indexWriterSupplier,
     ShardId shardId
 ) {
-    boolean success = false;
     CriteriaBasedIndexWriterLookup current = null;
     try {
         while (current == null || current.isClosed()) {
             current = this.current.mapReadLock.tryAcquire();
         }
-
-        DisposableIndexWriter writer = current.computeIndexWriterIfAbsentForCriteria(criteria, indexWriterSupplier);
-        success = true;
-        return writer;
+        return current.computeIndexWriterIfAbsentForCriteria(criteria, indexWriterSupplier);
     } finally {
-        if (success == false && current != null) {
-            assert current.mapReadLock.isHeldByCurrentThread() == true;
-            current.mapReadLock.close();
-        }
+        if (current != null) {
+            assert current.mapReadLock.isHeldByCurrentThread();
+            current.mapReadLock.close();
+        }
     }
 }
```
Also applies to: 916-932, 934-948, 950-1003
646-666: Leaked map read-lock on successful `getIndexWriterForIdFromLookup` return.
`getIndexWriterForIdFromLookup(...)` acquires `indexWriterLookup.mapReadLock` (Lines 650-651) and only releases it when returning `null`. When it returns a `DisposableIndexWriter`, the lock stays held (Lines 654-657), and subsequent code acquires the lock again and only releases once, leaving a permanent hold.
Proposed fix (always release before returning)
```diff
 DisposableIndexWriter getIndexWriterForIdFromLookup(BytesRef uid, CriteriaBasedIndexWriterLookup indexWriterLookup) {
-    boolean isCriteriaNotNull = false;
     try {
         indexWriterLookup.mapReadLock.acquire();
         String criteria = indexWriterLookup.getCriteriaForDoc(uid);
         if (criteria != null) {
             DisposableIndexWriter disposableIndexWriter = indexWriterLookup.getIndexWriterForCriteria(criteria);
             if (disposableIndexWriter != null) {
-                isCriteriaNotNull = true;
                 return disposableIndexWriter;
             }
         }
         return null;
     } finally {
-        if (isCriteriaNotNull == false) {
-            indexWriterLookup.mapReadLock.close();
-        }
+        indexWriterLookup.mapReadLock.close();
     }
 }
```
545-563: Reorder to read pending docs before closing the writer.
Calling `getPendingNumDocs()` after `close()` will throw `AlreadyClosedException` because Lucene's IndexWriter enforces the closed state. This will cause the pending-doc reconciliation to fail. Move line 553 before line 552 to capture the pending document count before closing.
Proposed fix
```diff
 for (CompositeIndexWriter.DisposableIndexWriter childDisposableWriter : markForRefreshIndexWritersMap.values()) {
     directoryToCombine.add(childDisposableWriter.getIndexWriter().getDirectory());
-    childDisposableWriter.getIndexWriter().close();
     pendingNumDocsByOldChildWriter.addAndGet(childDisposableWriter.getIndexWriter().getPendingNumDocs());
+    childDisposableWriter.getIndexWriter().close();
 }
```
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
114-165: Concurrency tests likely too weak / swallow failures.
- `testConcurrentIndexingDuringRefresh`: `run.set(false)` is called immediately after starting threads (Lines 202-204), so the loops may not execute meaningfully.
- `testConcurrentComputeIndexWriterWithMapRotation`: an empty `catch (Exception e) {}` in the compute thread hides the actual failure mode (Lines 135-138).

Consider adding a minimum-duration/iteration barrier and failing on unexpected exceptions.
Also applies to: 166-209
🤖 Fix all issues with AI agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java:
- Around line 131-145: The deleteInLucene method incorrectly increments
childWriterPendingNumDocs when it performs deletions on the parent
accumulatingIndexWriter; remove that increment (or guard it) so
childWriterPendingNumDocs is only modified for child-level writer operations.
Specifically, in deleteInLucene avoid calling
childWriterPendingNumDocs.increment (or only call it when the operation targets
a child IndexWriter rather than accumulatingIndexWriter) so getPendingNumDocs()
no longer double-counts parent operations; update any accompanying comments to
reflect that childWriterPendingNumDocs tracks only child-level modifications.
🧹 Nitpick comments (6)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
509-564: Consider thread-safety for the directories list. The `directories` list is accessed without synchronization in `createWriter` and `close`. If `createWriter` can be called from multiple threads during test execution, this could lead to race conditions.
🔒 Proposed fix to use thread-safe collection
```diff
 protected static class FlushingIndexWriterFactory extends NativeLuceneIndexWriterFactory implements Closeable {
     private final Supplier<Directory> failingWriteDirectorySupplier;
-    private final List<Directory> directories;
+    private final List<Directory> directories = Collections.synchronizedList(new ArrayList<>());
     private final AtomicBoolean useFailingDirectorySupplier;

     FlushingIndexWriterFactory(Supplier<Directory> failingWriteDirectorySupplier, AtomicBoolean useFailingDirectorySupplier) {
         this.failingWriteDirectorySupplier = failingWriteDirectorySupplier;
-        this.directories = new ArrayList<>();
         this.useFailingDirectorySupplier = useFailingDirectorySupplier;
     }
```
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)
154-204: Consider improving exception handling and removing unused variable.The new concurrent delete test has a couple of minor issues:
- Line 176: The
beforeRefreshexception is silently ignored, which could mask real failures during test execution.- Line 171: The
AtomicBoolean runvariable is created but never used in the test logic.♻️ Proposed improvements
CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock(); CountDownLatch latch = new CountDownLatch(1); - AtomicBoolean run = new AtomicBoolean(true); Thread refresher = new Thread(() -> { latch.countDown(); try { compositeIndexWriter.beforeRefresh(); - } catch (Exception ignored) {} + } catch (Exception e) { + fail("beforeRefresh failed: " + e.getMessage()); + } });Alternatively, if silently catching the exception is intentional for this test scenario, consider adding a comment explaining why.
156-162: Inconsistent resource management pattern.Unlike other tests in this class (e.g., lines 24-60), this test initializes
compositeIndexWriteroutside the try-finally block. If an exception occurs between initialization (line 156) and entering the finally block (line 191), the writer won't be properly closed.♻️ Align with the pattern used in other tests
public void testDeleteWithDocumentInOldChildWriter() throws IOException, InterruptedException { final String id = "test"; - CompositeIndexWriter compositeIndexWriter = new CompositeIndexWriter( - config(), - createWriter(), - newSoftDeletesPolicy(), - softDeletesField, - indexWriterFactory - ); - - Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA)); - try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) { - compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size()); - } - - CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock(); - CountDownLatch latch = new CountDownLatch(1); - AtomicBoolean run = new AtomicBoolean(true); - Thread refresher = new Thread(() -> { - latch.countDown(); - try { - compositeIndexWriter.beforeRefresh(); - } catch (Exception ignored) {} - }); - - refresher.start(); + CompositeIndexWriter compositeIndexWriter = null; try { - latch.await(); - compositeIndexWriter.deleteDocument( - operation.uid(), - false, - newDeleteTombstoneDoc(id), - 1, - 2, - primaryTerm.get(), - softDeletesField + compositeIndexWriter = new CompositeIndexWriter( + config(), + createWriter(), + newSoftDeletesPolicy(), + softDeletesField, + indexWriterFactory ); + + Engine.Index operation = indexForDoc(createParsedDoc(id, null, DEFAULT_CRITERIA)); + try (Releasable ignore1 = compositeIndexWriter.acquireLock(operation.uid().bytes())) { + compositeIndexWriter.addDocuments(operation.docs(), operation.uid(), operation.docs().size()); + } + + CompositeIndexWriter.CriteriaBasedIndexWriterLookup lock = compositeIndexWriter.acquireNewReadLock(); + CountDownLatch latch = new CountDownLatch(1); + Thread refresher = new Thread(() -> { + latch.countDown(); + try { + compositeIndexWriter.beforeRefresh(); + } catch (Exception ignored) {} + }); + + refresher.start(); + try { + latch.await(); + compositeIndexWriter.deleteDocument( + operation.uid(), + false, + newDeleteTombstoneDoc(id), + 1, + 2, + primaryTerm.get(), + softDeletesField + ); + } finally { + IOUtils.closeWhileHandlingException(lock.getMapReadLock()); + refresher.join(); + compositeIndexWriter.afterRefresh(true); + compositeIndexWriter.beforeRefresh(); + compositeIndexWriter.afterRefresh(true); + try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) { + assertEquals(0, directoryReader.numDocs()); + } + } } finally { - IOUtils.closeWhileHandlingException(lock.getMapReadLock()); - run.set(false); - refresher.join(); - compositeIndexWriter.afterRefresh(true); - compositeIndexWriter.beforeRefresh(); - compositeIndexWriter.afterRefresh(true); - try (DirectoryReader directoryReader = DirectoryReader.open(compositeIndexWriter.getAccumulatingIndexWriter())) { - assertEquals(0, directoryReader.numDocs()); + if (compositeIndexWriter != null) { + IOUtils.closeWhileHandlingException(compositeIndexWriter); } - - IOUtils.closeWhileHandlingException(compositeIndexWriter); } }Also applies to: 202-202
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (2)
316-375: tryAcquire(TimeValue) should mirror the “closed lookup” guard.
CriteriaBasedWriterLock.tryAcquire() now checks lookup.isClosed() and releases (lines 352-355), but tryAcquire(TimeValue) does not (lines 366-374). That's a potential inconsistency under rotation/close timing.
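A minimal sketch of mirroring that guard in the timed variant; every name here (lookupClosed, the lock field) is an illustrative stand-in, not the PR's actual member:

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch only: the real CriteriaBasedWriterLock internals differ.
class TimedLookupLockSketch {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private volatile boolean lookupClosed;

    boolean tryAcquire(long timeout, TimeUnit unit) throws InterruptedException {
        if (rwLock.readLock().tryLock(timeout, unit) == false) {
            return false; // timed out waiting for the read lock
        }
        if (lookupClosed) {
            rwLock.readLock().unlock(); // mirror the untimed variant's guard
            return false;               // caller retries against the rotated lookup
        }
        return true;
    }
}
```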
853-870: Rollback should be best-effort across writers to avoid partial leaks. As written, a single child rollback exception can prevent rolling back the remaining writers and the parent. Consider collecting/suppressing exceptions so you always attempt to roll back everything.
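A sketch of the suggested best-effort pattern, assuming a list of child writers plus the parent (field and method names are illustrative, not the PR's actual code):

```java
import java.io.IOException;
import java.util.List;
import org.apache.lucene.index.IndexWriter;

// Best-effort rollback: attempt every writer, surface one failure with the
// rest attached as suppressed exceptions.
static void rollbackAllBestEffort(List<IndexWriter> childWriters, IndexWriter parent) throws IOException {
    IOException failure = null;
    for (IndexWriter child : childWriters) {
        try {
            child.rollback(); // keep going even if one child fails
        } catch (IOException e) {
            if (failure == null) failure = e;
            else failure.addSuppressed(e);
        }
    }
    try {
        parent.rollback(); // parent is always attempted last
    } catch (IOException e) {
        if (failure == null) failure = e;
        else failure.addSuppressed(e);
    }
    if (failure != null) throw failure;
}
```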
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
404-427: Good coverage for tragic-exception/closed-writer behavior, but consider tightening assertions. A lot of the new tests intentionally ignore Error (e.g., catch (Error ignored) {}) and then assert closed/ACE behavior later; that's fine, but where possible, assert that a tragic exception was actually recorded (to reduce false positives).
Also applies to: 523-857, 911-947
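One hedged way to tighten such a test (a fragment; getTragicException() is Lucene's IndexWriter API, and its availability on the composite writer is assumed rather than confirmed by the PR):

```java
// Instead of: try { ... } catch (Error ignored) {}
Error tragedy = expectThrows(Error.class, () -> writer.addDocuments(docs, uid, docs.size()));
// Tighten: the failure must have been recorded as tragic, not merely thrown.
assertNotNull("expected a recorded tragic exception", writer.getTragicException());
// Subsequent operations should then fail with AlreadyClosedException.
expectThrows(AlreadyClosedException.class, () -> writer.addDocuments(docs, uid, docs.size()));
```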
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
- CHANGELOG.md
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/main/java/org/opensearch/action/bulk/TransportShardBulkAction.java
- server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/IngestionEngine.java
- server/src/main/java/org/opensearch/index/engine/InternalEngine.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java
- server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java
- server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java
- server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
💤 Files with no reviewable changes (5)
- server/src/main/java/org/opensearch/OpenSearchServerException.java
- server/src/test/java/org/opensearch/ExceptionSerializationTests.java
- server/src/test/java/org/opensearch/action/bulk/TransportShardBulkActionTests.java
- server/src/main/java/org/opensearch/index/engine/LookupMapLockAcquisitionException.java
- test/framework/src/main/java/org/opensearch/search/aggregations/AggregatorTestCase.java
🚧 Files skipped from review as they are similar to previous changes (4)
- CHANGELOG.md
- server/src/main/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapper.java
- server/src/main/java/org/opensearch/index/mapper/MapperService.java
- server/src/main/java/org/opensearch/index/engine/NativeLuceneIndexWriterFactory.java
🧰 Additional context used
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (1)
server/src/main/java/org/opensearch/script/ContextAwareGroupingScript.java (1)
ContextAwareGroupingScript(22-44)
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (1)
libs/common/src/main/java/org/opensearch/common/util/io/IOUtils.java (1)
IOUtils(58-317)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: gradle-check
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: Analyze (java)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Mend Security Check
🔇 Additional comments (17)
server/src/test/java/org/opensearch/index/mapper/ContextAwareGroupingFieldMapperTests.java (2)
11-11: LGTM: Imports are appropriate. The new imports support the derived source testing in the new test method.
Also applies to: 13-13, 20-20
153-168: The test correctly validates no-op override methods and does not require the suggested assertions. The canDeriveSource() method returns void (not a value to assert), and deriveSource() is an intentional no-op override that does nothing with the builder. These methods override parent class methods to prevent them from throwing UnsupportedOperationException, as documented in the javadoc comments. The test appropriately verifies that both methods can be called without throwing exceptions, which is the correct behavior for no-op overrides. The XContentBuilder closure is the only minor style point, but it's immaterial since the builder is unused by the no-op method.
Likely an incorrect or invalid review comment.
server/src/test/java/org/opensearch/index/engine/CriteriaBasedCompositeIndexWriterBaseTests.java (2)
17-17: LGTM! The new imports support the FlushingIndexWriterFactory implementation and are all appropriately used.
Also applies to: 81-81, 89-89
243-257: LGTM! The convenience overload correctly delegates to the existing config method with appropriate preset parameters.
server/src/main/java/org/opensearch/index/engine/IngestionEngine.java (2)
239-245: LGTM! The multi-document path correctly propagates the document count via the new docs.size() parameter, while the single-document path remains unchanged as expected.
247-260: LGTM! The document count is correctly passed to softUpdateDocuments before the varargs parameter. Single-document updates appropriately use the existing API without the size parameter.
server/src/main/java/org/opensearch/index/engine/DocumentIndexWriter.java (2)
55-55: LGTM! The new size parameter enables explicit document count tracking for multi-document operations, supporting improved RAM accounting and pending document management.
59-67: LGTM! The size parameter is correctly positioned before the varargs softDeletesField parameter, maintaining Java syntax requirements while enabling consistent document count tracking across update operations.
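To orient readers, a hedged sketch of the size-aware signature pair these comments describe (shapes inferred from the review text, not copied from the PR):

```java
import java.io.IOException;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexableField;
import org.apache.lucene.index.Term;

// Inferred shape of the size-aware API pair; the real interface may differ.
interface SizeAwareDocumentWriter {
    void addDocuments(Iterable<? extends Iterable<? extends IndexableField>> docs, Term uid, long size) throws IOException;

    // size sits before the varargs soft-deletes field, as Java requires.
    void softUpdateDocuments(Term uid, Iterable<? extends Iterable<? extends IndexableField>> docs, long size, Field... softDeletesField)
        throws IOException;
}
```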
server/src/main/java/org/opensearch/index/engine/LuceneIndexWriter.java (2)
133-136: Note: size parameter unused in delegation. The size parameter is accepted per the DocumentIndexWriter interface but not passed through to Lucene's underlying IndexWriter. Since standard Lucene writers handle their own document counting, this is acceptable. The parameter is utilized by other implementations like CompositeIndexWriter for custom accounting.
143-154: Note: size parameter unused in delegation. Consistent with addDocuments, the size parameter is accepted for interface compliance but not propagated to Lucene's native softUpdateDocuments. This is appropriate for this wrapper implementation.
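A minimal sketch of the pass-through behavior described here, assuming a wrapped Lucene writer field named indexWriter (illustrative, not the PR's exact code):

```java
// Accept size for interface compliance, but let Lucene's own writer do its
// accounting: the argument is deliberately ignored in this wrapper.
@Override
public void softUpdateDocuments(Term uid, Iterable<? extends Iterable<? extends IndexableField>> docs, long size, Field... softDeletesField)
    throws IOException {
    indexWriter.softUpdateDocuments(uid, docs, softDeletesField); // size intentionally unused
}
```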
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForUpdateAndDeletesTests.java (2)
16-17: LGTM! The new imports support the concurrent delete test scenario added at line 154.
34-34: LGTM! All test call sites correctly updated to use the new size-aware API signatures, consistently passing operation.docs().size() as the document count parameter.
Also applies to: 76-76, 114-114, 122-130, 219-219, 227-235, 263-263, 268-276
server/src/main/java/org/opensearch/index/engine/InternalEngine.java (3)
1240-1247: Pass-through docCount looks correct for multi-doc appends. indexWriter.addDocuments(docs, uid, docs.size()) aligns with the actual number of Lucene docs being added and matches numDocAppends.inc(docs.size()).
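For illustration, the pairing the comment describes (a fragment; names are taken from the review excerpt, and the surrounding engine code is assumed):

```java
indexWriter.addDocuments(docs, uid, docs.size()); // forward the real Lucene doc count
numDocAppends.inc(docs.size());                   // metric increments by the same amount
```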
1249-1258: Pass-through docCount looks correct for stale multi-doc appends. Same as the normal append: docs.size() is the right count to forward when all docs are added (even with the soft-deletes field injected).
1369-1390: Pass-through docCount looks correct for multi-doc soft updates. Forwarding docs.size() to softUpdateDocuments(...) is consistent with how numDocUpdates is incremented and with the intent of per-op doc accounting.
server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java (1)
704-743: Defensive AlreadyClosedException handling for RAM/flush stats is sensible. The “ignore ACE unless tragic” approach in getFlushingBytesUtil(...) and ramBytesUsedUtil(...) matches the test intent and avoids spurious failures during benign refresh/close races.
Also applies to: 769-784, 795-834
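A sketch of that pattern in isolation (the PR's actual helpers are not reproduced; this assumes a plain Lucene IndexWriter):

```java
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.AlreadyClosedException;

// "Ignore ACE unless tragic": a closed writer during a benign refresh/close
// race reports zero; a tragic close is a real failure and is rethrown.
static long ramBytesUsedSafely(IndexWriter writer) {
    try {
        return writer.ramBytesUsed();
    } catch (AlreadyClosedException e) {
        if (writer.getTragicException() != null) {
            throw e; // tragic close: surface the failure
        }
        return 0L; // benign race: report no usage
    }
}
```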
server/src/test/java/org/opensearch/index/engine/CompositeIndexWriterForAppendTests.java (1)
166-207: Test updates for the new size parameter look consistent. Most call sites pass operation.docs().size(), which is the right value for multi-doc operations, and the updated softUpdateDocuments(..., size, ...) calls match the new signature.
Also applies to: 210-270, 318-343, 358-402, 429-490, 492-520, 858-910, 965-1009
```java
/**
 * pendingNumDocs is used to track pendingNumDocs for child-level IndexWriters. Since pendingNumDocs is incremented
 * (by one) only in DocumentsWriterPerThread#reserveOneDoc for any index or update operation, we keep incrementing
 * pendingNumDocs by one for each of these operations. We increment this value whenever we call the following
 * functions on childWriter:
 * - softUpdateDocument
 * - softUpdateDocuments
 * - addDocuments
 * - addDocument
 *
 * This value may temporarily overshoot during refresh due to double counting a few documents in both the old child
 * IndexWriters and the parent, which should be OK since undershooting pendingNumDocs can be problematic.
 */
private final AtomicLong childWriterPendingNumDocs = new AtomicLong();
```
🧩 Analysis chain
🏁 Scripts executed (repository: opensearch-project/OpenSearch):

```bash
find . -name "CompositeIndexWriter.java" -type f
wc -l ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '131,145p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '740,755p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '1059,1080p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '1000,1060p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
rg 'childWriterPendingNumDocs.incrementAndGet\(\)' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java -n
sed -n '940,950p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '991,1001p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '1034,1054p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '900,950p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
sed -n '800,850p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
rg 'addDeleteEntryToWriter' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java -B 5 -A 10
```

🌐 Web query: Lucene IndexWriter getPendingNumDocs() method semantics Lucene 9 documentation

💡 Result: IndexWriter.getPendingNumDocs() returns the current total number of documents in the index, including documents that are in the process of being added (i.e., reserved but not yet fully committed/visible). The API is marked experimental and its semantics may change across releases. [1][2]

Sources:
- Lucene IndexWriter javadoc (core docs). [1]
- Lucene core API index listing (getPendingNumDocs). [2]

🏁 Script executed:

```bash
sed -n '1055,1075p' ./server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
```
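For reference, the accounting these scripts confirmed, reconstructed from the cited lines 747-748 of CompositeIndexWriter:

```java
// Composite pending docs = child-side counter + the parent writer's own
// pending count (field names as quoted in the finding below).
public long getPendingNumDocs() {
    return childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs();
}
```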
childWriterPendingNumDocs incorrectly increments for parent writer operations in deleteInLucene.
deleteInLucene(...) modifies accumulatingIndexWriter (the parent) but still increments childWriterPendingNumDocs (line 1072). Since getPendingNumDocs() returns childWriterPendingNumDocs.get() + accumulatingIndexWriter.getPendingNumDocs() (lines 747-748), the operations on the parent writer get counted twice: once in accumulatingIndexWriter.getPendingNumDocs() and again in childWriterPendingNumDocs. Per the javadoc (lines 131-145), childWriterPendingNumDocs should only track child-level operations, not parent operations. This causes inflated pending doc counts and can trigger false document limit rejections.
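To make the double count concrete (illustrative numbers, not taken from the PR): if 5 docs are pending in child writers and 3 delete tombstones go through the parent, the parent already reports those 3 via accumulatingIndexWriter.getPendingNumDocs(), so adding them to childWriterPendingNumDocs as well yields 5 + 3 + 3 = 11 pending docs instead of the true 8.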
Proposed fix
```diff
 private void deleteInLucene(
     Term uid,
     boolean isStaleOperation,
     IndexWriter currentWriter,
     Iterable<? extends IndexableField> doc,
     Field... softDeletesField
 ) throws IOException {
     if (isStaleOperation) {
         currentWriter.addDocument(doc);
     } else {
         currentWriter.softUpdateDocument(uid, doc, softDeletesField);
     }
-
-    childWriterPendingNumDocs.incrementAndGet();
 }
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @server/src/main/java/org/opensearch/index/engine/CompositeIndexWriter.java
around lines 131-145: the deleteInLucene method incorrectly increments
childWriterPendingNumDocs when it performs deletions on the parent
accumulatingIndexWriter; remove that increment (or guard it) so
childWriterPendingNumDocs is only modified for child-level writer operations.
Specifically, in deleteInLucene avoid calling
childWriterPendingNumDocs.incrementAndGet() (or only call it when the operation targets
a child IndexWriter rather than accumulatingIndexWriter) so getPendingNumDocs()
no longer double-counts parent operations; update any accompanying comments to
reflect that childWriterPendingNumDocs tracks only child-level modifications.
Description
Fixing an indexing regression and bug fixes for grouping criteria. For testing the grouping criteria changes, grouping criteria was enabled locally and tested with the criteria set. Will raise the changes for integ test enablement for CAS in a separate PR, as that requires substantial changes in the integ tests as well.
Related Issues
#19919
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Summary by CodeRabbit
- Bug Fixes
- Improvements
- Tests
- Documentation