LeafReader should not remove SubReaderWrappers incase IndexWriter encounters a non aborting Exception#20193
Conversation
📝 WalkthroughWalkthroughSelect LeafReader base when wrapping for live-doc handling to respect derived-source readers; propagate ScriptService into MapperService test factories; add context-aware scripting test helpers and new derived-source snapshot tests; update changelog. Changes
Sequence Diagram(s)sequenceDiagram
participant DirReader as DirectoryReaderWithAllLiveDocs
participant Leaf as LeafReader
participant Unwrap as (Filter)LeafReader unwrapping
participant Segment as SegmentReader
participant Live as LeafReaderWithLiveDocs
DirReader->>Leaf: iterate leaves
alt isDerivedSourceEnabled(Leaf) == true
Leaf-->>DirReader: derived-source enabled
DirReader->>Live: wrap using original Leaf as base
else
Leaf->>Unwrap: unwrap FilterLeafReader(s) if present
Unwrap-->>Segment: obtain underlying SegmentReader
DirReader->>Live: wrap using SegmentReader as base
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧹 Recent nitpick comments
📜 Recent review detailsConfiguration used: Organization UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (5)
🧰 Additional context used🧠 Learnings (3)📚 Learning: 2026-01-13T17:40:27.167ZApplied to files:
📚 Learning: 2026-01-13T17:40:34.780ZApplied to files:
📚 Learning: 2025-12-02T22:44:14.799ZApplied to files:
🧬 Code graph analysis (2)server/src/main/java/org/opensearch/common/lucene/Lucene.java (2)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (2)
🪛 LanguageToolCHANGELOG.md[grammar] ~35-~35: Use a hyphen to join words. (QB_NEW_EN_HYPHEN) ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
🔇 Additional comments (10)
✏️ Tip: You can disable this entire section by setting Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ast-grep (0.40.5)server/src/test/java/org/opensearch/index/engine/InternalEngineTests.javaComment |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (2)
8314-8318: Helper-based EngineConfig construction for derived source looks correct but can be tightenedThe refactor to
createEngineConfigWithMapperSupplierForDerivedSource(Store, boolean)and its call sites (contextAwareEnabled=falsefor the existing tests andtruefor the context-aware variant) is consistent: settings toggleINDEX_DERIVED_SOURCE_SETTINGand, when requested,INDEX_CONTEXT_AWARE_ENABLED_SETTING, and the mapping defines a storedvaluefield with an optionalcontext_aware_groupingblock.Two small follow-ups to consider:
- Keep MapperService and IndexSettings aligned:
createEngineConfigWithMapperSupplierForDerivedSourcebuilds a freshIndexSettingsinstance butcreateMapperServiceForContextAwareIndex()likely uses a default set of settings. If that factory doesn’t already honor the derived‑source and context‑aware flags, consider wiring it to the sameIndexSettings(or aSettingsinstance derived from it) to avoid subtle divergence between the mapper and engine config.- Reduce duplication in mapping construction: the
valuefield definition is duplicated between the two branches. A tiny helper that builds the base_doc.properties.valueobject and optionally attachescontext_aware_groupingwould keep the mapping definition DRY and easier to evolve.Functionally this looks fine; these are mainly maintainability and consistency tweaks.
Also applies to: 8548-8552, 8843-8922
8377-8538: Context-aware derived source snapshot test is sound; consider asserting on derived_sourcetooThe new
testNewChangesSnapshotWithDeleteAndUpdateWithDerivedSourceAndContextAwareEnabledexercises a good mix of inserts, deletes, and updates with derived source + context-aware enabled, and validates:
- all operations are present in the
newChangesSnapshotrange,- deletes truly remove documents, and
- updates are visible via stored
valuecontent.Since the underlying bug here is about keeping the derived-source wrappers around in Lucene (so
_sourceis non-null and correct), you may want to also assert on theTranslog.Indexpayload for at least some operations, e.g. comparingindexOp.source().utf8ToString()against an expected JSON representation for created/updated docs. That would more directly guard the contract thatLuceneChangesSnapshotcontinues to surface the derived source correctly in the context-aware path.If you decide to add such an assertion, you can mirror the style used in
testNewChangesSnapshotWithDerivedSourceto avoid duplication.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
server/src/main/java/org/opensearch/common/lucene/Lucene.java(1 hunks)server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java(4 hunks)test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java(3 hunks)test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java(6 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🧬 Code graph analysis (1)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (1)
server/src/main/java/org/opensearch/script/ScriptModule.java (1)
ScriptModule(54-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: detect-breaking-change
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
🔇 Additional comments (9)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (1)
934-955: Using the originalleafhere correctly preserves wrapper readers (fixing derived‑source behavior)Switching the delegate from
segmentReadertoleafwhen constructingLeafReaderWithLiveDocskeeps any existingLeafReaderwrappers (e.g., derived‑source / context‑aware readers) in the chain while still applyinghardLiveDocsand the correctednumDocscomputed from the underlyingSegmentReader. This also makes the behavior consistent with thehardLiveDocs == nullbranch, which already wrappedleaf.Assuming our wrappers are doc‑ID preserving (as they should be for
LeafReaderleaves of aDirectoryReader), this looks like the right fix and should avoid droppingSubReaderWrapperstate during recovery after non‑aborting exceptions.server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8753-8753: Locking the context-aware feature flag on these tests is appropriateAdding
@LockFeatureFlag(CONTEXT_AWARE_MIGRATION_EXPERIMENTAL_FLAG)to the composite IndexWriter failure tests matches their reliance onINDEX_CONTEXT_AWARE_ENABLED_SETTINGand ensures they won’t silently start running with the wrong behavior when the feature flag is toggled.Just double-check that the
LockFeatureFlagannotation type is correctly imported/visible to this class (I don’t see a new import in this diff), so these tests continue to compile once the feature is wired in.Also applies to: 8798-8798
test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java (2)
65-82: LGTM! Clean method overloading for ScriptService injection.The delegation pattern maintains backward compatibility while enabling ScriptService injection for context-aware index testing.
85-112: LGTM! ScriptService properly propagated to MapperService constructor.The ScriptService parameter is correctly passed through to the final MapperService instantiation.
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (5)
126-132: LGTM! Necessary imports for scripting infrastructure.These imports support the new
ContextAwareCustomScriptPluginandcreateMapperServiceForContextAwareIndex()method.
396-400: LGTM! Minimal test helper for grouping criteria documents.The method follows the existing pattern of test document helpers and provides a focused utility for context-aware index tests.
1626-1655: LGTM! Properly configured MapperService factory for context-aware index testing.The method correctly wires up the
ScriptModule,ScriptService, andMapperServicewith the context-aware index setting enabled. This enables testing scenarios that require derived source functionality.
1661-1690: LGTM! Well-structured mock script plugin for testing.The
ContextAwareCustomScriptPlugincorrectly implements theScriptPlugininterface and provides the necessary mock scripts for context-aware index testing. Using"painless"as the script language name appropriately mimics production behavior in tests.
146-166: Verify thetoListimport is used in this file.The
CollectionandHashMapimports are clearly used by the new code. However, thetoListstatic import on line 166 requires verification to confirm it is actually used somewhere in this file, as it is not visible in the provided code snippet.
854b133 to
8f33904
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
CHANGELOG.md (1)
103-103: Minor typo in changelog entry.The entry reads "incase" but should be "in case" (two words).
Apply this diff to fix the typo:
-- LeafReader should not remove SubReaderWrappers incase IndexWriter encounters a non aborting Exception ([#20193](https://github.com/opensearch-project/OpenSearch/pull/20193)) +- LeafReader should not remove SubReaderWrappers in case IndexWriter encounters a non aborting Exception ([#20193](https://github.com/opensearch-project/OpenSearch/pull/20193))server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (3)
8313-8374: Avoid double-closingStorewhen using try-with-resources plusIOUtils.closeIn these three tests you wrap
Storein try-with-resources and also pass the samestoreintoIOUtils.close(engine, store)in the innerfinally. That results in two close attempts for the sameStoreinstance.While OpenSearch
Storeis generally defensive about repeated closes, this pattern is unnecessary and can hide real close-time issues behind swallowedAlreadyClosedExceptions.Consider only closing
enginein the inner finally and letting try-with-resources handle theStore, e.g.:- try (Store store = createStore()) { + try (Store store = createStore()) { EngineConfig engineConfig = createEngineConfigWithMapperSupplierForDerivedSource(store, false); InternalEngine engine = null; try { engine = createEngine(engineConfig); // ... } finally { - IOUtils.close(engine, store); + IOUtils.close(engine); } }(and similarly for the other two tests).
Also applies to: 8385-8536, 8547-8698
8377-8538: Reduce duplication between derived-source delete/update snapshot tests
testNewChangesSnapshotWithDeleteAndUpdateWithDerivedSourceAndContextAwareEnabledandtestNewChangesSnapshotWithDeleteAndUpdateWithDerivedSourceshare almost all of their structure: randomized index/delete/update history,operationsbookkeeping, and snapshot assertions, differing mainly in thecontextAwareEnabledflag and which document factory is used.To keep these tests easier to maintain, consider extracting a shared helper like:
private void assertNewChangesSnapshotWithDeleteAndUpdateDerivedSource(boolean contextAwareEnabled) { ... }and have each test call it with
true/false. That way any future bugfix to the scenario only needs to be updated in one place.Also applies to: 8540-8700
8843-8922: Clarify mapper service choice increateEngineConfigWithMapperSupplierForDerivedSourceThis helper always uses
createMapperServiceForContextAwareIndex()regardless of thecontextAwareEnabledflag, and only toggles index settings/mapping (extracontext_aware_groupingblock +INDEX_CONTEXT_AWARE_ENABLED_SETTING) based on that flag.If
createMapperServiceForContextAwareIndex()has additional behavior or assumptions specific to context-aware indices, using it for the non-context-aware path may unintentionally change the baseline derived-source tests’ behavior compared to the old implementation.Two suggestions:
- Either guard the mapper-service choice on the flag (using the “normal” mapper service for
contextAwareEnabled == false, and the context-aware variant only whentrue), or- Add a brief comment explaining that this helper intentionally reuses the context-aware mapper service even when
contextAwareEnabledis false, so future readers don’t assume it’s a bug.This is mostly about clarity and avoiding surprises in future refactors of the mapper creation path.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.md(1 hunks)server/src/main/java/org/opensearch/common/lucene/Lucene.java(1 hunks)server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java(4 hunks)test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java(3 hunks)test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🧬 Code graph analysis (1)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (2)
test/framework/src/main/java/org/opensearch/script/MockScriptEngine.java (1)
MockScriptEngine(74-825)server/src/main/java/org/opensearch/script/ScriptModule.java (1)
ScriptModule(54-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: Analyze (java)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, macos-15)
🔇 Additional comments (6)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (1)
954-954: LGTM! Core fix correctly preserves wrapper chain.This change is the essential fix for the issue. By passing
leaf(the original reader with wrappers intact) instead ofsegmentReader(the unwrapped SegmentReader), the code now preserves SubReaderWrappers like DerivedSourceDirectoryReader. This prevents the MissingHistoryOperationsException during translog replay when derived source is enabled, as the wrapper needed to extract document source is no longer removed.test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (4)
126-132: LGTM! Imports correctly support new test infrastructure.The added imports (Plugin, ScriptPlugin, MockScriptEngine, ScriptContext, ScriptEngine, ScriptModule, ScriptService, Collection, HashMap, toList) are all used by the new test infrastructure for context-aware scripting support.
Also applies to: 146-146, 149-149, 166-166
396-400: LGTM! Test helper method follows established patterns.The
testDocumentWithGroupingCriteria()method is a straightforward test helper that creates a document with grouping criteria set, consistent with other document creation helpers in this file.
1626-1655: LGTM! Test infrastructure for context-aware indexing.The
createMapperServiceForContextAwareIndex()method provides essential test infrastructure for validating context-aware indexing scenarios with derived source. The method correctly:
- Enables context-aware settings via
INDEX_CONTEXT_AWARE_ENABLED_SETTING- Sets up a ScriptService with the custom test plugin
- Configures the MapperService appropriately for testing
This supports validation of the fix for preserving SubReaderWrappers.
1661-1690: LGTM! Test script plugin implementation is appropriate.The
ContextAwareCustomScriptPluginprovides mock script behavior for testing context-aware functionality. The implementation:
- Correctly extends Plugin and implements ScriptPlugin
- Provides appropriate mock scripts for testing ("ctx.op='delete'" and "String.valueOf(grouping_criteria)")
- Uses MockScriptEngine with the "painless" language identifier
- Follows established patterns for test script plugins
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8307-8375: Derived-source snapshot happy-path coverage looks goodThis test exercises
newChangesSnapshotover a fully derived-source index and asserts that eachTranslog.Indexoperation has a non-null, correctly-shaped source ({"value":"test"}) and that seqNos are contiguous from 0..N-1. That directly guards the regression from the linked issue and provides good safety around the unwrap/wrapper behavior.
|
❌ Gradle check result for 8f33904: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
da8dfc4 to
cb859ee
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8944-9022:createEngineConfigWithMapperSupplierForDerivedSource(...)creates a MapperService with mismatched settings.The call to
createMapperServiceForContextAwareIndex()creates an IndexMetadata with hardcoded settings (context-aware enabled, but no derived-source setting), while the EngineConfig is constructed withindexSettingsthat includes bothINDEX_DERIVED_SOURCE_SETTINGandINDEX_CONTEXT_AWARE_ENABLED_SETTING(conditionally). This mismatch means the MapperService does not reflect the full configuration intended for the engine. Consider:
- Make the MapperService creation conditional on
contextAwareEnabled(usecreateMapperServiceForContextAwareIndex()when true, otherwisecreateMapperService())- Ideally, pass the newly constructed
indexSettingsinto the mapper service factory to ensure consistencyOptionally, specify
"lang": "painless"in the script block for clarity, though Painless is the default for inline mapping scripts.
♻️ Duplicate comments (3)
CHANGELOG.md (1)
35-35: Minor typographical and grammatical corrections needed.The changelog entry has two minor issues:
- "incase" should be "in case" (two words)
- "non aborting" should be "non-aborting" (hyphenated compound adjective)
🔧 Suggested fix
-- LeafReader should not remove SubReaderWrappers incase IndexWriter encounters a non aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193)) +- LeafReader should not remove SubReaderWrappers in case IndexWriter encounters a non-aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193))server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (2)
8641-8801: Same: validate derived source from the snapshot, not viaengine.get(...).This non-context-aware test still validates “updated” docs via
engine.get(...), which can hide issues whereTranslog.Index#source()returned bynewChangesSnapshot()is null/incorrect.Sketch change (same idea as the context-aware variant)
try (Translog.Snapshot snapshot = engine.newChangesSnapshot("test", 0, operations.size() - 1, true, true)) { int count = 0; + final int updatesStartSeqNo = numDocs + deletedDocs.size(); Translog.Operation operation; while ((operation = snapshot.next()) != null) { if (operation instanceof Translog.Index) { Translog.Index indexOp = (Translog.Index) operation; String docId = indexOp.id(); - if (updatedDocs.contains(docId)) { - // Verify updated content using get - try ( - Engine.GetResult get = engine.get( - new Engine.Get(true, true, docId, newUid(docId)), - engine::acquireSearcher - ) - ) { - assertTrue("Document " + docId + " should exist", get.exists()); - StoredFields storedFields = get.docIdAndVersion().reader.storedFields(); - org.apache.lucene.document.Document document = storedFields.document(get.docIdAndVersion().docId); - assertEquals( - "Document " + docId + " should have updated value", - "updated", - document.getField("value").stringValue() - ); - } - } + assertNotNull("snapshot index op must have derived source", indexOp.source()); + final boolean isUpdateOp = indexOp.seqNo() >= updatesStartSeqNo; + if (isUpdateOp) { + assertThat(indexOp.source().utf8ToString(), containsString("\"updated\"")); + assertTrue("update op docId should be in updated set", updatedDocs.contains(docId)); + } else { + assertThat(indexOp.source().utf8ToString(), containsString("\"test\"")); + } } else if (operation instanceof Translog.Delete) {Also applies here: avoid closing the try-with-resources
storeagain infinally.
8478-8639: Snapshot test should assertTranslog.Index#source()directly (currently can miss the regression).The “updated doc” verification uses
engine.get(...)+ stored fields, which can still pass even ifnewChangesSnapshot()produced aTranslog.Indexwith missing/stale source (the bug this PR is about). This is the same concern raised previously.Suggested direction: validate per-op derived source from the snapshot
- // Test snapshot with all operations + // Test snapshot with all operations try (Translog.Snapshot snapshot = engine.newChangesSnapshot("test", 0, operations.size() - 1, true, true)) { int count = 0; + final int updatesStartSeqNo = numDocs + deletedDocs.size(); Translog.Operation operation; while ((operation = snapshot.next()) != null) { if (operation instanceof Translog.Index) { Translog.Index indexOp = (Translog.Index) operation; String docId = indexOp.id(); - if (updatedDocs.contains(docId)) { - // Verify updated content using get - try ( - Engine.GetResult get = engine.get( - new Engine.Get(true, true, docId, newUid(docId)), - engine::acquireSearcher - ) - ) { - assertTrue("Document " + docId + " should exist", get.exists()); - StoredFields storedFields = get.docIdAndVersion().reader.storedFields(); - org.apache.lucene.document.Document document = storedFields.document(get.docIdAndVersion().docId); - assertEquals( - "Document " + docId + " should have updated value", - "updated", - document.getField("value").stringValue() - ); - } - } + assertNotNull("snapshot index op must have derived source", indexOp.source()); + final boolean isUpdateOp = indexOp.seqNo() >= updatesStartSeqNo; + if (isUpdateOp) { + assertThat(indexOp.source().utf8ToString(), containsString("\"updated\"")); + assertTrue("update op docId should be in updated set", updatedDocs.contains(docId)); + } else { + assertThat(indexOp.source().utf8ToString(), containsString("\"test\"")); + } } else if (operation instanceof Translog.Delete) { String docId = ((Translog.Delete) operation).id(); assertTrue("Document " + docId + " should be in deleted set", deletedDocs.contains(docId)); @@ count++; }Also: this method has the same try-with-resources +
IOUtils.close(engine, store)double-close/shadowing pattern as the other test—consider switching toIOUtils.close(engine)only (let try-with-resources closestore).
🧹 Nitpick comments (1)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8408-8476: Make derived-source assertion less brittle + avoid double-closing the localStore.
assertEquals("{\"value\":\"test\"}", indexOp.source().utf8ToString())is fragile (ordering/whitespace/extra fields). Prefer asserting non-null and matching the field value more loosely (or parse JSON).- The
Store storein try-with-resources is also closed infinally(IOUtils.close(engine, store)), which is risky for ref-counted resources and makes failures harder to diagnose. Also, localstore/engineshadow the test class fields.Proposed tweak (within this method)
- try (Store store = createStore()) { + try (Store store = createStore()) { EngineConfig engineConfig = createEngineConfigWithMapperSupplierForDerivedSource(store, false); InternalEngine engine = null; try { engine = createEngine(engineConfig); @@ while ((operation = snapshot.next()) != null) { @@ Translog.Index indexOp = (Translog.Index) operation; - assertEquals( - "Document " + indexOp.id() + " should have updated value", - "{\"value\":\"test\"}", - indexOp.source().utf8ToString() - ); + assertNotNull("derived source must not be null", indexOp.source()); + assertThat(indexOp.source().utf8ToString(), containsString("\"value\"")); + assertThat(indexOp.source().utf8ToString(), containsString("\"test\"")); count++; } @@ } finally { - IOUtils.close(engine, store); + IOUtils.close(engine); } }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.mdserver/src/main/java/org/opensearch/common/lucene/Lucene.javaserver/src/test/java/org/opensearch/index/engine/InternalEngineTests.javatest/framework/src/main/java/org/opensearch/index/MapperTestUtils.javatest/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: In OpenSearch's CodecService, when registries are processed during construction (via EnginePlugin::getAdditionalCodecs), the defaultCodec supplier passed to CodecRegistry.getCodecs() must remain dynamic (this::defaultCodec) rather than captured upfront, because registries can replace the default codec during iteration. The contract is that registries should not invoke the supplier during getCodecs() execution since CodecService is still initializing.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.
Applied to files:
server/src/main/java/org/opensearch/common/lucene/Lucene.java
🧬 Code graph analysis (2)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (1)
server/src/main/java/org/opensearch/script/ScriptModule.java (1)
ScriptModule(54-127)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (2)
server/src/main/java/org/opensearch/common/lucene/index/DerivedSourceLeafReader.java (1)
DerivedSourceLeafReader(25-68)server/src/main/java/org/opensearch/index/codec/CriteriaBasedCodec.java (1)
CriteriaBasedCodec(26-72)
🪛 LanguageTool
CHANGELOG.md
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...pers incase IndexWriter encounters a non aborting Exception ([#20193](https://git...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: Analyze (java)
🔇 Additional comments (8)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (2)
955-959: LGTM! Conditional wrapping preserves derived source functionality.The logic correctly preserves the original
LeafReader(withDerivedSourceLeafReaderwrapper intact) when derived source is enabled, ensuring the source field can be properly reconstructed during translog replay. When derived source is not enabled, it falls back to the previous behavior using the unwrappedSegmentReader.
966-980: Recursive unwrapping logic is correct.The helper method properly traverses the reader wrapper chain to detect
DerivedSourceLeafReader. The recursive handling of bothFilterLeafReaderandFilterCodecReaderensures complete coverage of potential wrapper compositions.One consideration: the method could potentially encounter deep nesting in edge cases, but typical reader chains are shallow, so this is not a practical concern.
test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java (2)
65-83: LGTM! Clean method overloading pattern.The refactoring maintains backward compatibility by providing overloads that default
scriptServicetonull, while enabling new test scenarios to inject aScriptServicewhen needed for context-aware indexing tests.
85-112: LGTM! ScriptService properly wired into MapperService.The
ScriptServiceparameter is correctly propagated through the method chain and passed to theMapperServiceconstructor, enabling script-based functionality in tests.test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (4)
126-132: LGTM! Required imports for scripting test infrastructure.The new imports support the
ContextAwareCustomScriptPluginandcreateMapperServiceForContextAwareIndexadditions.
395-399: LGTM! Simple test utility method.Creates a minimal document with grouping criteria for context-aware indexing tests.
1625-1651: LGTM! Properly wires context-aware scripting for tests.The method correctly:
- Configures
IndexSettingswithINDEX_CONTEXT_AWARE_ENABLED_SETTINGenabled- Creates a
ScriptServiceviaScriptModulewith the custom plugin- Passes the
ScriptServicetoMapperTestUtils.newMapperServiceThis enables comprehensive testing of context-aware derived-source scenarios.
1657-1686: LGTM! Well-structured mock script plugin for tests.The
ContextAwareCustomScriptPluginproperly implementsScriptPluginand provides mock scripts for testing context-aware indexing scenarios. The use of"painless"as the script language name ensures compatibility with production code paths that expect this language.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #20193 +/- ##
============================================
+ Coverage 73.32% 73.37% +0.04%
+ Complexity 71862 71848 -14
============================================
Files 5793 5792 -1
Lines 328644 328632 -12
Branches 47313 47309 -4
============================================
+ Hits 240990 241118 +128
+ Misses 68324 68199 -125
+ Partials 19330 19315 -15 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
be52eec to
a17a9a5
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
9137-9214: PassIndexSettingsto mapper initialization to ensure it matches the engine config's settings.The
createMapperServiceForContextAwareIndex()helper is parameterless and hardcoded to create settings with onlyINDEX_CONTEXT_AWARE_ENABLED_SETTING=true, but this method buildsindexSettingswith bothINDEX_DERIVED_SOURCE_SETTING=trueandINDEX_CONTEXT_AWARE_ENABLED_SETTING=true. TheMapperServicecreated will not reflect the derived-source configuration, causing the test to run without actually exercising the intended scenario.Additionally, line 9201 initializes
CodecServicewithconfig.getIndexSettings()instead of the locally builtindexSettings, risking unintended drift.Suggested fix:
- Refactor
createMapperServiceForContextAwareIndex()to acceptIndexSettingsas a parameter, or create a new overload- Update line 9201 to use
indexSettingsinstead ofconfig.getIndexSettings()
🤖 Fix all issues with AI agents
In `@server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java`:
- Around line 8658-8683: The test’s custom FilterDirectoryReader used in
createEngineForWrapper currently returns the raw DirectoryReader from
doWrapDirectoryReader, which strips the wrapper on reopen and weakens the
wrapper-preservation test; modify the anonymous FilterDirectoryReader
implementation of doWrapDirectoryReader to re-wrap the incoming DirectoryReader
(apply the same FilterDirectoryReader wrapper logic used on initial creation,
i.e., return a new FilterDirectoryReader(in, same SubReaderWrapper) that
preserves getReaderCacheHelper behavior) so that wrappers survive reopen/refresh
and the test actually exercises wrapper retention.
♻️ Duplicate comments (2)
CHANGELOG.md (1)
35-35: Minor typographical corrections needed.The changelog entry has two minor issues:
- "incase" should be "in case" (two words)
- "non aborting" should be "non-aborting" (hyphenated compound adjective)
🔧 Suggested fix
-- LeafReader should not remove SubReaderWrappers incase IndexWriter encounters a non aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193)) +- LeafReader should not remove SubReaderWrappers in case IndexWriter encounters a non-aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193))server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8578-8639: Assert onTranslog.Index#source()(notengine.get(...)) so the test actually covers derived-source snapshot correctness.Right now, “updated” verification uses
engine.get(...)+ stored fields, which can still pass even ifnewChangesSnapshot(...)returns aTranslog.Indexwith missing/stalesource()(the regression you’re targeting). Also, docs that were later updated will have multipleTranslog.Indexops in the snapshot; usingengine.get(...)reads only the latest state and can’t validate per-op source.Proposed change (validate snapshot op source directly)
while ((operation = snapshot.next()) != null) { if (operation instanceof Translog.Index) { Translog.Index indexOp = (Translog.Index) operation; String docId = indexOp.id(); - if (updatedDocs.contains(docId)) { - // Verify updated content using get - try ( - Engine.GetResult get = engine.get( - new Engine.Get(true, true, docId, newUid(docId)), - engine::acquireSearcher - ) - ) { - assertTrue("Document " + docId + " should exist", get.exists()); - StoredFields storedFields = get.docIdAndVersion().reader.storedFields(); - org.apache.lucene.document.Document document = storedFields.document(get.docIdAndVersion().docId); - assertEquals( - "Document " + docId + " should have updated value", - "updated", - document.getField("value").stringValue() - ); - } - } + assertNotNull("snapshot index op must have derived source", indexOp.source()); + final Map<String, Object> m = + XContentHelper.convertToMap(indexOp.source(), false, MediaTypeRegistry.JSON).v2(); + + // Distinguish original index vs update by version (original: 0..numDocs-1, update: >= numDocs + numDocsToDelete) + final boolean isUpdateOp = indexOp.version() >= (long) numDocs + (long) numDocsToDelete; + assertEquals("value mismatch for docId=" + docId, isUpdateOp ? "updated" : "test", m.get("value")); } else if (operation instanceof Translog.Delete) { String docId = ((Translog.Delete) operation).id(); assertTrue("Document " + docId + " should be in deleted set", deletedDocs.contains(docId)); // Verify document is actually deleted try ( Engine.GetResult get = engine.get( new Engine.Get(true, false, docId, newUid(docId)), engine::acquireSearcher ) ) { assertFalse("Document " + docId + " should not exist", get.exists()); } } count++; }Also applies to: 8766-8827
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.mdserver/src/main/java/org/opensearch/common/lucene/Lucene.javaserver/src/test/java/org/opensearch/index/engine/InternalEngineTests.javatest/framework/src/main/java/org/opensearch/index/MapperTestUtils.javatest/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: In OpenSearch's CodecService, when registries are processed during construction (via EnginePlugin::getAdditionalCodecs), the defaultCodec supplier passed to CodecRegistry.getCodecs() must remain dynamic (this::defaultCodec) rather than captured upfront, because registries can replace the default codec during iteration. The contract is that registries should not invoke the supplier during getCodecs() execution since CodecService is still initializing.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.
Applied to files:
server/src/main/java/org/opensearch/common/lucene/Lucene.java
🧬 Code graph analysis (1)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
server/src/main/java/org/opensearch/common/lucene/index/DerivedSourceDirectoryReader.java (1)
DerivedSourceDirectoryReader(25-60)
🪛 LanguageTool
CHANGELOG.md
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...pers incase IndexWriter encounters a non aborting Exception ([#20193](https://git...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: Analyze (java)
🔇 Additional comments (11)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (3)
87-87: LGTM!Import is correctly added for
DerivedSourceLeafReaderto support the derived source detection logic.
955-959: LGTM! Core fix correctly preserves wrapper chain for derived source.This conditional correctly addresses the issue: when
DerivedSourceLeafReaderis in the wrapper chain, the originalleafis used (preserving the wrapper), otherwise thesegmentReaderis used (existing behavior). This ensures the source field can be reconstructed during translog replay.
966-984: LGTM! Helper method correctly detects derived source in wrapper chain.The method properly handles the recursive unwrapping of
FilterLeafReaderto detectDerivedSourceLeafReader. The Javadoc accurately explains whyFilterCodecReaderis not recursively handled (it cannot wrapDerivedSourceLeafReadersince the latter is anIndexReader, not aCodecReader).test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (5)
44-46: LGTM!All new imports are necessary for the new test utilities:
FilterDirectoryReader/FilterLeafReaderfor wrapper handling,OpenSearchExceptionfor error handling increateEngineForWrapper, script-related imports for context-aware indexing tests, andCheckedFunction/Collection/HashMapfor utility method signatures.Also applies to: 67-67, 76-76, 119-119, 131-137, 151-151, 154-154
400-404: LGTM!Clean helper method for creating test documents with grouping criteria.
719-753: LGTM!The
createEngineForWrappermethod properly creates an engine with customDirectoryReaderwrapping capability. This enables tests to exercise the derived source wrapper behavior. The pattern follows existingcreateEnginemethods while adding the wrapper hook viaIndexShard.wrapSearcher.
1666-1692: LGTM!The
createMapperServiceForContextAwareIndexmethod properly wires up aMapperServicewithScriptServicesupport for context-aware indexing tests. The setup correctly enables theINDEX_CONTEXT_AWARE_ENABLED_SETTINGand uses the customContextAwareCustomScriptPluginfor script execution.
1698-1727: LGTM!The
ContextAwareCustomScriptPluginproperly implementsScriptPluginwith mock scripts for testing context-aware indexing. The scripts for delete operation and grouping criteria are appropriately simple for test purposes.test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java (3)
51-51: LGTM!Import for
ScriptServiceis correctly added to support the new overloads.
65-83: LGTM!Clean refactoring that maintains backward compatibility: the original 4-parameter overload now delegates to the new 5-parameter overload with
nullforscriptService. Tests that don't need script support continue to work unchanged.
85-112: LGTM!The method signature is updated to accept
ScriptServiceand correctly passes it to theMapperServiceconstructor at line 110. This enables context-aware indexing tests that require script execution capability.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Outdated
Show resolved
Hide resolved
a17a9a5 to
290d88d
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (3)
CHANGELOG.md (1)
35-35: Minor typographical corrections needed.The entry has two minor issues:
- "incase" → "in case" (two words)
- "non aborting" → "non-aborting" (hyphenated compound adjective)
🔎 Suggested correction
-- LeafReader should not remove SubReaderWrappers incase IndexWriter encounters a non aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193)) +- LeafReader should not remove SubReaderWrappers in case IndexWriter encounters a non-aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193))server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (2)
8483-8644: Snapshot test likely misses the regression: assertTranslog.Index#source()directly (not justengine.get()).The “updated” assertions go through
engine.get(...), so the test can pass even ifengine.newChangesSnapshot(...)returned an index op with a null/stalesource()(the bug this PR is about). Consider asserting((Translog.Index) operation).source()is non-null and matches expected JSON for both the initial and updated ops (e.g., distinguish byoperation.seqNo()/operation.version()).
8658-8683: Wrapper-preservation test is weakened:doWrapDirectoryReaderreturningindrops wrappers on reopen.If the goal is to prove wrappers (including derived-source) survive refresh/reopen and aren’t stripped by unwrap logic,
doWrapDirectoryReader(...)should re-wrapin(similar toDerivedSourceDirectoryReader#doWrapDirectoryReader), otherwise the wrapper can silently disappear and the test won’t exercise the intended behavior.
🧹 Nitpick comments (1)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (1)
719-753: Consider extracting duplicate store initialization logic.The store initialization logic (lines 720-731) is duplicated from
createEngine()(lines 761-773). Consider extracting this into a private helper method to improve maintainability.Also, minor formatting: there are extra blank lines at lines 732 and 746-747 that could be removed for consistency.
♻️ Suggested refactoring approach
private void initializeStoreIfNeeded(EngineConfig config) throws IOException { final Store store = config.getStore(); final Directory directory = store.directory(); if (Lucene.indexExists(directory) == false) { store.createEmpty(config.getIndexSettings().getIndexVersionCreated().luceneVersion); final String translogUuid = Translog.createEmptyTranslog( config.getTranslogConfig().getTranslogPath(), SequenceNumbers.NO_OPS_PERFORMED, shardId, primaryTerm.get() ); store.associateIndexWithNewTranslog(translogUuid); } }Then use this helper in both
createEngineForWrapper()andcreateEngine().
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.mdserver/src/main/java/org/opensearch/common/lucene/Lucene.javaserver/src/test/java/org/opensearch/index/engine/InternalEngineTests.javatest/framework/src/main/java/org/opensearch/index/MapperTestUtils.javatest/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🚧 Files skipped from review as they are similar to previous changes (1)
- server/src/main/java/org/opensearch/common/lucene/Lucene.java
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: In OpenSearch's CodecService, when registries are processed during construction (via EnginePlugin::getAdditionalCodecs), the defaultCodec supplier passed to CodecRegistry.getCodecs() must remain dynamic (this::defaultCodec) rather than captured upfront, because registries can replace the default codec during iteration. The contract is that registries should not invoke the supplier during getCodecs() execution since CodecService is still initializing.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🧬 Code graph analysis (2)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (2)
server/src/main/java/org/opensearch/common/lucene/index/DerivedSourceDirectoryReader.java (1)
DerivedSourceDirectoryReader(25-60)server/src/main/java/org/opensearch/index/engine/Engine.java (1)
Operation(1502-1602)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (3)
test/framework/src/main/java/org/opensearch/script/MockScriptEngine.java (1)
MockScriptEngine(74-825)server/src/main/java/org/opensearch/script/ScriptModule.java (1)
ScriptModule(54-127)test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java (1)
MapperTestUtils(63-161)
🪛 LanguageTool
CHANGELOG.md
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...pers incase IndexWriter encounters a non aborting Exception ([#20193](https://git...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: detect-breaking-change
🔇 Additional comments (4)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (3)
400-404: LGTM!Simple and clear test utility method that follows the existing pattern of document creation helpers.
1666-1692: LGTM!The method properly configures a context-aware MapperService with ScriptService wiring, following the existing pattern of
createMapperService(). The use ofContextAwareCustomScriptPluginand the newMapperTestUtils.newMapperServiceoverload is appropriate.
1698-1727: Verify the intentional use of "painless" as the script language name.The
NAMEconstant is set to"painless", which is the same identifier as the actual Painless scripting engine. This appears intentional for mimicking Painless behavior in tests withMockScriptEngine, but please confirm this won't cause conflicts if tests are run with the actual Painless plugin loaded.test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java (1)
65-111: LGTM!Clean, backward-compatible refactoring that adds
ScriptServiceinjection capability while preserving existing API contracts:
- 4-arg version delegates to 5-arg with
null(preserves backward compatibility)- 5-arg version provides the new entry point for
ScriptServiceinjection- 6-arg version properly wires the
scriptServicetoMapperServiceconstructor
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Show resolved
Hide resolved
|
❌ Gradle check result for 290d88d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
290d88d to
c13f01f
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8834-8994: Same snapshot assertion gap exists in the non-context-aware derived-source test.
For the bug this PR targets, validatingTranslog.Index#source()is the key signal; validating viaengine.get()can mask the regression.
♻️ Duplicate comments (4)
CHANGELOG.md (1)
35-35: Minor typographical corrections needed.The changelog entry has minor issues: "incase" should be "in case" (two words), and "non aborting" should be "non-aborting" (hyphenated compound adjective).
🔧 Suggested fix
-- LeafReader should not remove SubReaderWrappers incase IndexWriter encounters a non aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193)) +- LeafReader should not remove SubReaderWrappers in case IndexWriter encounters a non-aborting Exception ([`#20193`](https://github.com/opensearch-project/OpenSearch/pull/20193))server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (3)
9137-9215: EngineConfig factory: avoid unconditional context-aware MapperService + ensure config consistency.
createMapperServiceForContextAwareIndex()is used even whencontextAwareEnabledis false.CodecService(and potentiallyTranslogConfig) are built fromconfig.getIndexSettings()while the EngineConfig is built with a differentindexSettingsinstance—this can make the test setup internally inconsistent/brittle.Proposed change (conditional mapper service + align CodecService with the new IndexSettings)
- if (contextAwareEnabled == true) { + if (contextAwareEnabled) { settingBuilder.put(IndexSettings.INDEX_CONTEXT_AWARE_ENABLED_SETTING.getKey(), true); mapping = XContentFactory.jsonBuilder() .startObject() .startObject("_doc") .startObject("properties") .startObject("value") .field("type", "text") .field("store", true) .endObject() .endObject() .startObject("context_aware_grouping") .array("fields", "value") .startObject("script") .field("source", "String.valueOf(grouping_criteria)") .endObject() .endObject() .endObject() .endObject(); } else { mapping = XContentFactory.jsonBuilder() .startObject() .startObject("_doc") .startObject("properties") .startObject("value") .field("type", "text") .field("store", true) .endObject() .endObject() .endObject() .endObject(); } @@ - final MapperService mapperService = createMapperServiceForContextAwareIndex(); + final MapperService mapperService = contextAwareEnabled ? createMapperServiceForContextAwareIndex() : createMapperService(); mapperService.merge("_doc", new CompressedXContent(mapping.toString()), MapperService.MergeReason.MAPPING_UPDATE); @@ - .codecService(new CodecService(null, config.getIndexSettings(), logger)) + .codecService(new CodecService(null, indexSettings, logger))
8480-8641: Snapshot assertions should validateTranslog.Index#source()directly (notengine.get(...)).
Right now the “updated” verification can still pass even ifnewChangesSnapshot(...)returns index ops with missing/stalesource().Proposed change (assert snapshot content rather than engine state)
while ((operation = snapshot.next()) != null) { if (operation instanceof Translog.Index) { Translog.Index indexOp = (Translog.Index) operation; String docId = indexOp.id(); + assertNotNull("snapshot index op must have source (derived)", indexOp.source()); if (updatedDocs.contains(docId)) { - // Verify updated content using get - try ( - Engine.GetResult get = engine.get( - new Engine.Get(true, true, docId, newUid(docId)), - engine::acquireSearcher - ) - ) { - assertTrue("Document " + docId + " should exist", get.exists()); - StoredFields storedFields = get.docIdAndVersion().reader.storedFields(); - org.apache.lucene.document.Document document = storedFields.document(get.docIdAndVersion().docId); - assertEquals( - "Document " + docId + " should have updated value", - "updated", - document.getField("value").stringValue() - ); - } + final Map<String, Object> m = + XContentHelper.convertToMap(indexOp.source(), false, MediaTypeRegistry.JSON).v2(); + assertEquals("updated", m.get("value")); } } else if (operation instanceof Translog.Delete) { String docId = ((Translog.Delete) operation).id(); assertTrue("Document " + docId + " should be in deleted set", deletedDocs.contains(docId));
8643-8832:FilterDirectoryReader#doWrapDirectoryReaderreturninginlikely drops the wrapper on reopen (weakening the test).
If the goal is to ensure SubReaderWrappers survive and aren’t stripped by an unwrap, the wrapper needs to be re-applied on reopen; alsogetReaderCacheHelper()should typically delegate to the currentin, not the originally-capturedreader.Proposed change (re-wrap on reopen + delegate cache helper correctly)
- engine = createEngineForWrapper( - engineConfig, - reader -> new FilterDirectoryReader(reader, new FilterDirectoryReader.SubReaderWrapper() { + final FilterDirectoryReader.SubReaderWrapper subReaderWrapper = new FilterDirectoryReader.SubReaderWrapper() { `@Override` public LeafReader wrap(LeafReader reader) { return new FilterLeafReader(reader) { `@Override` public CacheHelper getCoreCacheHelper() { return in.getCoreCacheHelper(); } `@Override` public CacheHelper getReaderCacheHelper() { return in.getReaderCacheHelper(); } }; } - }) { + }; + + engine = createEngineForWrapper( + engineConfig, + reader -> new FilterDirectoryReader(reader, subReaderWrapper) { `@Override` protected DirectoryReader doWrapDirectoryReader(DirectoryReader in) throws IOException { - return in; + return new FilterDirectoryReader(in, subReaderWrapper) { + `@Override` + protected DirectoryReader doWrapDirectoryReader(DirectoryReader in) throws IOException { + return this; + } + + `@Override` + public CacheHelper getReaderCacheHelper() { + return in.getReaderCacheHelper(); + } + }; } `@Override` public CacheHelper getReaderCacheHelper() { - return reader.getReaderCacheHelper(); + return in.getReaderCacheHelper(); } } );
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.mdserver/src/main/java/org/opensearch/common/lucene/Lucene.javaserver/src/test/java/org/opensearch/index/engine/InternalEngineTests.javatest/framework/src/main/java/org/opensearch/index/MapperTestUtils.javatest/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🚧 Files skipped from review as they are similar to previous changes (1)
- test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.
Applied to files:
server/src/main/java/org/opensearch/common/lucene/Lucene.java
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: In OpenSearch's CodecService, when registries are processed during construction (via EnginePlugin::getAdditionalCodecs), the defaultCodec supplier passed to CodecRegistry.getCodecs() must remain dynamic (this::defaultCodec) rather than captured upfront, because registries can replace the default codec during iteration. The contract is that registries should not invoke the supplier during getCodecs() execution since CodecService is still initializing.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🧬 Code graph analysis (2)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (2)
server/src/main/java/org/opensearch/common/lucene/index/DerivedSourceLeafReader.java (1)
DerivedSourceLeafReader(25-68)server/src/main/java/org/opensearch/index/codec/CriteriaBasedCodec.java (1)
CriteriaBasedCodec(26-72)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (1)
test/framework/src/main/java/org/opensearch/script/MockScriptEngine.java (1)
MockScriptEngine(74-825)
🪛 LanguageTool
CHANGELOG.md
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...pers incase IndexWriter encounters a non aborting Exception ([#20193](https://git...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: Analyze (java)
🔇 Additional comments (7)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (2)
955-959: LGTM - Conditional wrapping preserves DerivedSourceLeafReader when needed.The change correctly preserves the original leaf reader (with its
DerivedSourceLeafReaderwrapper intact) when derived source is enabled, while maintaining the previous behavior of usingsegmentReaderotherwise. This ensures the source field can be properly reconstructed during translog replay for indices with Derived Source enabled.
966-984: Well-structured recursive detection of DerivedSourceLeafReader.The method correctly traverses the reader chain to detect whether
DerivedSourceLeafReaderis present. The Javadoc appropriately explains whyFilterCodecReaderdoesn't need special handling (it cannot wrapDerivedSourceLeafReadersince that's anIndexReader, not aCodecReader).The recursive approach via
FilterLeafReader.getDelegate()correctly handles arbitrarily nested wrappers.test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (4)
398-402: LGTM - Simple test utility method.The method correctly creates a document with grouping criteria set for context-aware indexing tests.
717-752: LGTM - Test utility for creating engine with custom searcher wrapper.The method correctly creates an
InternalEnginethat wraps searchers with a customDirectoryReaderwrapper. This enables testing of derived source behavior during translog replay. The exception handling appropriately wrapsIOExceptioninOpenSearchExceptionfor the test context.Minor: There's an extra blank line at line 733 before the closing brace.
1665-1691: LGTM - Proper test setup for context-aware indexing with script support.The method correctly configures a
MapperServicefor context-aware index testing by:
- Enabling
INDEX_CONTEXT_AWARE_ENABLED_SETTING- Creating a
ScriptServicewith the customContextAwareCustomScriptPlugin- Using the
MapperTestUtils.newMapperServiceoverload that accepts aScriptService
1697-1726: LGTM - Well-structured test script plugin for context-aware testing.The
ContextAwareCustomScriptPluginfollows the established pattern for test script plugins in OpenSearch:
- Uses
MockScriptEnginewith the "painless" language name for compatibility with production scripts- Provides necessary scripts for context-aware grouping tests (
ctx.op='delete'andString.valueOf(grouping_criteria))- Properly implements empty defaults for
nonDeterministicPluginScripts()andpluginContextCompilers()server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8416-8477: Good: derived-source snapshot test now uses the derived-source-aware EngineConfig factory.
This keeps the test wiring aligned with the newer context-aware toggle.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
|
❌ Gradle check result for c13f01f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
c13f01f to
50d61ec
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (1)
8416-8476: Avoid double-closingstoreinside atry (Store store = ...)block.The inner
finally { IOUtils.close(engine, store); }will closestoreearly, and then the outer try-with-resources will close it again. Prefer only closingenginein the innerfinally.Proposed fix
- } finally { - IOUtils.close(engine, store); - } + } finally { + IOUtils.close(engine); + }Also, asserting
indexOp.source().utf8ToString()equals a hardcoded JSON string is a bit brittle; consider parsing and comparing the map (or at least asserting it contains"value":"test").
🤖 Fix all issues with AI agents
In `@server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java`:
- Around line 8488-8640: The test double-closes Store by using try (Store store
= createStore()) and then calling IOUtils.close(engine, store) in the finally
block; remove Store from explicit close there and only close engine (or let
try-with-resources close store) so ownership is not conflicted—specifically,
update the finally block that calls IOUtils.close(engine, store) to call
IOUtils.close(engine) (or IOUtils.close(engine) while leaving the
try-with-resources for Store intact) and ensure engine is null-checked before
closing.
- Around line 8651-8831: The test double-closes the Store by using
try-with-resources for Store (Store store = createStore()) and then passing both
engine and store to IOUtils.close(engine, store) in the finally; change the
finally to only close the engine (IOUtils.close(engine)) so the
try-with-resources still closes store automatically, leaving the engine variable
closed explicitly (reference symbols: Store store = createStore(), engine,
IOUtils.close(engine, store)).
♻️ Duplicate comments (5)
CHANGELOG.md (1)
35-35: Typographical corrections needed.This has already been flagged: "incase" should be "in case" and "non aborting" should be "non-aborting" (hyphenated compound adjective).
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (4)
8841-8992: Same snapshot-assertion gap and store double-close pattern repeats here.
This test also verifies “updated” docs viaengine.get(...)instead of snapshot opsource(), and closesstoretwice.
8480-8641: Snapshot assertions don’t validateTranslog.Index#source()(can miss the regression).
The “updated” verification usesengine.get(...)+ stored fields, which can still pass even ifengine.newChangesSnapshot(...)returned aTranslog.Indexwithnull/wrongsource().Action: assert
((Translog.Index) operation).source()is non-null and matches"test"vs"updated"based on op identity (e.g., seqNo/version).
8651-8683:doWrapDirectoryReaderreturninginlikely drops the wrapper on reopen (weakens wrapper-preservation test).
If the intent is to prove wrappers survive refresh/reopen and aren’t removed by unwrapping, the wrapper should be re-applied indoWrapDirectoryReader.
9137-9186: Use the non-context-awareMapperServicewhencontextAwareEnabled == false.
Right now it always usescreateMapperServiceForContextAwareIndex(), which can make the “contextAware disabled” test setup diverge from production wiring.Proposed fix
- final MapperService mapperService = createMapperServiceForContextAwareIndex(); + final MapperService mapperService = contextAwareEnabled ? createMapperServiceForContextAwareIndex() : createMapperService();
🧹 Nitpick comments (1)
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (1)
717-752: Engine factory with custom searcher wrapper enables derived source testing.The implementation correctly wraps the searcher using
IndexShard.wrapSearcher, which is the same mechanism used in production code. TheIOExceptiontoOpenSearchExceptionconversion is appropriate sinceacquireSearcherdeclaresEngineException.Minor style observation: there's an extra blank line at line 733.
🧹 Optional: Remove extra blank line
store.associateIndexWithNewTranslog(translogUuid); - }
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
CHANGELOG.mdserver/src/main/java/org/opensearch/common/lucene/Lucene.javaserver/src/test/java/org/opensearch/index/engine/InternalEngineTests.javatest/framework/src/main/java/org/opensearch/index/MapperTestUtils.javatest/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
🚧 Files skipped from review as they are similar to previous changes (1)
- test/framework/src/main/java/org/opensearch/index/MapperTestUtils.java
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-01-13T17:40:34.780Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:34.780Z
Learning: In OpenSearch's CodecService, when registries are processed during construction (via EnginePlugin::getAdditionalCodecs), the defaultCodec supplier passed to CodecRegistry.getCodecs() must remain dynamic (this::defaultCodec) rather than captured upfront, because registries can replace the default codec during iteration. The contract is that registries should not invoke the supplier during getCodecs() execution since CodecService is still initializing.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2025-12-02T22:44:14.799Z
Learnt from: prudhvigodithi
Repo: opensearch-project/OpenSearch PR: 20112
File: server/src/internalClusterTest/java/org/opensearch/search/slice/SearchSliceIT.java:73-81
Timestamp: 2025-12-02T22:44:14.799Z
Learning: In OpenSearch integration tests extending OpenSearchIntegTestCase, using `LuceneTestCase.SuppressCodecs("*")` triggers special handling that selects a random production codec from the CODECS array, while `SuppressCodecs("Asserting")` or other specific codec suppressions still allow Lucene's default codec randomization which may include the asserting codec. Use `SuppressCodecs("*")` when you need to completely avoid asserting codecs (e.g., for cross-thread StoredFieldsReader usage) while maintaining production codec test coverage.
Applied to files:
test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java
📚 Learning: 2026-01-13T17:40:27.167Z
Learnt from: reta
Repo: opensearch-project/OpenSearch PR: 20411
File: server/src/main/java/org/opensearch/index/codec/CodecService.java:112-133
Timestamp: 2026-01-13T17:40:27.167Z
Learning: Avoid capturing or evaluating a supplier (e.g., this::defaultCodec) upfront when passing it to a registry during object construction. If registries may replace defaults during iteration (as in EnginePlugin.getAdditionalCodecs), pass the supplier itself and only resolve it at use time. This ensures dynamic behavior is preserved during initialization and prevents premature binding of defaults in codecs/registry setup. This pattern should apply to similar initialization paths in Java server code where registries may mutate defaults during construction.
Applied to files:
server/src/main/java/org/opensearch/common/lucene/Lucene.java
🧬 Code graph analysis (1)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (2)
server/src/main/java/org/opensearch/common/lucene/index/DerivedSourceLeafReader.java (1)
DerivedSourceLeafReader(25-68)server/src/main/java/org/opensearch/index/codec/CriteriaBasedCodec.java (1)
CriteriaBasedCodec(26-72)
🪛 LanguageTool
CHANGELOG.md
[grammar] ~35-~35: Use a hyphen to join words.
Context: ...pers incase IndexWriter encounters a non aborting Exception ([#20193](https://git...
(QB_NEW_EN_HYPHEN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
- GitHub Check: gradle-check
- GitHub Check: assemble (21, windows-latest)
- GitHub Check: precommit (21, windows-latest)
- GitHub Check: detect-breaking-change
- GitHub Check: precommit (21, windows-2025, true)
- GitHub Check: assemble (25, ubuntu-24.04-arm)
- GitHub Check: precommit (25, ubuntu-latest)
- GitHub Check: precommit (25, macos-15)
- GitHub Check: assemble (21, ubuntu-24.04-arm)
- GitHub Check: assemble (25, ubuntu-latest)
- GitHub Check: precommit (21, macos-15)
- GitHub Check: precommit (21, ubuntu-latest)
- GitHub Check: precommit (21, macos-15-intel)
- GitHub Check: assemble (25, windows-latest)
- GitHub Check: precommit (25, macos-15-intel)
- GitHub Check: precommit (25, ubuntu-24.04-arm)
- GitHub Check: assemble (21, ubuntu-latest)
- GitHub Check: precommit (21, ubuntu-24.04-arm)
- GitHub Check: precommit (25, windows-latest)
- GitHub Check: Analyze (java)
🔇 Additional comments (9)
server/src/main/java/org/opensearch/common/lucene/Lucene.java (3)
87-87: Import addition looks appropriate.The import is needed for the
instanceofcheck in the newisDerivedSourceEnabledhelper method.
955-959: Conditional wrapping correctly preserves the DerivedSourceLeafReader wrapper.When derived source is enabled, using
leaf(the original reader) instead ofsegmentReaderpreserves the wrapper chain. This ensures thatDerivedSourceLeafReader.storedFields()remains accessible during translog replay, fixing the null source field issue.
965-984: Recursive wrapper detection logic is sound.The method correctly traverses the
FilterLeafReaderchain to detectDerivedSourceLeafReader. The Javadoc accurately explains thatFilterCodecReadercannot wrapDerivedSourceLeafReadersince it only acceptsCodecReaderdelegates, whileDerivedSourceLeafReaderextendsSequentialStoredFieldsLeafReader(aFilterLeafReader).The default
return falseat line 983 is a safe fallback for unexpected reader types.test/framework/src/main/java/org/opensearch/index/engine/EngineTestCase.java (4)
65-65: New imports support the test infrastructure additions.The imports for
OpenSearchException,CheckedFunction,IndexShard, script-related classes, and collection types are all utilized by the new test helper methods.
398-402: Simple test helper for documents with grouping criteria.Follows the pattern of existing
testDocument*helper methods.
1665-1691: MapperService factory for context-aware index testing.The method correctly wires up the
ScriptServicewithContextAwareCustomScriptPluginand configures the index settings withINDEX_CONTEXT_AWARE_ENABLED_SETTING. This enables testing of derived source scenarios that require script evaluation.
1697-1726: Context-aware script plugin provides test scripts for derived source scenarios.The plugin correctly implements
ScriptPluginwith aMockScriptEnginebackend. The two registered scripts (ctx.op='delete'andString.valueOf(grouping_criteria)) cover the test cases needed for context-aware indexing functionality.server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java (2)
49-52: New Lucene filter imports look appropriate for the wrapper-preservation tests.
9190-9214: This concern is unfounded. While the code does mix settings objects, CodecService does not use theIndexSettingsparameter whenmapperServiceis null. Since the test passesnullas the mapper service,CodecServiceonly instantiates basicLucene103Codecinstances that have no dependency onIndexSettings. The choice betweenconfig.getIndexSettings()and the newindexSettingshas zero functional impact in this context.Likely an incorrect or invalid review comment.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Show resolved
Hide resolved
server/src/test/java/org/opensearch/index/engine/InternalEngineTests.java
Show resolved
Hide resolved
|
❌ Gradle check result for 50d61ec: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
9809a69 to
d3c0f7e
Compare
|
❌ Gradle check result for d3c0f7e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
…ounters a non aborting Exception Signed-off-by: RS146BIJAY <rishavsagar4b1@gmail.com>
d3c0f7e to
4f9ec4c
Compare
Description
In OpenSearch, during recovery flow, documents are queried using a DirectoryReader to be replayed as operations during translog replay. Now incase OpenSearch IndexWriter encounters any non aborted exception, underlying segments can contain both hard and soft deleted documents. In this case, OpenSearch unwraps a LeafReader and uses it for querying documents which are replayed as operation for Translog replay.
Since this unwrapping process will remove all wrappers from LeafReader, it becomes an issue for features like DerivedSource where we rely on a DerivedSourceDirectoryReader wrapper over LeafReader to extract source for a document. For Derived Source enabled index, with wrapper removed, a MissingHistoryOperationsException will be thrown during Translog replay as source field will be null for the documents, failing recovery.
Related Issues
#19851