Add support for steps to change the target index name for later steps#142955
Conversation
|
Pinging @elastic/es-storage-engine (Team:StorageEngine) |
There was a problem hiding this comment.
Pull request overview
This PR introduces a mechanism for DLM steps to declare possible output index name patterns so that subsequent steps can target the index actually produced by earlier steps (e.g., clone/no-clone outcomes), along with validation when registering actions and accompanying tests.
Changes:
- Add
DlmStep#possibleOutputIndexNamePatterns()with a default identity implementation. - Update
DataStreamLifecycleServiceto resolve the target index for step execution based on prior-step output patterns. - Add plugin-time validation (
DataStreamsPlugin.verifyActions) and new unit tests for index resolution + validation behavior.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/transitions/DlmStep.java |
Adds the new default API for declaring possible output index names. |
modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java |
Resolves an execution index using prior-step patterns and adjusts step selection logic to work with an index that may change. |
modules/data-streams/src/main/java/org/elasticsearch/datastreams/DataStreamsPlugin.java |
Adds startup-time validation for registered DLM actions/steps. |
modules/data-streams/src/test/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleServiceTests.java |
Adds tests for index output resolution and the default pattern behavior. |
modules/data-streams/src/test/java/org/elasticsearch/datastreams/DataStreamsPluginTests.java |
Adds tests for action/step validation, including failure cases. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
Outdated
Show resolved
Hide resolved
...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
Show resolved
Hide resolved
...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
Outdated
Show resolved
Hide resolved
.../data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/transitions/DlmStep.java
Outdated
Show resolved
Hide resolved
…ms/lifecycle/transitions/DlmStep.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
dakrone
left a comment
There was a problem hiding this comment.
I left a few minor comments. I think the interface is overly complicated though, we could have a method that takes a String and returns a List<String>, because the calculation should be deterministic if I understand it correctly.
...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
Show resolved
Hide resolved
| */ | ||
| default List<Function<String, String>> possibleOutputIndexNamePatterns() { | ||
| return List.of(Function.identity()); | ||
| } |
There was a problem hiding this comment.
I think a list of functions makes it seem like this is going to be expensive and adds complexity. Do we have to make this use functions? Could it be List<String> possibleOutputIndexNamePatterns(String indexName) instead?
There was a problem hiding this comment.
I have switched this to use a list of String. I originally used a function list as it makes the validation less messy in the plugin. Now you have to pass a dummy value to check the returned list if not empty.
What do you think? Worth the mess in the plugin for a faster run / simpler interface overall?
There was a problem hiding this comment.
I think it's worth it to have a cleaner interface and avoid the functions.
...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
Outdated
Show resolved
Hide resolved
…-next-step-name-awareness
dakrone
left a comment
There was a problem hiding this comment.
LGTM, I left a comment about documentation and an optional one about using Streams.
.../data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/transitions/DlmStep.java
Outdated
Show resolved
Hide resolved
| */ | ||
| default List<Function<String, String>> possibleOutputIndexNamePatterns() { | ||
| return List.of(Function.identity()); | ||
| } |
There was a problem hiding this comment.
I think it's worth it to have a cleaner interface and avoid the functions.
...treams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java
Outdated
Show resolved
Hide resolved
…cations * upstream/main: (60 commits) Use batches for other bulk vector benchmarks (elastic#143167) Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143388 Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testBackToBackQueuedDeletes elastic#143387 [Inference API] Parse endpoint metadata from persisted endpoints (elastic#143081) Add cluster formation doc to DistributedArchitectureGuide (elastic#143318) Fix flattened root block loader null expectation (elastic#143238) Unmute ValueSourceReaderTypeConversionTests testLoadAll (elastic#143189) ESQL: Add split coalescing for many small files (elastic#143335) Unmute mixed-cluster spatial parse warning test (elastic#143186) Fix zero-size estimate in BytesRefBlock null test (elastic#143258) Make DataType and DataFormat top-level enums (elastic#143312) Add support for steps to change the target index name for later steps (elastic#142955) Set mayContainDuplicates flag to test deduplication (elastic#143375) ESQL: Fix Driver search load millis as nanos bug (elastic#143267) Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.LookupJoinWithMixPushableAndUnpushableFilters} elastic#143378 ESQL: Forbid MV_EXPAND before full text functions (elastic#143249) ESQL: Fix unresolved name pattern (elastic#143210) Implement boxplot queryDSL aggregation for exponential_histograms (elastic#143026) Add prefetching to x64 bulk vector implementations (elastic#142387) Make large segment vector tests resilient to memory constraints (elastic#143366) ...
…elastic#142955) * Add support for steps to change the target index name for later steps * Fix mutated index name when finding first incomplete step * Update modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/transitions/DlmStep.java Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * PR Changes * Additional PR changes --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This PR adds a few functions: