Add support for steps to change the target index name for later steps #142955

Merged
lukewhiting merged 7 commits into elastic:main from lukewhiting:est-2361-next-step-name-awareness
Mar 2, 2026

@lukewhiting lukewhiting commented Feb 24, 2026

This PR adds a few functions:

  • The ability for a step to specify a list of index name patterns that indicate which index name later steps should use instead of the original backing index name
    • For example, a clone step might output "my-index-clone" or "my-index", depending on whether the clone is needed and actually happens
    • In this case, the next step after the clone would look to see which index was actually created (if any) and target that
  • It adds a default implementation that just assumes the output is the original backing index name
  • It adds validation to ensure that when actions are registered in the plugin, no steps give an empty list for the possible index patterns
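
As a rough illustration of the mechanism described in the list above, here is a minimal Java sketch. The names `TargetIndexSketch`, `Step`, `CloneStep`, `possibleOutputIndexNames` and `resolve` are hypothetical and are not taken from this PR's code. The sketch also assumes that candidates are listed in order of preference and that a set of existing index names is available to check against.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

class TargetIndexSketch {
    // Hypothetical stand-in for a DLM step; the real interface has other methods.
    interface Step {
        // Candidate names this step may leave behind for later steps.
        // Assumption: the default means "the step does not change the index name".
        default List<String> possibleOutputIndexNames(String indexName) {
            return List.of(indexName);
        }
    }

    // Hypothetical clone step: the clone may or may not have been created.
    static final class CloneStep implements Step {
        @Override
        public List<String> possibleOutputIndexNames(String indexName) {
            return List.of(indexName + "-clone", indexName);
        }
    }

    // Returns the first candidate from the previous step that actually exists, if any.
    static Optional<String> resolve(Step previous, String backingIndex, Set<String> existingIndices) {
        return previous.possibleOutputIndexNames(backingIndex).stream()
            .filter(existingIndices::contains)
            .findFirst();
    }

    public static void main(String[] args) {
        Step clone = new CloneStep();
        // Clone happened: prints Optional[my-index-clone]
        System.out.println(resolve(clone, "my-index", Set.of("my-index", "my-index-clone")));
        // Clone was skipped: prints Optional[my-index]
        System.out.println(resolve(clone, "my-index", Set.of("my-index")));
    }
}
```

Returning an `Optional` here is a design choice of the sketch: it leaves the "no candidate exists" case to the caller rather than guessing a fallback.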

@elasticsearchmachine

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Copilot AI left a comment

Pull request overview

This PR introduces a mechanism for DLM steps to declare possible output index name patterns so that subsequent steps can target the index actually produced by earlier steps (e.g., clone/no-clone outcomes), along with validation when registering actions and accompanying tests.

Changes:

  • Add DlmStep#possibleOutputIndexNamePatterns() with a default identity implementation.
  • Update DataStreamLifecycleService to resolve the target index for step execution based on prior-step output patterns.
  • Add plugin-time validation (DataStreamsPlugin.verifyActions) and new unit tests for index resolution + validation behavior.
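
The registration-time check in the last bullet could look roughly like the following. This is a sketch under assumptions, not the code in `DataStreamsPlugin.verifyActions`: the `Step` interface, the map of action names to steps, the `name()` method, and the choice of `IllegalStateException` are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class ActionValidationSketch {
    // Hypothetical step shape, mirroring the no-argument method named in the overview.
    interface Step {
        String name();

        default List<Function<String, String>> possibleOutputIndexNamePatterns() {
            return List.of(Function.identity());
        }
    }

    // Rejects any registered action containing a step that declares no output patterns.
    static void verifyActions(Map<String, List<Step>> actions) {
        actions.forEach((actionName, steps) -> {
            for (Step step : steps) {
                List<Function<String, String>> patterns = step.possibleOutputIndexNamePatterns();
                if (patterns == null || patterns.isEmpty()) {
                    throw new IllegalStateException(
                        "step [" + step.name() + "] in action [" + actionName + "] declares no output index name patterns"
                    );
                }
            }
        });
    }

    public static void main(String[] args) {
        Step ok = () -> "noop"; // relies on the default identity pattern
        Step broken = new Step() {
            @Override
            public String name() {
                return "broken";
            }

            @Override
            public List<Function<String, String>> possibleOutputIndexNamePatterns() {
                return List.of();
            }
        };

        verifyActions(Map.of("good", List.of(ok))); // passes silently
        try {
            verifyActions(Map.of("bad", List.of(ok, broken)));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at registration rather than at execution means a misconfigured step is caught before any lifecycle run depends on it.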

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/transitions/DlmStep.java | Adds the new default API for declaring possible output index names. |
| modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleService.java | Resolves an execution index using prior-step patterns and adjusts step selection logic to work with an index that may change. |
| modules/data-streams/src/main/java/org/elasticsearch/datastreams/DataStreamsPlugin.java | Adds startup-time validation for registered DLM actions/steps. |
| modules/data-streams/src/test/java/org/elasticsearch/datastreams/lifecycle/DataStreamLifecycleServiceTests.java | Adds tests for index output resolution and the default pattern behavior. |
| modules/data-streams/src/test/java/org/elasticsearch/datastreams/DataStreamsPluginTests.java | Adds tests for action/step validation, including failure cases. |

lukewhiting and others added 2 commits February 24, 2026 14:45
…ms/lifecycle/transitions/DlmStep.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

@seanzatzdev seanzatzdev left a comment

LGTM, thanks!

@dakrone dakrone self-requested a review February 26, 2026 15:00

@dakrone dakrone left a comment

I left a few minor comments. I think the interface is overly complicated, though: we could have a method that takes a String and returns a List<String>, because the calculation should be deterministic, if I understand it correctly.

*/
default List<Function<String, String>> possibleOutputIndexNamePatterns() {
    return List.of(Function.identity());
}
Member

I think a list of functions makes it seem like this is going to be expensive and adds complexity. Do we have to make this use functions? Could it be List<String> possibleOutputIndexNamePatterns(String indexName) instead?

Contributor Author

I have switched this to use a list of String. I originally used a function list as it makes the validation less messy in the plugin. Now you have to pass a dummy value to check that the returned list is not empty.

What do you think? Worth the mess in the plugin for a faster run / simpler interface overall?

Member

I think it's worth it to have a cleaner interface and avoid the functions.
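
To make the trade-off in this thread concrete, the following sketch places the two interface shapes side by side. It is illustrative only: `FunctionStyleStep`, `StringStyleStep`, `declaresOutputs` and the placeholder index name are hypothetical, and only the two method signatures are taken from the discussion above.

```java
import java.util.List;
import java.util.function.Function;

class InterfaceShapeSketch {
    // Earlier shape: a list of name transformations that the caller applies later.
    interface FunctionStyleStep {
        default List<Function<String, String>> possibleOutputIndexNamePatterns() {
            return List.of(Function.identity());
        }
    }

    // Suggested shape: the step computes the candidate names directly.
    interface StringStyleStep {
        default List<String> possibleOutputIndexNamePatterns(String indexName) {
            return List.of(indexName);
        }
    }

    // The function shape can be checked for emptiness without any input.
    static boolean declaresOutputs(FunctionStyleStep step) {
        return step.possibleOutputIndexNamePatterns().isEmpty() == false;
    }

    // The String shape has to be probed with a placeholder name (the "dummy value").
    static boolean declaresOutputs(StringStyleStep step) {
        return step.possibleOutputIndexNamePatterns("validation-placeholder").isEmpty() == false;
    }

    public static void main(String[] args) {
        System.out.println(declaresOutputs(new FunctionStyleStep() {})); // prints true
        System.out.println(declaresOutputs(new StringStyleStep() {}));   // prints true

        StringStyleStep empty = new StringStyleStep() {
            @Override
            public List<String> possibleOutputIndexNamePatterns(String indexName) {
                return List.of();
            }
        };
        System.out.println(declaresOutputs(empty)); // prints false
    }
}
```

The probe in the String shape only proves the list is non-empty for that one placeholder input, which is the "mess" the author mentions; in exchange, callers at execution time get plain names with no functions to apply.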

@lukewhiting lukewhiting requested a review from dakrone February 27, 2026 11:56

@dakrone dakrone left a comment

LGTM, I left a comment about documentation and an optional one about using Streams.

@lukewhiting lukewhiting enabled auto-merge (squash) March 2, 2026 11:38
@lukewhiting lukewhiting merged commit 9f722df into elastic:main Mar 2, 2026
35 checks passed
@lukewhiting lukewhiting deleted the est-2361-next-step-name-awareness branch March 2, 2026 13:06
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 2, 2026
…cations

* upstream/main: (60 commits)
  Use batches for other bulk vector benchmarks (elastic#143167)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.MvJoinKeyOnTheLookupIndexAfterStats} elastic#143388
  Mute org.elasticsearch.snapshots.ConcurrentSnapshotsIT testBackToBackQueuedDeletes elastic#143387
  [Inference API] Parse endpoint metadata from persisted endpoints (elastic#143081)
  Add cluster formation doc to DistributedArchitectureGuide (elastic#143318)
  Fix flattened root block loader null expectation (elastic#143238)
  Unmute ValueSourceReaderTypeConversionTests testLoadAll (elastic#143189)
  ESQL: Add split coalescing for many small files (elastic#143335)
  Unmute mixed-cluster spatial parse warning test (elastic#143186)
  Fix zero-size estimate in BytesRefBlock null test (elastic#143258)
  Make DataType and DataFormat top-level enums (elastic#143312)
  Add support for steps to change the target index name for later steps (elastic#142955)
  Set mayContainDuplicates flag to test deduplication (elastic#143375)
  ESQL: Fix Driver search load millis as nanos bug (elastic#143267)
  Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:lookup-join.LookupJoinWithMixPushableAndUnpushableFilters} elastic#143378
  ESQL: Forbid MV_EXPAND before full text functions (elastic#143249)
  ESQL: Fix unresolved name pattern (elastic#143210)
  Implement boxplot queryDSL aggregation for exponential_histograms (elastic#143026)
  Add prefetching to x64 bulk vector implementations (elastic#142387)
  Make large segment vector tests resilient to memory constraints (elastic#143366)
  ...
tballison pushed a commit to tballison/elasticsearch that referenced this pull request Mar 3, 2026
…elastic#142955)

* Add support for steps to change the target index name for later steps

* Fix mutated index name when finding first incomplete step

* Update modules/data-streams/src/main/java/org/elasticsearch/datastreams/lifecycle/transitions/DlmStep.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* PR Changes

* Additional PR changes

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>