[Workload Management][Rule Based Autotagging] Scroll API support in autotagging #20151

Open
laminelam wants to merge 5 commits into opensearch-project:main from laminelam:feature/scroll_autogaging_1127

Conversation

@laminelam
Contributor

@laminelam laminelam commented Dec 2, 2025

Resolves #8362

Description

[Describe what this change achieves]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • New Features

    • Scroll requests are now tagged by workload management rules and propagate workload headers.
  • Improvements

    • Scroll identifiers now carry original index expressions so index metadata is preserved across scrolls.
    • Scroll request handling was extended to extract original indices from request metadata for consistent processing.
  • Tests

    • Added tests covering tagging and metadata behavior for scroll flows.

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@laminelam laminelam requested a review from a team as a code owner December 2, 2025 13:58
@github-actions github-actions bot added the enhancement and good first issue labels Dec 2, 2025
@coderabbitai
Contributor

coderabbitai bot commented Dec 2, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

  • 🔍 Trigger a full review
📝 Walkthrough

Walkthrough

Adds propagation of the original indices through scroll IDs, exposes original indices via a new TransportOriginalIndicesAction and ActionRequestMetadata.originalIndices(), updates search/scroll parsing to carry original indices, and extends workload-management tagging to handle SearchScrollRequest (tests added; one test duplicated).

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Scroll ID persistence: server/src/main/java/org/opensearch/action/search/ParsedScrollId.java, server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java, server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java | Add originalIndices field to ParsedScrollId; version-gated serialization/deserialization of original indices in scroll ID (INDICES_IN_SCROLL_ID_VERSION); include request.indices() when building scroll ID. |
| Scroll request parsing & caching: server/src/main/java/org/opensearch/action/search/SearchScrollRequest.java | Cache lazily parsed ParsedScrollId in SearchScrollRequest.parseScrollId() (transient parsedScrollId field) to avoid reparsing. |
| Transport action original-indices support: server/src/main/java/org/opensearch/action/search/TransportSearchScrollAction.java, server/src/main/java/org/opensearch/action/support/TransportOriginalIndicesAction.java, server/src/main/java/org/opensearch/action/support/ActionRequestMetadata.java | New TransportOriginalIndicesAction interface; TransportSearchScrollAction implements it and exposes originalIndices(request) by using parsed scroll ID; ActionRequestMetadata.originalIndices() added to retrieve original indices from transport action metadata. |
| Workload management tagging: plugins/workload-management/src/main/java/org/opensearch/plugin/wlm/AutoTaggingActionFilter.java, plugins/workload-management/src/test/java/org/opensearch/plugin/wlm/AutoTaggingActionFilterTests.java | AutoTaggingActionFilter now recognizes SearchScrollRequest, extracts original index expressions from request metadata, and attaches an AttributeExtractor for RuleAttribute.INDEX_PATTERN; tests added to cover scroll request flow. |
| Integration tests: plugins/workload-management/src/internalClusterTest/java/org/opensearch/plugin/wlm/WlmAutoTaggingIT.java | Adds testScrollRequestsAreAlsoTagged() (duplicated insertion present in patch). |
| Unit test updates: server/src/test/java/org/opensearch/action/search/ParsedScrollIdTests.java, server/src/test/java/org/opensearch/action/search/SearchScrollAsyncActionTests.java | Update test usages to pass new String[] originalIndices parameter to ParsedScrollId constructor. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant Transport as TransportSearchScrollAction
    participant SSR as SearchScrollRequest
    participant PSID as ParsedScrollId
    participant Metadata as ActionRequestMetadata
    participant ATAF as AutoTaggingActionFilter
    participant WLS as RuleProcessingService

    Client->>Transport: send SearchRequest -> initial search (build scrollId with indices)
    Transport-->>Client: response with scrollId

    Client->>Transport: SearchScrollRequest (scrollId)
    Transport->>SSR: request.parseScrollId()
    SSR->>PSID: parse/cached parse
    PSID-->>SSR: ParsedScrollId (includes originalIndices)
    SSR-->>Transport: parsed result

    Transport->>Metadata: attach originalIndices to request metadata
    Transport->>ATAF: pass request through filter
    ATAF->>Metadata: read originalIndices
    ATAF->>WLS: evaluateLabel(using INDEX_PATTERN attribute)
    WLS-->>ATAF: workload group label
    ATAF-->>Transport: set workload header / proceed
    Transport-->>Client: scroll response

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

Search

Suggested reviewers

  • andrross
  • sachinpkale
  • saratvemulapalli

Poem

🐰 I hopped the scroll and kept the trace,

indices safe inside the ID's embrace,
Tags now follow each scrolling quest,
Workloads sorted — the rabbit's impressed! ✨

Pre-merge checks and finishing touches

❌ Failed checks (3 warnings)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ⚠️ Warning | The PR description is incomplete. It includes only a broken issue link and basic checklist items but lacks the required 'Description' section explaining what the change achieves. | Add a detailed description explaining the Scroll API support implementation, changes made to autotagging logic, and how scroll requests are now handled alongside regular search requests. |
| Linked Issues check | ⚠️ Warning | The linked issue #8362 concerns HDFS fixture BouncyCastle exclusion, but the PR implements Scroll API support for workload management autotagging, which is a completely unrelated objective. | Link the correct issue (likely related to scroll API autotagging support) that reflects the actual changes in this PR, not an unrelated HDFS fixture backport. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 3.70% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title accurately and concisely describes the main change: adding Scroll API support to workload management autotagging, which aligns with the changeset modifications. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to implementing Scroll API support in autotagging across workload management, filter logic, and search action components, with no out-of-scope modifications detected. |
Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/src/main/java/org/opensearch/action/search/SearchScrollRequest.java (1)

101-111: Bug: Cache is not invalidated when scrollId is changed.

If scrollId(String) is called after parseScrollId() has already been invoked, the cached parsedScrollId will become stale and return results for the old scroll ID.

Apply this diff to clear the cache when scrollId changes:

 public SearchScrollRequest scrollId(String scrollId) {
     this.scrollId = scrollId;
+    this.parsedScrollId = null;
     return this;
 }
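
The invalidation pattern in the diff above can be illustrated in isolation. This is only a minimal sketch: CachedScrollRequestSketch, its toy comma-separated parser, and the parseCount() accessor are hypothetical stand-ins for illustration and are not the OpenSearch SearchScrollRequest or TransportSearchHelper code.

```java
// Hypothetical stand-in for SearchScrollRequest; not the OpenSearch class.
final class CachedScrollRequestSketch {
    private String scrollId;
    // Lazily populated cache; must be cleared whenever scrollId changes.
    private transient String[] parsedScrollId;
    // Counts real parses so the caching behaviour is observable.
    private int parseCount;

    CachedScrollRequestSketch scrollId(String scrollId) {
        this.scrollId = scrollId;
        this.parsedScrollId = null; // drop the stale parse result
        return this;
    }

    String[] parseScrollId() {
        if (parsedScrollId == null) {
            parsedScrollId = parse(scrollId);
        }
        return parsedScrollId;
    }

    int parseCount() {
        return parseCount;
    }

    // Toy parser (assumption): the ID is a comma-separated list of index names.
    private String[] parse(String id) {
        parseCount++;
        return id.split(",");
    }
}
```

Calling parseScrollId() twice in a row parses once; calling scrollId(...) in between forces a fresh parse, which is the behaviour the suggested one-line fix is meant to restore.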
🧹 Nitpick comments (5)
plugins/workload-management/src/internalClusterTest/java/org/opensearch/plugin/wlm/WlmAutoTaggingIT.java (1)

442-477: Scroll auto‑tagging IT is correct; consider clearing scroll context as a small improvement

This test is well-structured and mirrors the existing WLM auto-tagging tests: it enables WLM, creates a workload group and rule, indexes data, and then verifies that both the initial search (with setScroll(...)) and the subsequent prepareSearchScroll(...) call increase completions. The assertions on scrollId and monotonic increases in getCompletions(...) make sense and exercise the new scroll-tagging path end-to-end.

One minor improvement you might consider:

  • After finishing with the scroll, clear the scroll context (e.g., client().prepareClearScroll().addScrollId(scrollId).get();), possibly in a try/finally, to avoid leaving scrolls open across potential assertBusy retries. It’s not a blocker for a small IT but would make the test more self-contained and resource-friendly.
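
A rough sketch of the try/finally shape being suggested, assuming a hypothetical ScrollClient interface in place of the real test client. In the actual IT the cleanup call would be client().prepareClearScroll().addScrollId(scrollId).get(), as quoted above; everything else here is illustrative only.

```java
final class ScrollCleanupSketch {
    // Hypothetical minimal client used only for this illustration.
    interface ScrollClient {
        String openScroll();               // initial search with a scroll, returns the scroll ID
        void nextPage(String scrollId);    // follow-up scroll request
        void clearScroll(String scrollId); // releases the server-side scroll context
    }

    static void scrollOnce(ScrollClient client) {
        String scrollId = client.openScroll();
        try {
            client.nextPage(scrollId);
        } finally {
            // Runs even if the scroll request or an assertion throws,
            // so no scroll context is left open across retries.
            client.clearScroll(scrollId);
        }
    }
}
```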
server/src/test/java/org/opensearch/action/search/SearchScrollAsyncActionTests.java (1)

481-485: Constructor update is correct, but consider adding test coverage for non-empty originalIndices.

The update to match the new ParsedScrollId constructor signature is correct. However, this test file doesn't exercise the new originalIndices functionality. Consider adding a test case that verifies scroll behavior when originalIndices is populated, to ensure the autotagging flow works end-to-end.

server/src/main/java/org/opensearch/action/search/ParsedScrollId.java (1)

56-79: Consider defensive copies for array field in public API.

Since ParsedScrollId is annotated with @PublicApi, external callers could mutate the originalIndices array after construction or after calling getOriginalIndices(). Consider defensive copying for better encapsulation.

     ParsedScrollId(String source, String type, SearchContextIdForNode[] context, String[] originalIndices) {
         this.source = source;
         this.type = type;
         this.context = context;
-        this.originalIndices = originalIndices;
+        this.originalIndices = originalIndices != null ? originalIndices.clone() : null;
     }

     public String[] getOriginalIndices() {
-        return originalIndices;
+        return originalIndices != null ? originalIndices.clone() : null;
     }
server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java (1)

137-146: Verify the deserialization logic aligns with the position check.

The conditional reading of originalIndices is correct, but ensure the position check on line 148 (in.getPosition() != bytes.length) still functions as intended:

  • When originalIndices are absent, position should equal bytes.length before line 148.
  • When originalIndices are present, they must be fully consumed so position equals bytes.length.

Consider adding a unit test that verifies scroll IDs with and without originalIndices pass the position check.
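
To make the two cases above concrete, here is a self-contained sketch of a version-gated round trip with a "all bytes consumed" check. It is an assumption-laden illustration, not the TransportSearchHelper implementation: ScrollIdCodecSketch, the integer INDICES_VERSION, and writing the version into the payload are all hypothetical simplifications built on java.io data streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ScrollIdCodecSketch {
    // Hypothetical version number that gates the extra field.
    static final int INDICES_VERSION = 2;

    static byte[] encode(int version, String type, String[] originalIndices) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(version);
            out.writeUTF(type);
            if (version >= INDICES_VERSION) {
                // Only newer writers append the original indices.
                String[] indices = originalIndices == null ? new String[0] : originalIndices;
                out.writeInt(indices.length);
                for (String index : indices) {
                    out.writeUTF(index);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static String[] decodeIndices(byte[] encoded) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded))) {
            int version = in.readInt();
            in.readUTF(); // search type, unused in this sketch
            String[] indices = new String[0];
            if (version >= INDICES_VERSION) {
                indices = new String[in.readInt()];
                for (int i = 0; i < indices.length; i++) {
                    indices[i] = in.readUTF();
                }
            }
            // Equivalent of the position check: nothing may be left unread.
            if (in.available() != 0) {
                throw new IllegalArgumentException("Not all bytes were read");
            }
            return indices;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, an ID written by an "old" version decodes to an empty index array, an ID written by a "new" version returns the indices it was given, and any trailing byte is rejected.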

plugins/workload-management/src/main/java/org/opensearch/plugin/wlm/AutoTaggingActionFilter.java (1)

100-119: Consider extracting the anonymous AttributeExtractor to improve readability.

The inline anonymous AttributeExtractor implementation is functionally correct but could be refactored into a named helper method (e.g., createResolvedIndicesExtractor) to improve code clarity and testability.

Example refactor:

private static AttributeExtractor<String> createResolvedIndicesExtractor(Set<String> indexNames) {
    return new AttributeExtractor<>() {
        @Override
        public Attribute getAttribute() {
            return RuleAttribute.INDEX_PATTERN;
        }

        @Override
        public Iterable<String> extract() {
            return indexNames;
        }

        @Override
        public LogicalOperator getLogicalOperator() {
            return LogicalOperator.AND;
        }
    };
}

Then replace lines 104-119 with:

-            attributeExtractors.add(new AttributeExtractor<>() {
-                @Override
-                public Attribute getAttribute() {
-                    return RuleAttribute.INDEX_PATTERN;
-                }
-
-                @Override
-                public Iterable<String> extract() {
-                    return names;
-                }
-
-                @Override
-                public LogicalOperator getLogicalOperator() {
-                    return LogicalOperator.AND;
-                }
-            });
+            attributeExtractors.add(createResolvedIndicesExtractor(names));
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2b6d266 and 2cc9bb7.

📒 Files selected for processing (10)
  • plugins/workload-management/src/internalClusterTest/java/org/opensearch/plugin/wlm/WlmAutoTaggingIT.java (1 hunks)
  • plugins/workload-management/src/main/java/org/opensearch/plugin/wlm/AutoTaggingActionFilter.java (3 hunks)
  • plugins/workload-management/src/test/java/org/opensearch/plugin/wlm/AutoTaggingActionFilterTests.java (2 hunks)
  • server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java (1 hunks)
  • server/src/main/java/org/opensearch/action/search/ParsedScrollId.java (2 hunks)
  • server/src/main/java/org/opensearch/action/search/SearchScrollRequest.java (2 hunks)
  • server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java (3 hunks)
  • server/src/main/java/org/opensearch/action/search/TransportSearchScrollAction.java (4 hunks)
  • server/src/test/java/org/opensearch/action/search/ParsedScrollIdTests.java (1 hunks)
  • server/src/test/java/org/opensearch/action/search/SearchScrollAsyncActionTests.java (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java (1)
server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java (1)
  • TransportSearchHelper (55-160)
plugins/workload-management/src/test/java/org/opensearch/plugin/wlm/AutoTaggingActionFilterTests.java (1)
server/src/main/java/org/opensearch/action/support/ActionRequestMetadata.java (1)
  • ActionRequestMetadata (21-61)
server/src/main/java/org/opensearch/action/search/SearchScrollRequest.java (1)
server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java (1)
  • TransportSearchHelper (55-160)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: gradle-check
  • GitHub Check: precommit (25, macos-15-intel)
  • GitHub Check: precommit (21, macos-15)
  • GitHub Check: precommit (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, windows-2025, true)
  • GitHub Check: precommit (25, macos-15)
  • GitHub Check: precommit (21, windows-latest)
  • GitHub Check: precommit (25, windows-latest)
  • GitHub Check: assemble (25, windows-latest)
  • GitHub Check: precommit (25, ubuntu-latest)
  • GitHub Check: precommit (21, macos-15-intel)
  • GitHub Check: precommit (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (25, ubuntu-24.04-arm)
  • GitHub Check: precommit (21, ubuntu-latest)
  • GitHub Check: detect-breaking-change
  • GitHub Check: assemble (25, ubuntu-latest)
  • GitHub Check: assemble (21, ubuntu-24.04-arm)
  • GitHub Check: assemble (21, ubuntu-latest)
  • GitHub Check: assemble (21, windows-latest)
  • GitHub Check: Analyze (java)
🔇 Additional comments (11)
server/src/main/java/org/opensearch/action/search/AbstractSearchAsyncAction.java (1)

777-779: LGTM! Indices are now correctly propagated to scroll ID construction.

The change correctly passes request.indices() to buildScrollId, enabling the scroll ID to carry original index information for autotagging purposes. The TransportSearchHelper.buildScrollId method properly handles null/empty arrays with version gating.

server/src/test/java/org/opensearch/action/search/ParsedScrollIdTests.java (1)

53-58: LGTM! Constructor call updated correctly.

The test correctly uses the new constructor signature. The hasLocalIndices() test remains valid since it doesn't depend on originalIndices. Consider adding a separate test for getOriginalIndices() to verify it returns the expected values.

server/src/main/java/org/opensearch/action/search/TransportSearchScrollAction.java (3)

54-56: LGTM! TransportIndicesResolvingAction implementation enables autotagging for scroll requests.

The class now correctly implements TransportIndicesResolvingAction<SearchScrollRequest>, enabling the workload management system to resolve indices for scroll requests and apply appropriate tags.


87-87: Good change to use cached parsing.

Using request.parseScrollId() instead of TransportSearchHelper.parseScrollId(request.scrollId()) leverages the lazy caching mechanism, avoiding redundant parsing if resolveIndices() was called earlier in the request lifecycle.


123-143: Robust implementation with proper defensive handling.

The resolveIndices() implementation correctly:

  1. Guards against null/empty scroll ID strings
  2. Handles null parsed result
  3. Returns OptionallyResolvedIndices.unknown() for missing originalIndices
  4. Catches all exceptions to prevent failures from blocking request processing

One minor note: the null check on line 132 (parsed == null) should never be true given the current parseScrollId() implementation throws on parse failure rather than returning null. However, keeping it is good defensive programming.
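
The four guards listed above can be sketched without any OpenSearch types. This is a hypothetical illustration only: OriginalIndicesResolverSketch and the injected parser function are invented for the example, and Optional.empty() stands in for OptionallyResolvedIndices.unknown().

```java
import java.util.Optional;
import java.util.function.Function;

final class OriginalIndicesResolverSketch {
    // parser is a hypothetical hook standing in for the scroll ID parsing step.
    static Optional<String[]> resolve(String scrollId, Function<String, String[]> parser) {
        // 1. Guard against null or empty scroll ID strings.
        if (scrollId == null || scrollId.isEmpty()) {
            return Optional.empty();
        }
        try {
            String[] indices = parser.apply(scrollId);
            // 2 and 3. A null or empty result is treated as "unknown".
            if (indices == null || indices.length == 0) {
                return Optional.empty();
            }
            return Optional.of(indices.clone());
        } catch (RuntimeException e) {
            // 4. A parse failure must not block request processing.
            return Optional.empty();
        }
    }
}
```

Returning an "unknown" value instead of throwing keeps tagging best-effort: a malformed scroll ID simply goes untagged here, and the scroll action itself remains responsible for reporting the error.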

server/src/main/java/org/opensearch/action/search/TransportSearchHelper.java (4)

65-67: LGTM!

Delegating to the new overload with null for originalIndices maintains backward compatibility for existing callers.


151-151: Verify the ParsedScrollId constructor signature.

The code at line 151 calls ParsedScrollId(scrollId, type, context, originalIndices) with four parameters. Ensure that the ParsedScrollId class constructor has been updated to accept all four parameters, including String[] originalIndices as the fourth parameter.


63-63: Verify that Version.V_3_3_2 exists and is the intended version.

This constant gates a wire format change for including originalIndices in scroll IDs. Ensure that Version.V_3_3_2 is defined in the Version class and that this is the correct minimum version for the feature.


88-99: No action required — scroll IDs are designed as opaque, versioned tokens.

The code correctly uses version gating to conditionally serialize originalIndices. OpenSearch's documented design explicitly treats scroll IDs as opaque, internally-versioned structures that are not intended to be cross-version compatible. Scroll IDs should never be parsed or shared across versions; they are opaque tokens whose internal format can change arbitrarily. Each version reads exactly what it wrote, and the conditional serialization ensures older nodes do not encounter unexpected fields.

plugins/workload-management/src/main/java/org/opensearch/plugin/wlm/AutoTaggingActionFilter.java (1)

88-90: LGTM: Request type validation extended correctly.

The validation logic now correctly includes both SearchRequest and SearchScrollRequest, allowing scroll operations to be tagged by the workload management system.

plugins/workload-management/src/test/java/org/opensearch/plugin/wlm/AutoTaggingActionFilterTests.java (1)

98-115: Additional test coverage should verify fallback behavior and validate extracted indices.

The test correctly verifies the happy path for SearchScrollRequest with resolved indices and workload tagging. However, without access to the codebase for verification, I cannot confirm:

  1. Whether a fallback scenario test is actually needed (requires reviewing lines 122-125 in AutoTaggingActionFilter.java)
  2. Whether the API references in the suggested example (OptionallyResolvedIndices.unknown()) are valid
  3. Whether the ResolvedIndices.of() factory method exists as used

The suggestion to verify actual indices passed to evaluateLabel via argument captors remains valid and should be considered if the fallback scenario is indeed a code path that needs testing.

@github-actions
Contributor

github-actions bot commented Dec 2, 2025

❌ Gradle check result for 2cc9bb7: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@laminelam
Contributor Author

@kaushalmahi12
When you have time, would you please take a look into this?
Thx

@opensearch-trigger-bot
Contributor

This PR is stalled because it has been open for 30 days with no activity.

@opensearch-trigger-bot opensearch-trigger-bot bot added the stalled label Jan 5, 2026
@laminelam
Contributor Author

Hi @andrross
Can you please help with this?

@opensearch-trigger-bot opensearch-trigger-bot bot removed the stalled label Jan 6, 2026
@andrross
Member

andrross commented Jan 6, 2026

@kaushalmahi12 Can you take a look?

@kaushalmahi12
Contributor

kaushalmahi12 commented Jan 13, 2026

I will look into it this week. @dzane17 Can you do the first round of review on this?

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@github-actions
Contributor

❌ Gradle check result for 7497b6a: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@github-actions
Contributor

❌ Gradle check result for 7c00443: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

❌ Gradle check result for 7c00443: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

remove TransportOriginalIndicesAction

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@github-actions
Contributor

github-actions bot commented Feb 1, 2026

❌ Gradle check result for ce35d03: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Contributor

@jainankitk jainankitk left a comment

Checks failed due to spotless failure:

* What went wrong:
Execution failed for task ':plugins:workload-management:spotlessJavaCheck'.
> The following files had format violations:
      src/internalClusterTest/java/org/opensearch/plugin/wlm/WlmAutoTaggingIT.java
          @@ -467,24 +467,16 @@
           
           ············try·{
           ················int·afterInitialSearch·=·getCompletions(workloadGroupId);
          -················assertTrue(
          -····················"Expected·completions·to·increase·after·initial·search·with·scroll",
          -····················afterInitialSearch·>·completionsBefore
          -················);

Please run './gradlew spotlessApply' to fix all violations. Also, can you add an entry to the changelog?

Signed-off-by: Lamine Idjeraoui <lidjeraoui@apple.com>
@github-actions
Contributor

github-actions bot commented Feb 3, 2026

❌ Gradle check result for 71f2255: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@kaushalmahi12
Contributor

Thanks @laminelam for making this change. Just fix the Gradle checks; the rest of it looks good to me.

@laminelam
Contributor Author

> Thanks @laminelam for making this change. Just fix the Gradle checks; the rest of it looks good to me.

Thanks @kaushalmahi12
I already fixed the spotless issue (thx @jainankitk for pointing it out).
I will update the changelog once the PR is ready to merge.

Member

@andrross andrross left a comment

Looks good, just need to rebase, fix the version guard, and add the changelog entry.


private static final String INCLUDE_CONTEXT_UUID = "include_context_uuid";

public static final Version INDICES_IN_SCROLL_ID_VERSION = Version.V_3_4_0;
Member

This will need to be V_3_6_0

Labels

enhancement: Enhancement or improvement to existing feature or request
good first issue: Good for newcomers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Workload Management][Rule Based Autotagging] Scroll API support in autotagging

5 participants