
[One Workflow] HITL reliable resume in chat and in execs UI#264605

Merged
h88 merged 8 commits into elastic:main from h88:1wf-hitl-race
May 4, 2026

Conversation


@h88 h88 commented Apr 21, 2026

Users (and agents) see fewer false second approval prompts, faster resume pickup, and a working Resume button on back-to-back pauses.

  • Engine: after a resume is scheduled, nudge Task Manager the same way cancel does, so the resume task is picked up sooner.
  • Agent-facing state: expose waiting_step_id so the model can tell whether it is looking at the same waiting step or a new one when polling status.
  • Prompt: the resume and workflow tool text now explains that resume is asynchronous and suggests polling status instead of inventing a second HITL without evidence.
  • WF exec UI: re-enable Resume when the paused step execution id changes (a new wait).

Scope

Note: No change to the resume API contract or step semantics.
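
The same-vs-new waiting step check described above can be sketched as a small pure function. This is an illustration only: `waiting_input.waiting_step_id` is the field named in this PR, while the `ExecutionSnapshot` shape, its `status` values, and `classifyAfterResume` are hypothetical names assumed for the example.

```typescript
// Hypothetical, simplified view of a workflow status snapshot.
// Only `waiting_input.waiting_step_id` is named in this PR; the rest is assumed.
interface ExecutionSnapshot {
  status: 'running' | 'waiting_for_input' | 'completed' | 'failed';
  waiting_input?: { waiting_step_id: string };
}

type ResumeObservation = 'stale_same_wait' | 'new_wait' | 'not_waiting';

// Compare the step id that was just resumed with the one in a fresh snapshot.
function classifyAfterResume(
  resumedStepId: string,
  snapshot: ExecutionSnapshot
): ResumeObservation {
  const waitingId = snapshot.waiting_input?.waiting_step_id;
  if (snapshot.status !== 'waiting_for_input' || waitingId === undefined) {
    return 'not_waiting';
  }
  // Same id: the snapshot likely predates the async resume being picked up.
  // Different id: the workflow has reached a genuinely new pause.
  return waitingId === resumedStepId ? 'stale_same_wait' : 'new_wait';
}
```

With this shape, a caller would only present a second approval request when the result is `new_wait`, and would treat `stale_same_wait` as a reason to poll again.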

Testing workflow:

name: consecutive hitl flow
enabled: true
triggers:
  - type: manual

steps:
  - name: hello_world_step
    type: console
    with:
      message: "hola"

  - name: waitforinput
    type: waitForInput
    with:
      message: first wait - do you approve
      schema:
        properties:
          approved:
            type: boolean
            title: approval

  - name: waitforinput2
    type: waitForInput
    with:
      message: second wait - do you approve
      schema:
        properties:
          approved:
            type: boolean
            title: approval

  - name: over_items
    type: foreach
    foreach: '["alpha", "beta", "gamma"]'
    steps:
      - name: per_item_hitl
        type: waitForInput
        with:
          message: "Approve this iteration?"

Fixes https://github.com/elastic/security-team/issues/16933

@h88 h88 self-assigned this Apr 21, 2026
@h88 h88 added the release_note:skip, backport:version, Team:One Workflow, and v9.4.0 labels Apr 21, 2026
@h88 h88 requested a review from Copilot April 21, 2026 08:44

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Improves reliability of HITL resume by making the engine pick up resume tasks faster, exposing an identifier to distinguish stale vs new waits when polling, updating agent/tool guidance, and re-enabling the Resume UI on back-to-back pauses.

Changes:

  • Nudge Task Manager after resume scheduling (similar to cancel) to reduce pickup latency.
  • Add waiting_input.waiting_step_id to workflow execution state and update tool text to guide correct polling behavior.
  • Update workflows exec UI to re-enable Resume when the active paused step execution changes, with tests.
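
The "schedule, then nudge" idea from the first bullet can be sketched with an in-memory fake. The `TaskRunner` interface, `FakeTaskRunner`, and `resumeExecution` are hypothetical names invented for this example and are not Task Manager's actual API; the sketch only shows why nudging removes the wait for the next poll cycle.

```typescript
// Hypothetical stand-in for a task runner; this is NOT Task Manager's real API.
interface TaskRunner {
  schedule(taskId: string): void;
  // Ask the runner to claim idle tasks now instead of waiting for its next poll.
  nudge(): void;
}

// In-memory fake that records which tasks have been picked up.
class FakeTaskRunner implements TaskRunner {
  readonly idle: string[] = [];
  readonly pickedUp: string[] = [];

  schedule(taskId: string): void {
    this.idle.push(taskId);
  }

  nudge(): void {
    // Drain every idle task immediately.
    this.pickedUp.push(...this.idle.splice(0));
  }
}

// Schedule the resume task, then nudge so pickup does not wait for the poll interval.
function resumeExecution(runner: TaskRunner, executionId: string): string {
  const taskId = `resume:${executionId}`;
  runner.schedule(taskId);
  runner.nudge();
  return taskId;
}
```

Without the `nudge()` call, the task in this fake would stay in `idle` until something else drained the queue, which mirrors the pickup latency the PR aims to reduce.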

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| x-pack/platform/plugins/shared/agent_builder_platform/server/tools/resume_workflow_execution.ts | Updates the tool prompt to explain async resume + polling/stale snapshot behavior. |
| x-pack/platform/plugins/shared/agent_builder/server/services/tools/tool_types/workflow/tool_type.ts | Updates workflow tool help text to document waiting_step_id. |
| x-pack/platform/packages/shared/agent-builder/agent-builder-genai-utils/tools/utils/workflows/get_execution_state.ts | Adds waiting_step_id into waiting_input in the derived execution state. |
| x-pack/platform/packages/shared/agent-builder/agent-builder-genai-utils/tools/utils/workflows/get_execution_state.test.ts | Updates tests to assert waiting_step_id is present. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_step_execution_details.tsx | Threads waitingStepExecutionId through to Resume UI. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_execution_overview.tsx | Threads waitingStepExecutionId through to Resume UI. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_execution_detail.tsx | Tracks active wait step execution row id and passes it down. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_execution_detail.test.tsx | Updates comment to align with renamed identifier (no behavior change). |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/resume_execution_button.tsx | Resets "submitted" state when the active wait step execution changes. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/resume_execution_button.test.tsx | Adds coverage for re-enabling after waitingStepExecutionId changes. |
| src/platform/plugins/shared/workflows_execution_engine/server/plugin.ts | Forces idle tasks to run after resume to speed up resume pickup. |
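
The Resume button behavior summarized for `resume_execution_button.tsx` can be sketched as a framework-free state transition. This is not the component's actual code: `ResumeButtonState`, `onWaitingStepChange`, and `onSubmit` are hypothetical names, and only the `waitingStepExecutionId` identifier comes from the summary above.

```typescript
// Hypothetical state for a Resume button; names are illustrative.
interface ResumeButtonState {
  submitted: boolean;
  // Id of the waiting step execution the button was last rendered for.
  waitingStepExecutionId: string | undefined;
}

// Derive the next state when the parent passes a (possibly new) waiting id.
function onWaitingStepChange(
  state: ResumeButtonState,
  nextId: string | undefined
): ResumeButtonState {
  if (nextId === state.waitingStepExecutionId) {
    return state; // same wait: keep the button disabled if already submitted
  }
  // New wait: clear `submitted` so the button is enabled again.
  return { submitted: false, waitingStepExecutionId: nextId };
}

// Mark the current wait as submitted, which disables the button.
function onSubmit(state: ResumeButtonState): ResumeButtonState {
  return { ...state, submitted: true };
}
```

In a React component the same rule would typically live in an effect keyed on the id prop; the pure form is used here so the transition can be checked in isolation.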


@elasticmachine

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] Agent Builder API Smoke Tests #1 / Agent Builder - LLM Smoke tests EIS Models (dynamically configured) Connector: openai-gpt-5.4 returns an answer for a simple message

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| workflowsManagement | 2.3MB | 2.3MB | +203.0B |

cc @h88

@h88 h88 marked this pull request as ready for review May 3, 2026 20:33
@h88 h88 requested review from a team as code owners May 3, 2026 20:33
…nId indication

=> cap post-resume HITL polling by poll count instead of time

- Replace the ±90s wall-clock guidance with ~5 status polls so the model has a discrete stop condition.
- Drop the inter-poll "wait a few seconds" wording.
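
A poll-count cap like the one described in this commit could look roughly as follows. The real change is prompt text for the model, not code, so this is only an analogy: `pollAfterResume`, the `Status` shape, and the default of 5 polls are assumptions made for the sketch.

```typescript
type PollOutcome = 'new_wait' | 'finished' | 'gave_up';

// Hypothetical status shape; only `waiting_step_id` is named in this PR.
interface Status {
  done: boolean;
  waiting_step_id?: string;
}

// Poll at most `maxPolls` times after a resume, with no wall-clock timeout.
async function pollAfterResume(
  getStatus: () => Promise<Status>,
  resumedStepId: string,
  maxPolls = 5
): Promise<{ outcome: PollOutcome; polls: number }> {
  for (let polls = 1; polls <= maxPolls; polls++) {
    const status = await getStatus();
    if (status.done) {
      return { outcome: 'finished', polls };
    }
    if (status.waiting_step_id !== undefined && status.waiting_step_id !== resumedStepId) {
      return { outcome: 'new_wait', polls };
    }
    // Otherwise the snapshot is still stale or the step is in flight; poll again.
  }
  // Discrete stop condition: the poll budget is used up.
  return { outcome: 'gave_up', polls: maxPolls };
}
```

Counting polls rather than elapsed seconds gives the caller a stop condition it can evaluate without access to a clock.
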
@h88 h88 enabled auto-merge (squash) May 4, 2026 09:45

kibanamachine commented May 4, 2026

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #187 / Agent Builder agents Edit agent should edit agent name

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| workflowsManagement | 2.3MB | 2.3MB | +228.0B |

History

cc @h88

@h88 h88 merged commit 87004a5 into elastic:main May 4, 2026
24 checks passed
@kibanamachine

Starting backport for target branches: 9.4

https://github.com/elastic/kibana/actions/runs/25316411854

@kibanamachine

💚 All backports created successfully

Status Branch Result
9.4

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request May 4, 2026
…64605) (#267459)

# Backport

This will backport the following commits from `main` to `9.4`:
- [[One Workflow] HITL reliable resume in chat and in execs UI
(#264605)](#264605)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)


Co-authored-by: Hashem <11019955+h88@users.noreply.github.com>
seanrathier pushed a commit to seanrathier/kibana that referenced this pull request May 4, 2026
…264605)


Labels

backport:version, release_note:skip, Team:One Workflow, v9.4.0, v9.5.0
