[One Workflow] HITL reliable resume in chat and in execs UI #264605
h88 merged 8 commits into elastic:main
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Improves reliability of HITL resume by making the engine pick up resume tasks faster, exposing an identifier to distinguish stale vs new waits when polling, updating agent/tool guidance, and re-enabling the Resume UI on back-to-back pauses.
Changes:
- Nudge Task Manager after resume scheduling (similar to cancel) to reduce pickup latency.
- Add
waiting_input.waiting_step_idto workflow execution state and update tool text to guide correct polling behavior. - Update workflows exec UI to re-enable Resume when the active paused step execution changes, with tests.
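To illustrate how a caller might use the new identifier to tell a stale wait from a new one, here is a minimal sketch. Only `waiting_input.waiting_step_id` comes from this PR; the `ExecutionSnapshot` shape, the `status` field, and the `classifyWait` helper are hypothetical names chosen for the example, not the actual implementation.

```typescript
// Hypothetical snapshot shape. Only `waiting_input.waiting_step_id`
// is named in this PR; everything else is assumed for illustration.
interface ExecutionSnapshot {
  status: string;
  waiting_input?: { waiting_step_id: string };
}

type WaitVerdict = 'same_wait' | 'new_wait' | 'not_waiting';

// Compare the step that was just resumed with the step the latest
// snapshot reports as waiting.
function classifyWait(resumedStepId: string, snapshot: ExecutionSnapshot): WaitVerdict {
  const current = snapshot.waiting_input?.waiting_step_id;
  if (current === undefined) {
    // No pending wait: the execution moved on or finished.
    return 'not_waiting';
  }
  // Same id: the snapshot is likely stale and the resume has not been
  // picked up yet. Different id: a genuinely new pause.
  return current === resumedStepId ? 'same_wait' : 'new_wait';
}
```

Under these assumptions, a `same_wait` verdict means "poll again" rather than "ask the user for a second approval".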
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| x-pack/platform/plugins/shared/agent_builder_platform/server/tools/resume_workflow_execution.ts | Updates the tool prompt to explain async resume + polling/stale snapshot behavior. |
| x-pack/platform/plugins/shared/agent_builder/server/services/tools/tool_types/workflow/tool_type.ts | Updates workflow tool help text to document waiting_step_id. |
| x-pack/platform/packages/shared/agent-builder/agent-builder-genai-utils/tools/utils/workflows/get_execution_state.ts | Adds waiting_step_id into waiting_input in the derived execution state. |
| x-pack/platform/packages/shared/agent-builder/agent-builder-genai-utils/tools/utils/workflows/get_execution_state.test.ts | Updates tests to assert waiting_step_id is present. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_step_execution_details.tsx | Threads waitingStepExecutionId through to Resume UI. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_execution_overview.tsx | Threads waitingStepExecutionId through to Resume UI. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_execution_detail.tsx | Tracks active wait step execution row id and passes it down. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/workflow_execution_detail.test.tsx | Updates comment to align with renamed identifier (no behavior change). |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/resume_execution_button.tsx | Resets “submitted” state when the active wait step execution changes. |
| src/platform/plugins/shared/workflows_management/public/features/workflow_execution_detail/ui/resume_execution_button.test.tsx | Adds coverage for re-enabling after waitingStepExecutionId changes. |
| src/platform/plugins/shared/workflows_execution_engine/server/plugin.ts | Forces idle tasks to run after resume to speed up resume pickup. |
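The Resume UI change in the table above (reset the "submitted" state when the active wait step execution changes) can be sketched as plain state transitions. This is not the code from `resume_execution_button.tsx`; the `ResumeButtonState` type and the helper functions are hypothetical, and the real component is a React component whose internals are not shown here.

```typescript
// Hypothetical state model for the Resume button, for illustration only.
interface ResumeButtonState {
  submitted: boolean;
  waitingStepExecutionId: string | undefined;
}

// The user clicked Resume: keep the button disabled while the engine
// picks up the resume task.
function markSubmitted(state: ResumeButtonState): ResumeButtonState {
  return { ...state, submitted: true };
}

// A new waitingStepExecutionId arrived from the parent. If it differs
// from the one the button last saw, this is a new pause, so the
// "submitted" flag is cleared and Resume becomes clickable again.
function syncWaitId(
  state: ResumeButtonState,
  waitingStepExecutionId: string | undefined
): ResumeButtonState {
  if (waitingStepExecutionId === state.waitingStepExecutionId) {
    return state;
  }
  return { submitted: false, waitingStepExecutionId };
}

function isResumeDisabled(state: ResumeButtonState): boolean {
  return state.submitted;
}
```

Keying the reset on the step execution id rather than on the overall execution status is what lets back-to-back pauses work in this sketch: the status can stay "waiting" across both pauses while the id changes.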
💛 Build succeeded, but was flaky
cc @h88
…nId indication => cap post-resume HITL polling by poll count instead of time
- Replace ±90s wall-clock guidance with ~5 status polls so the model has a discrete stop condition.
- Drop inter-poll (wait a few seconds) wording.
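A count-based cap like the one described in this commit could look roughly like the loop below. It is a sketch under stated assumptions: `pollAfterResume`, `Snapshot`, `PollOutcome`, and the injected `getState` callback are hypothetical names, and the default of 5 simply mirrors the "~5 status polls" wording. In the PR itself this guidance lives in tool prompt text for the model, not in a loop like this.

```typescript
// Hypothetical snapshot shape; only waiting_input.waiting_step_id is from the PR.
interface Snapshot {
  status: string;
  waiting_input?: { waiting_step_id: string };
}

type PollOutcome =
  | { kind: 'new_wait'; stepId: string }
  | { kind: 'moved_on'; status: string }
  | { kind: 'still_stale'; polls: number };

// Poll status at most `maxPolls` times after a resume. The stop
// condition is a discrete poll count, not elapsed wall-clock time,
// and no delay is inserted between polls.
async function pollAfterResume(
  resumedStepId: string,
  getState: () => Promise<Snapshot>,
  maxPolls = 5
): Promise<PollOutcome> {
  for (let i = 0; i < maxPolls; i++) {
    const snapshot = await getState();
    const waitingId = snapshot.waiting_input?.waiting_step_id;
    if (waitingId === undefined) {
      // No pending wait: the workflow progressed past the resumed step.
      return { kind: 'moved_on', status: snapshot.status };
    }
    if (waitingId !== resumedStepId) {
      // A different step is now waiting: a real second pause.
      return { kind: 'new_wait', stepId: waitingId };
    }
    // Same step id: treat as a stale snapshot and poll again.
  }
  return { kind: 'still_stale', polls: maxPolls };
}
```

The `still_stale` outcome gives the caller an explicit place to stop and report that the resume has not been observed yet, instead of inventing a second approval request.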
Starting backport for target branches: 9.4 https://github.com/elastic/kibana/actions/runs/25316411854
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI. Questions? Please refer to the Backport tool documentation
…64605) (#267459)
# Backport
This will backport the following commits from `main` to `9.4`:
- [[One Workflow] HITL reliable resume in chat and in execs UI (#264605)](#264605)

### Questions?
Please refer to the [Backport tool documentation](https://github.com/sorenlouv/backport)

Source commit 87004a5da3f1b2df718a5b52419205043b280a64, committed 2026-05-04T11:28:28Z.

Co-authored-by: Hashem <11019955+h88@users.noreply.github.com>
…264605)
Users (and agents) get fewer false second approval moments, faster resume pickup and a working Resume button on back-to-back pauses.
- Engine: after resume is scheduled, nudge Task Manager the same way as cancel so the resume task is picked up sooner
- Agent-facing state: expose waiting_step_id so the model can tell (same vs new waiting step) when polling status
- Prompt: resume + workflow tool text explains async resume behaviour & suggests status polling instead of inventing a second HITL without evidence
- WF exec UI: re-enable resume when the paused step execution id changes (new wait).

Scope

_Note: No change to resume API contract or step semantics_

Fixes elastic/security-team#16933
Users (and agents) get fewer false second approval moments, faster resume pickup and a working Resume button on back-to-back pauses.
Scope
Note: No change to resume API contract or step semantics
Testing workflow:
Fixes https://github.com/elastic/security-team/issues/16933