feat: add keep_alive to async task status #144010

Merged
thecoop merged 12 commits into elastic:main from lxrbox:issue-142861-tasks-keep-alive on Mar 16, 2026

@lxrbox
Contributor

@lxrbox lxrbox commented Mar 11, 2026

Add keep_alive field to task status for async search and ES|QL query tasks. The keep_alive value is now tracked in the task state and exposed via the task status API, allowing clients to observe the current keep_alive setting for running async operations.

  • Update AsyncTask interface to track keep_alive alongside expiration
  • Expose keep_alive in AsyncSearchTask and EsqlQueryTask status
  • Maintain backwards compatibility in task status serialization
  • Add integration tests verifying keep_alive in task info
  • Have you signed the contributor license agreement?
  • Have you followed the contributor guidelines?
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against main? Unless there is a good reason otherwise, we prefer pull requests against main and will backport as needed.
  • If submitting code, have you checked that your submission is for an OS and architecture that we support?
  • If you are submitting this code for a class then read our policy for that.

@cla-checker-service

cla-checker-service bot commented Mar 11, 2026

💚 CLA has been signed

@elasticsearchmachine added the v9.4.0, needs:triage, and external-contributor labels Mar 11, 2026
@lxrbox
Contributor Author

lxrbox commented Mar 11, 2026

CLA is signed. Could an Elastic maintainer please add an assignee and the
appropriate :Team / >type labels? CONTRIBUTING.md says changelog entries
are usually auto-created for external contributors, so I’ll wait unless you’d
prefer me to add docs/changelog/144010.yaml manually.

@szybia added the :Search Relevance/ES|QL label and removed the needs:triage label Mar 12, 2026
@elasticsearchmachine added the Team:Search Relevance label Mar 12, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@szybia added the >enhancement label and removed the Team:Search Relevance label Mar 12, 2026
@elasticsearchmachine added the Team:Search Relevance label Mar 12, 2026
@szybia
Contributor

szybia commented Mar 12, 2026

hi @lxrbox, thank you for the contribution!

hopefully someone from Search Relevance has some time to look at this.

tagging relevant issue: #142861

@thecoop
Member

thecoop commented Mar 12, 2026

buildkite test this please

@thecoop
Member

thecoop commented Mar 12, 2026

@lxrbox there's several test failures, most specifically from :server:generateClusterFeaturesMetadata. Could you update from main, and fix up the test failures? You may need to re-generate the transport version number

@lxrbox
Contributor Author

lxrbox commented Mar 13, 2026

> @lxrbox there's several test failures, most specifically from :server:generateClusterFeaturesMetadata. Could you update from main, and fix up the test failures? You may need to re-generate the transport version number

Thanks for flagging this. I'll update from main, regenerate the transport version, and fix the failing checks.

@lxrbox lxrbox force-pushed the issue-142861-tasks-keep-alive branch from 04cec53 to a53f32f March 13, 2026 01:17
@thecoop
Member

thecoop commented Mar 13, 2026

buildkite test this please

var targetTaskId = AsyncExecutionId.decode(asyncSearchId).getTaskId();
TaskInfo found = null;
for (TaskInfo taskInfo : client().admin().cluster().prepareListTasks().setDetailed(true).get().getTasks()) {
    if (taskInfo.taskId().equals(targetTaskId)) {
Member

Could this be a stream? Something like ...getTasks().stream().filter(t -> t.taskId().equals(targetTaskId)).findAny()
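
For readers following along, the stream-based lookup suggested here can be sketched with stand-in types. This is only an illustration, not the code from the PR: `TaskId`, `TaskInfo`, and `findTask` below are hypothetical records and helpers defined for the example, and they mirror just the `taskId()` accessor used in the loop under review.

```java
import java.util.List;
import java.util.Optional;

public class TaskLookupSketch {
    // Hypothetical stand-ins for the real Elasticsearch types; only their shape matters here.
    record TaskId(String nodeId, long id) {}
    record TaskInfo(TaskId taskId, String action) {}

    // Stream-based replacement for a "loop and remember the match" helper.
    static Optional<TaskInfo> findTask(List<TaskInfo> tasks, TaskId target) {
        return tasks.stream()
            .filter(t -> t.taskId().equals(target))
            .findAny();
    }

    public static void main(String[] args) {
        List<TaskInfo> tasks = List.of(
            new TaskInfo(new TaskId("node1", 1L), "action/a"),
            new TaskInfo(new TaskId("node1", 2L), "action/b")
        );
        // A task that is in the list is found; an unknown id yields an empty Optional.
        System.out.println(findTask(tasks, new TaskId("node1", 2L)).isPresent());
        System.out.println(findTask(tasks, new TaskId("node2", 9L)).isPresent());
    }
}
```

Returning an `Optional` rather than a nullable `found` variable lets the calling test assert presence directly instead of checking for null after the loop.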

try (AsyncSearchTask task = createAsyncSearchTask()) {
    TaskInfo taskInfo = task.taskInfo("node1", true);
    assertThat(taskInfo.status(), notNullValue());
    assertThat(taskInfo.toString(), containsString("\"request_id\" : \"" + task.getExecutionId().getEncoded() + "\""));
Member

nit - these can be hasToString(containsString(...))

List.of(new NamedWriteableRegistry.Entry(Task.Status.class, RawTaskStatus.NAME, RawTaskStatus::new))
);
TaskInfo taskInfo = task.taskInfo("node1", true);
TransportVersion previousVersion = TransportVersionUtils.getPreviousVersion(TransportVersion.current());
Member

@thecoop thecoop Mar 13, 2026

You need to get the transport version immediately before the specific change you're introducing here, not the one before current() (which will keep moving forward). You can use the randomVersionNotSupporting method for this.
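
To make the concern concrete, here is a minimal, self-contained sketch of a version-gated field. It does not use Elasticsearch's `TransportVersion` classes: plain integers stand in for wire versions, and `KEEP_ALIVE_ADDED`, `Status`, `write`, and `read` are hypothetical names invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class VersionGateSketch {
    // Hypothetical wire-version id assigned to the keep_alive change; it never changes afterwards.
    static final int KEEP_ALIVE_ADDED = 105;

    // Hypothetical status carrying an expiration time and the newly added keep-alive (millis).
    record Status(long expirationMillis, long keepAliveMillis) {}

    // Writers only emit the new field to peers that understand it.
    static byte[] write(Status s, int peerVersion) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeLong(s.expirationMillis());
            if (peerVersion >= KEEP_ALIVE_ADDED) {
                out.writeLong(s.keepAliveMillis());
            }
        }
        return bytes.toByteArray();
    }

    // Readers fall back to -1 ("unknown") when the peer is too old to have sent the field.
    static Status read(byte[] data, int peerVersion) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            long expiration = in.readLong();
            long keepAlive = peerVersion >= KEEP_ALIVE_ADDED ? in.readLong() : -1L;
            return new Status(expiration, keepAlive);
        }
    }

    public static void main(String[] args) throws IOException {
        Status original = new Status(1_000L, 172_800_000L); // keep-alive of two days
        int lastUnsupported = KEEP_ALIVE_ADDED - 1; // pinned to the feature, not to "current"
        int current = 110;                          // moves on as unrelated changes land

        // Pinned boundary: the old-format path is exercised, keep-alive comes back as -1.
        System.out.println(read(write(original, lastUnsupported), lastUnsupported));
        // "Previous to current" already supports the field, so the old path is never tested.
        System.out.println(read(write(original, current - 1), current - 1));
    }
}
```

In this sketch the second round trip returns the full status, which shows why a test built on "previous to current" silently stops covering the backwards-compatibility branch once other versions are added after the feature's own.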

new AsyncExecutionId(randomAlphaOfLength(10), new TaskId(randomAlphaOfLength(4), randomNonNegativeLong())),
TimeValue.timeValueDays(2)
);
TransportVersion previousVersion = TransportVersionUtils.getPreviousVersion(TransportVersion.current());
Member

See comment on TransportVersion checks above

@thecoop thecoop self-assigned this Mar 13, 2026
@thecoop
Member

thecoop commented Mar 13, 2026

Looking pretty good. A couple more comments, and this also needs a changelog, and an update from main, and then we're good to go

@lxrbox lxrbox force-pushed the issue-142861-tasks-keep-alive branch from a53f32f to 5ecd818 March 13, 2026 15:24
@lxrbox
Contributor Author

lxrbox commented Mar 13, 2026

> Looking pretty good. A couple more comments, and this also needs a changelog, and an update from main, and then we're good to go

Addressed the remaining comments:

  • switched the task lookup helper to a stream
  • updated the transport-version BWC checks to use randomVersionNotSupporting(...) with the feature-specific transport version
  • added a changelog
  • rebased onto main

@thecoop
Member

thecoop commented Mar 13, 2026

The changelog needs to be for this PR number, #144010

@lxrbox lxrbox force-pushed the issue-142861-tasks-keep-alive branch from 5ecd818 to 500e696 March 13, 2026 15:40
@lxrbox
Contributor Author

lxrbox commented Mar 13, 2026

> The changelog needs to be for this PR number, #144010

Updated the changelog to use this PR number (#144010).

@thecoop
Member

thecoop commented Mar 13, 2026

buildkite test this

lxrbox added 5 commits March 14, 2026 07:25
Add keep_alive field to task status for async search and ES|QL query
tasks. The keep_alive value is now tracked in the task state and
exposed via the task status API, allowing clients to observe the
current keep_alive setting for running async operations.

- Update AsyncTask interface to track keep_alive alongside expiration
- Expose keep_alive in AsyncSearchTask and EsqlQueryTask status
- Maintain backwards compatibility in task status serialization
- Add integration tests verifying keep_alive in task info

Use hasEntry with allOf in AsyncSearchTaskTests to assert the
serialized RawTaskStatus map more explicitly and avoid repeated
toMap() calls.

Use a feature-specific transport version for async task status
compatibility checks instead of deriving from current().

Also apply the follow-up review cleanups by switching the task
lookup helper to a stream-based implementation, using hasToString
assertions, and adding a changelog entry.

Rename the changelog entry to match the actual PR number so the
community PR validation passes.

@lxrbox lxrbox force-pushed the issue-142861-tasks-keep-alive branch from bd8f1c0 to deacdfb March 13, 2026 23:30
@lxrbox
Contributor Author

lxrbox commented Mar 13, 2026

updated the transport-version checks to use the feature-specific version boundary

@thecoop
Member

thecoop commented Mar 16, 2026

buildkite test this

@thecoop thecoop merged commit 556119e into elastic:main Mar 16, 2026
10 checks passed
@thecoop
Member

thecoop commented Mar 16, 2026

All merged, thanks for the contribution!

@lxrbox
Contributor Author

lxrbox commented Mar 16, 2026

> All merged, thanks for the contribution!

Thanks for the review and merge! Glad to contribute.

szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 16, 2026
michalborek pushed a commit to michalborek/elasticsearch that referenced this pull request Mar 23, 2026
Add keep_alive field to task status for async search and ES|QL query
tasks. The keep_alive value is now tracked in the task state and
exposed via the task status API, allowing clients to observe the
current keep_alive setting for running async operations.

---------

Co-authored-by: Simon Cooper <simon.cooper@elastic.co>
Labels

>enhancement, external-contributor, :Search Relevance/ES|QL, Team:Search Relevance, v9.4.0

4 participants