[Inference] regex worker for anonymization#227113

Merged
neptunian merged 25 commits into elastic:main from
neptunian:anonymization-regex-worker
Jul 11, 2025

@neptunian neptunian commented Jul 8, 2025

Summary

Followup to #225539

Adds an anonymization-regex worker pool that runs user-defined regex patterns in worker threads, off the main thread, protecting the Node.js event loop from catastrophic backtracking.
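
To illustrate the idea, here is a minimal sketch of running one regex in a Node.js worker thread and abandoning it after a time limit. It is not the code from this PR: `matchInWorker`, the inline worker source, and the one-worker-per-call design are assumptions made for brevity, whereas the PR describes a pool of workers.

```typescript
import { Worker } from 'node:worker_threads';

// Hypothetical worker body, passed as a string so the sketch stays in one file.
// With `eval: true` the string runs as a CommonJS script inside the worker.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const { pattern, flags, text } = workerData;
  const matches = text.match(new RegExp(pattern, flags)) || [];
  parentPort.postMessage(Array.from(matches));
`;

// Runs `pattern` against `text` in a worker thread. If no result arrives
// within `taskTimeoutMs`, the worker is terminated and the promise rejects.
function matchInWorker(
  pattern: string,
  flags: string,
  text: string,
  taskTimeoutMs: number
): Promise<string[]> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { pattern, flags, text },
    });

    const timer = setTimeout(() => {
      // terminate() stops JavaScript execution in the worker thread.
      void worker.terminate();
      reject(new Error(`regex task exceeded ${taskTimeoutMs} ms`));
    }, taskTimeoutMs);

    worker.once('message', (matches: string[]) => {
      clearTimeout(timer);
      resolve(matches);
    });
    worker.once('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
  });
}
```

Because the match runs on another thread, a pathological pattern can only stall that worker, and the timeout bounds how long the caller waits. A pool avoids the thread start-up cost that this sketch pays on every call.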

For internal use we have config to adjust or turn off the worker:
• `enabled` – turn worker on/off
• `minThreads` / `maxThreads` – worker size
• `idleTimeout` – how long a worker thread is allowed to be idle
• `taskTimeout` – per-regex task limit
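
As a rough sketch of how these five settings might be typed and sanity-checked, consider the following. Only the setting names come from the description above; the interface name, the default values, the millisecond units, and `resolveRegexWorkerConfig` are hypothetical and are not taken from the PR.

```typescript
// Setting names follow the PR description; everything else is assumed.
interface RegexWorkerConfig {
  enabled: boolean; // turn the worker on/off
  minThreads: number; // lower bound on pool size
  maxThreads: number; // upper bound on pool size
  idleTimeout: number; // assumed unit: milliseconds
  taskTimeout: number; // assumed unit: milliseconds
}

// Hypothetical defaults, chosen only so the example runs.
const DEFAULT_REGEX_WORKER_CONFIG: RegexWorkerConfig = {
  enabled: true,
  minThreads: 0,
  maxThreads: 3,
  idleTimeout: 30_000,
  taskTimeout: 15_000,
};

// Merges overrides onto the defaults and rejects inconsistent values.
function resolveRegexWorkerConfig(
  overrides: Partial<RegexWorkerConfig> = {}
): RegexWorkerConfig {
  const config = { ...DEFAULT_REGEX_WORKER_CONFIG, ...overrides };
  if (config.minThreads < 0 || config.maxThreads < 1) {
    throw new RangeError('minThreads must be >= 0 and maxThreads must be >= 1');
  }
  if (config.minThreads > config.maxThreads) {
    throw new RangeError('minThreads must not exceed maxThreads');
  }
  if (config.idleTimeout < 0 || config.taskTimeout <= 0) {
    throw new RangeError('idleTimeout must be >= 0 and taskTimeout must be > 0');
  }
  return config;
}
```

For example, `resolveRegexWorkerConfig({ maxThreads: 8 })` keeps the other defaults, while `resolveRegexWorkerConfig({ minThreads: 5, maxThreads: 2 })` throws a `RangeError`.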

Contributions by windsurf(o3).

@neptunian neptunian marked this pull request as ready for review July 8, 2025 23:37
@neptunian neptunian requested a review from a team as a code owner July 8, 2025 23:37
@neptunian neptunian added the labels backport:version (Backport to applied version labels), release_note:skip (Skip the PR/issue when compiling release notes), v8.19.0, v9.1.0, and Team:Obs AI Assistant (Observability AI Assistant) Jul 8, 2025
@elasticmachine

Pinging @elastic/obs-ai-assistant (Team:Obs AI Assistant)

@pgayvallet pgayvallet left a comment

just passing by

Comment on lines 8 to 9
// eslint-disable-next-line @kbn/imports/no_boundary_crossing
require('../../../../../../../../src/setup_node_env');

@pgayvallet pgayvallet Jul 9, 2025

🙈 we really have zero support for workers today

],
"kbn_references": [
{
"path": "../../../../../src/setup_node_env/tsconfig.json"

Why do we need this?

@neptunian neptunian Jul 9, 2025

setup_node_env causes type errors without it.

@dgieselaar dgieselaar left a comment

Approving ahead of time, but please take a look at `x-pack/platform/plugins/shared/inference/server/chat_complete/anonymization/regex_worker_wrapper.js`, specifically the `/dist` import. Thanks for getting this done so quickly!

@neptunian neptunian requested a review from a team as a code owner July 11, 2025 13:19
@elasticmachine

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #94 / GlobalSearch API GlobalSearch providers "before all" hook in "GlobalSearch providers"

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/inference-common | 161 | 164 | +3 |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/inference-common | 302 | 305 | +3 |

ESLint disabled in files

| id | before | after | diff |
| --- | --- | --- | --- |
| inference | 2 | 4 | +2 |

ESLint disabled line counts

| id | before | after | diff |
| --- | --- | --- | --- |
| inference | 1 | 2 | +1 |

Total ESLint disabled count

| id | before | after | diff |
| --- | --- | --- | --- |
| inference | 3 | 6 | +3 |

@neptunian neptunian merged commit 1b9063a into elastic:main Jul 11, 2025
14 checks passed
@kibanamachine

Starting backport for target branches: 8.19, 9.1

https://github.com/elastic/kibana/actions/runs/16229533879

kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jul 11, 2025
(cherry picked from commit 1b9063a)
kibanamachine pushed a commit to kibanamachine/kibana that referenced this pull request Jul 11, 2025
(cherry picked from commit 1b9063a)
@kibanamachine

💚 All backports created successfully

Status Branch Result
8.19
9.1

Note: Successful backport PRs will be merged automatically after passing CI.

Questions ?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Jul 11, 2025
# Backport

This will backport the following commits from `main` to `9.1`:
- [[Inference] regex worker for anonymization
(#227113)](#227113)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Sandra G <neptunian@users.noreply.github.com>
kibanamachine added a commit that referenced this pull request Jul 11, 2025
# Backport

This will backport the following commits from `main` to `8.19`:
- [[Inference] regex worker for anonymization
(#227113)](#227113)

<!--- Backport version: 9.6.6 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sorenlouv/backport)

Co-authored-by: Sandra G <neptunian@users.noreply.github.com>
kertal pushed a commit to kertal/kibana that referenced this pull request Jul 25, 2025

Labels

backport:version (Backport to applied version labels), release_note:skip (Skip the PR/issue when compiling release notes), Team:Obs AI Assistant (Observability AI Assistant), v8.19.0, v9.1.0, v9.2.0

5 participants