
Refactor - extract ingest processors functionality into separate libraries #140206

Merged
eyalkoren merged 21 commits into elastic:main from eyalkoren:refactor-extract-ingest-processors-functionality
Feb 9, 2026

@eyalkoren
Contributor

eyalkoren commented Jan 6, 2026

As with grok and dissect, there is additional functionality we want to share between ingest processors and ES|QL commands.

This PR extracts the core functionality from the following processors into a shared library:

  • UriPartsProcessor
  • UserAgentProcessor (done in a separate PR)
  • RegisteredDomainProcessor
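
For illustration only, here is a minimal sketch of what "extracting the core functionality" could look like for URI parsing: a small stateless helper with no ingest-specific types, so that both an ingest processor and an ES|QL command could call it. The class name, method name, and output keys below are hypothetical and are not taken from this PR; the sketch relies only on `java.net.URI` from the JDK.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical shared helper. Names and output keys are illustrative assumptions,
// not the API introduced by this PR.
final class UriPartsSketch {

    private UriPartsSketch() {}

    /** Splits a URI string into named parts; parts that are absent are omitted. */
    static Map<String, Object> parse(String value) {
        final URI uri;
        try {
            uri = new URI(value);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("unable to parse URI [" + value + "]", e);
        }
        Map<String, Object> parts = new LinkedHashMap<>();
        putIfPresent(parts, "scheme", uri.getScheme());
        putIfPresent(parts, "domain", uri.getHost());
        if (uri.getPort() != -1) {
            // java.net.URI reports -1 when no port is given
            parts.put("port", uri.getPort());
        }
        putIfPresent(parts, "path", uri.getPath());
        putIfPresent(parts, "query", uri.getQuery());
        putIfPresent(parts, "fragment", uri.getFragment());
        return parts;
    }

    private static void putIfPresent(Map<String, Object> parts, String key, String value) {
        if (value != null && !value.isEmpty()) {
            parts.put(key, value);
        }
    }
}
```

With a helper of this shape, a processor would only need to read the source field, call `UriPartsSketch.parse(...)`, and write the resulting map to the target field, while any other caller could reuse the same parsing logic without depending on ingest classes.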

Notes for reviewer

  • As part of the RegisteredDomainProcessor refactor, Apache's httpclient dependencies were moved from ingest-common to the new web-utils module. The functionality RegisteredDomain uses httpclient for requires a file-read permission. While httpclient was located in ingest-common, this permission was granted by the entitlements system because httpclient was loaded within a PLUGIN scope. Now that it is located in a separate library, it gets the UNKNOWN scope in tests. I fixed that with a specific exception through TestScopeResolver. I expected to see the same issue in integration tests, but I added integration tests for the registered_domain processor and they pass. So I am not sure what handles this in production, but it seems to be OK. Please verify that I am not missing anything.

eyalkoren mentioned this pull request Jan 8, 2026
eyalkoren requested a review from masseyke February 3, 2026 14:39
eyalkoren marked this pull request as ready for review February 3, 2026 14:40
elasticsearchmachine added the needs:triage label Feb 3, 2026
eyalkoren added the >feature and :Distributed/Ingest Node labels Feb 3, 2026
elasticsearchmachine added the Team:Distributed label and removed the needs:triage label Feb 3, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-distributed (Team:Distributed)

masseyke requested a review from a team February 4, 2026 21:02
@masseyke
Member

masseyke commented Feb 4, 2026

Aside from the questions I added, it looks like what we had discussed. I'm tagging es-core-infra for review since I've never added a new thing under libs, and don't actually understand what's going on in the build.gradle file.

eyalkoren requested a review from rjernst February 5, 2026 16:44
Member

rjernst left a comment


Build changes LGTM

eyalkoren enabled auto-merge (squash) February 9, 2026 14:40
eyalkoren merged commit df478e8 into elastic:main Feb 9, 2026
35 checks passed

Labels

  • :Distributed/Ingest Node (Execution or management of Ingest Pipelines)
  • >feature
  • Team:Distributed (Meta label for distributed team.)
  • v9.4.0


5 participants