[processor/enrichmentprocessor] first version of enrichmentprocessor #42056
kyo-ke wants to merge 24 commits into
|
Pushing this code is premature. We need more discussion on the merits of this approach over adding a detector to resourcedetectionprocessor. |
|
@atoulme Thank you for taking a look at this PR! This processor enriches attributes at the individual log, metric, and span level using an external data source. If there is a plan to expand resourcedetectionprocessor to enrich each metric, span, and log, I totally agree with adding this functionality to resourcedetectionprocessor. Is there any plan for this? |
|
Can you please join a SIG meeting and discuss this with the community? That would help move things forward. Thanks! |
|
In the SIG meeting (Aug 26, 2025), I got 2 pieces of feedback
|
I have some thoughts on 1. I think it's fair to expect situations where one ResourceMetrics contains multiple db.namespace values and the user wants to enrich datapoint-level attributes by owner. Maybe the user needs to use groupbyattrsprocessor in this case. Another case where we need to enrich datapoint-level attributes is adding a username based on a userid. So I think there is a use case for enriching datapoint/log/span-level attributes. |
|
@kyo-ke 👋 I'm back from vacation and noticed this PR and its movement and discussion in a past SIG, would love to chat about this and see if we can come up with a joint proposal that can be sponsored so we can move on with implementation. I'm on the community slack as João Duarte (EU timezone). |
@jsvd Thank you for taking a look at this. To answer this question we need to clarify 2 things
For 1. For 2. IMO always regrouping for this type of attribute is too much. Correct me if I'm misunderstanding the concept. It would be helpful if you could share your opinion/use case. Thanks! |
I do think enrichment should be possible across the entire set of attributes of each signal: This is particularly important for logs, where log records can contain events with properties related to entities capable of being mapped into other values.
IMO the processor should iterate over the maps and perform the lookup whenever the attributes are found. A performance improvement could be done to focus the traversal on resource attributes only if a configuration such as |
|
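The traversal suggested in the comment above (walk each attribute map and perform the lookup wherever the matching attribute is found) could be sketched roughly as follows. This is only an illustration under stated assumptions: `Attrs`, `Rule`, `Enrich`, and `EnrichAll` are hypothetical names, a plain Go map stands in for the Collector's attribute maps, and the `user.id`/`user.name` keys are made up for the example. It is not the code in this PR.

```go
package main

import "fmt"

// Attrs is a hypothetical stand-in for an attribute map (resource, span,
// log record, or datapoint attributes).
type Attrs map[string]string

// Rule is a hypothetical enrichment rule: when the attribute MatchKey is
// present, its value is looked up in Table and the fields found are added.
type Rule struct {
	MatchKey string
	Table    map[string]map[string]string // lookup value -> fields to add
}

// Enrich applies the rule to one attribute map and reports whether a
// lookup hit occurred. Attributes that already exist are not overwritten.
func Enrich(attrs Attrs, rule Rule) bool {
	value, ok := attrs[rule.MatchKey]
	if !ok {
		return false
	}
	fields, ok := rule.Table[value]
	if !ok {
		return false
	}
	for k, v := range fields {
		if _, exists := attrs[k]; !exists {
			attrs[k] = v
		}
	}
	return true
}

// EnrichAll walks the resource attributes and then every record's
// attributes, returning how many maps had a lookup hit.
func EnrichAll(resource Attrs, records []Attrs, rule Rule) int {
	hits := 0
	if Enrich(resource, rule) {
		hits++
	}
	for _, rec := range records {
		if Enrich(rec, rule) {
			hits++
		}
	}
	return hits
}

func main() {
	rule := Rule{
		MatchKey: "user.id",
		Table:    map[string]map[string]string{"42": {"user.name": "alice"}},
	}
	resource := Attrs{"service.name": "shop"}
	records := []Attrs{{"user.id": "42"}, {"user.id": "7"}}
	fmt.Println(EnrichAll(resource, records, rule), records[0]["user.name"])
}
```

Restricting the walk to resource attributes only, as the comment proposes for performance, would amount to skipping the loop over `records`.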
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Thank you @jsvd for the comment. Do you think we can get a sponsor for this by bringing it up in a SIG meeting? |
|
Hi @kyo-ke, I had a parallel effort to add lookup-based enrichment to the Collector by following the recommended contribution process as closely as possible, which started with a "formal" proposal: #41816. This proposal suggests the creation of a processor that performs enrichment on signals, with the sources for the data (or metadata) coming from extensions to the processor. These extensions would have a common interface and behave in a similar manner (e.g. caching, error handling, metrics, etc.). Example lookup extensions could be: HTTP lookups, lookups in CSV/JSON/YAML, lookups in databases, etc. IMO #41816 has several similarities to this effort and it'd be nice to work together on delivering this feature. I haven't put up a "meaty" PR yet as that proposal doesn't have a sponsor either, but I added a skeleton PR #43120 to show how the interfaces would work together. The proposal has been brought up in a couple of SIG meetings looking for sponsorship, and while it has been collecting +1s and ❤️s, I haven't had any luck either, but I'm not losing hope :) Sorry for the direct ping here @atoulme, but I'm wondering if you'd be interested in getting involved in this effort, either through @kyo-ke's or my proposal (or a joint one). The proposal at #41816 is the 2nd/3rd most voted open proposal looking for sponsorship at this point, out of nearly 30. This demonstrates that lookup-based enrichment is a sought-after feature in the Collector, and I'd like to avoid having a single person (or company) own the feature. Happy to discuss this in any channel or medium. |
|
This PR was marked stale due to lack of activity. It will be closed in 14 days. |
|
Closed as inactive. Feel free to reopen if this PR is still being worked on. |
Description
New component: enrichmentprocessor.
This processor can enrich attributes for all three signals (logs, metrics, traces) using external data loaded from a file or over HTTP (CSV/JSON).
The component keeps monitoring the file/endpoint for changes.
Link to tracking issue
#41816
#40526
Testing
Unit tests are added with 79% coverage.
Documentation
Each enrichment for a datapoint/span/log record is done in constant order.
Internally, the processor holds each line/object of the CSV/JSON as an element of an array, and it builds an index over them when it loads the data (lookup.go).
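The array-plus-index layout described above might look roughly like the following. This is a minimal sketch, not the contents of lookup.go: `Table`, `LoadCSV`, and `Lookup` are hypothetical names, and it assumes a CSV with a header row and a single key column.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// Table keeps every CSV row in a slice, plus an index that maps the key
// column's value to the row's position. All names are hypothetical.
type Table struct {
	header []string
	rows   [][]string
	index  map[string]int // key value -> position in rows
}

// LoadCSV parses CSV text and builds the index on keyColumn at load time.
func LoadCSV(data, keyColumn string) (*Table, error) {
	records, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(records) == 0 {
		return nil, fmt.Errorf("empty CSV")
	}
	header := records[0]
	keyPos := -1
	for i, name := range header {
		if name == keyColumn {
			keyPos = i
			break
		}
	}
	if keyPos < 0 {
		return nil, fmt.Errorf("key column %q not found", keyColumn)
	}
	t := &Table{header: header, rows: records[1:], index: make(map[string]int)}
	for i, row := range t.rows {
		t.index[row[keyPos]] = i // a later duplicate key overwrites an earlier one
	}
	return t, nil
}

// Lookup finds the row for key through the index (a single map access)
// and returns it as a column -> value map.
func (t *Table) Lookup(key string) (map[string]string, bool) {
	i, ok := t.index[key]
	if !ok {
		return nil, false
	}
	out := make(map[string]string, len(t.header))
	for c, name := range t.header {
		out[name] = t.rows[i][c]
	}
	return out, true
}

func main() {
	t, err := LoadCSV("db.namespace,owner\norders,team-a\nbilling,team-b\n", "db.namespace")
	if err != nil {
		panic(err)
	}
	row, ok := t.Lookup("billing")
	fmt.Println(row["owner"], ok)
}
```

Because the index is built once at load time, each lookup is a single map access rather than a scan over all rows.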