[pdata/pprofile] Add reference based attributes support #14546

Merged
mx-psi merged 36 commits into open-telemetry:main from florianl:pprofile-refs on Mar 11, 2026

@florianl
Member

@florianl florianl commented Feb 9, 2026

Description

Draft implementation for open-telemetry/opentelemetry-proto#733 that shows the transparent conversion between reference based attributes and the pdata API.
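
To illustrate what a transparent conversion could look like on the receiving side, here is a minimal sketch. It is not the code from this PR: `wireKeyValue`, `resolveKeys`, and the convention that index 0 of the string table holds the empty string (so `KeyRef == 0` means "no reference") are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// wireKeyValue is a hypothetical stand-in for an attribute as it arrives on
// the wire: either Key is set inline, or KeyRef points into a string table.
type wireKeyValue struct {
	Key    string
	KeyRef int32
	Value  string
}

var errBadRef = errors.New("key_ref out of range")

// resolveKeys rewrites reference based attributes in place so that callers
// only ever see plain string keys. Assumption: stringTable[0] is "" and a
// KeyRef of 0 therefore means the key was sent inline.
func resolveKeys(attrs []wireKeyValue, stringTable []string) error {
	for i := range attrs {
		ref := attrs[i].KeyRef
		if ref == 0 {
			continue // inline key, nothing to resolve
		}
		if ref < 0 || int(ref) >= len(stringTable) {
			return fmt.Errorf("%w: %d", errBadRef, ref)
		}
		attrs[i].Key = stringTable[ref]
		attrs[i].KeyRef = 0
	}
	return nil
}

func main() {
	table := []string{"", "service.name", "service.version"}
	attrs := []wireKeyValue{
		{KeyRef: 1, Value: "checkout"},
		{Key: "inline.key", Value: "x"},
	}
	if err := resolveKeys(attrs, table); err != nil {
		panic(err)
	}
	fmt.Println(attrs[0].Key, attrs[1].Key)
}
```

With a step like this at the unmarshalling boundary, code that reads attributes through a key/value API would not need to know that references were used on the wire.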

With open-telemetry/opentelemetry-collector-contrib#46331 there is the respective change that fixes the failing tests in connector/countconnector.

FYI: @open-telemetry/profiling-approvers @tigrannajaryan @bogdandrutu

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@codecov

codecov Bot commented Feb 9, 2026

Codecov Report

❌ Patch coverage is 97.41935% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.47%. Comparing base (92c4252) to head (9a4eb4c).
⚠️ Report is 8 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pdata/internal/generated_proto_anyvalue.go | 92.00% | 4 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14546      +/-   ##
==========================================
+ Coverage   91.46%   91.47%   +0.01%     
==========================================
  Files         688      689       +1     
  Lines       43824    43975     +151     
==========================================
+ Hits        40084    40228     +144     
+ Misses       2629     2627       -2     
- Partials     1111     1120       +9     

florianl and others added 3 commits February 9, 2026 11:31
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@tigrannajaryan
Member

@florianl have you been able to run any benchmarks? I would be curious to see what's the end-to-end performance impact when using this approach vs the implementation that doesn't resolve references in a config with a bare otlp-receiver/otlp-exporter pipeline.

florianl and others added 2 commits February 11, 2026 14:13
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@florianl

This comment was marked as outdated.

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@florianl

This comment was marked as outdated.

@tigrannajaryan
Member

ProfilesFromProto-20 9.317µ ± 5% 8.348µ ± 12% -10.40% (p=0.007 n=10)

@florianl can you please help interpret this? After the change the code is doing strictly more work, if I understand correctly. So how can it be 10% faster?

@florianl
Member Author

florianl commented Feb 18, 2026

ProfilesFromProto-20 9.317µ ± 5% 8.348µ ± 12% -10.40% (p=0.007 n=10)

can you please help interpret this? After the change the code is doing strictly more work if I understand correctly.

BenchmarkProfilesToProto and BenchmarkProfilesFromProto are two existing benchmarks in the codebase. Both use the function generateBenchmarkProfiles that generates profiles with a given number of samples - no attributes, no resources, no dictionary entries. So in these cases no conversion is happening as there is no data that could be converted. I kept these two benchmarks as they are part of the codebase already and if the new code introduces a regression, I wanted to see that.

So how can it be 10% faster?

My assumption is that the compiler restructures the code after the change, and that for "empty" profiles (no attributes, no resources, no use of the dictionary) the result happens to perform better.
For that reason, I wrote new benchmarks (see the gist) that populate resource attributes with a set of valid attributes, such as semconv.ServiceNameKey, semconv.ServiceVersionKey, semconv.ProcessPIDKey, semconv.K8SPodNameKey, semconv.K8SNamespaceNameKey and semconv.TelemetrySDKNameKey. In these cases the benchmarks use the new code to convert from/to reference based attributes.

@tigrannajaryan I hope this helps to put the benchmark results into perspective. I'm happy to add these new benchmarks to the codebase - please let me know.
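
The difference between the two kinds of benchmark input can be sketched as follows. This is an illustration only, not the benchmarks from the gist: `attr`, `profile`, `generateEmpty`, `generatePopulated`, and `countConversions` are hypothetical names, and the attribute values are made up. The keys are the string forms of the semconv constants listed above.

```go
package main

import (
	"fmt"
	"testing"
)

// attr and profile are hypothetical stand-ins, not the pdata types.
type attr struct{ Key, Value string }

type profile struct {
	Samples int
	Attrs   []attr
}

// generateEmpty mimics the described generateBenchmarkProfiles: samples only.
func generateEmpty(n int) []profile {
	ps := make([]profile, n)
	for i := range ps {
		ps[i].Samples = 1
	}
	return ps
}

// generatePopulated adds six resource-style attributes to every profile.
func generatePopulated(n int) []profile {
	ps := generateEmpty(n)
	for i := range ps {
		ps[i].Attrs = []attr{
			{"service.name", "checkout"},
			{"service.version", "1.2.3"},
			{"process.pid", "4242"},
			{"k8s.pod.name", "checkout-0"},
			{"k8s.namespace.name", "shop"},
			{"telemetry.sdk.name", "opentelemetry"},
		}
	}
	return ps
}

// countConversions returns how many attribute keys a reference based
// encoder would have to look up for this input.
func countConversions(ps []profile) int {
	n := 0
	for _, p := range ps {
		n += len(p.Attrs)
	}
	return n
}

// sink keeps the benchmark loop from being optimized away.
var sink int

func main() {
	inputs := []struct {
		name string
		data []profile
	}{
		{"empty", generateEmpty(100)},
		{"populated", generatePopulated(100)},
	}
	for _, in := range inputs {
		res := testing.Benchmark(func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				sink = countConversions(in.data)
			}
		})
		fmt.Printf("%-10s conversions=%d %s\n", in.name, countConversions(in.data), res)
	}
}
```

The empty input reports zero conversions, so a benchmark built on it can only detect regressions in the surrounding code, not the cost of the conversion itself.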

@tigrannajaryan
Member

@tigrannajaryan I hope this helps to put the benchmark results into perspective. I'm happy to add these new benchmarks to the codebase - please let me know.

Yes, if you think the new benchmarks are a more valid comparison it would be good to see the results.

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@florianl
Member Author

The results of these new benchmarks are shown in #14546 (comment) - see MarshalProfiles* and UnmarshalProfiles*. With fdab8d7 I have added them to the codebase, so they can catch future regressions.

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@tigrannajaryan
Member

The results of these new benchmarks are shown in #14546 (comment) - see MarshalProfiles* and UnmarshalProfiles*. With fdab8d7 I have added them to the codebase, so they can catch future regressions.

Ahh, sorry I misunderstood the benchmarks.

So, from what I see there is virtually no penalty in the unmarshalling part and about 14% penalty in marshalling of small payloads. This seems acceptable to me.

My only concern would be to double check the benchmarks are correct. For example looking at MarshalProfiles, it shows no difference in allocs/op (actually only 1 alloc/op), but shouldn't this increase because we are building a map during marshalling? Is it possible benchmarks are not hitting the interesting paths? It would be great to benchmark with production data instead of synthetic data.
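
One way to check whether a given code path allocates is to measure it in isolation with `testing.AllocsPerRun`. The sketch below is an assumption-laden illustration, not the PR's marshalling code: `buildIndex` is a hypothetical stand-in for the lookup map a marshaller might build, and the result is stored in a package-level variable so the map has to live on the heap.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the result reachable so the compiler can neither drop the work
// nor keep the map on the stack.
var sink map[string]int32

// buildIndex is a hypothetical stand-in for the lookup map a marshaller
// might build to turn attribute keys into string table references.
func buildIndex(keys []string) map[string]int32 {
	if len(keys) == 0 {
		return nil // nothing to convert, nothing to allocate
	}
	idx := make(map[string]int32, len(keys))
	for i, k := range keys {
		idx[k] = int32(i + 1)
	}
	return idx
}

// allocsFor reports the average number of heap allocations per call.
func allocsFor(keys []string) float64 {
	return testing.AllocsPerRun(100, func() {
		sink = buildIndex(keys)
	})
}

func main() {
	empty := allocsFor(nil)
	populated := allocsFor([]string{"service.name", "service.version", "process.pid"})
	fmt.Printf("empty: %.0f allocs/op, populated: %.0f allocs/op\n", empty, populated)
}
```

If a measurement like this shows allocations for populated input while the end-to-end benchmark does not, that would suggest the benchmark input is not reaching the map-building path. Whether the real marshalling path allocates depends on details not shown in this thread.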

Comment thread pdata/pprofile/dictionary_helpers.go Outdated
@tigrannajaryan
Member

@florianl in my opinion 14% marshalling speed penalty is an acceptable tradeoff to avoid the complexity of adding reference support to pdata API.

My only remaining concern is that I am not sure I understand why benchmarks don't show increase in memory allocations, which makes them suspect. If you can double check the benchmarks and explain what's going on then I am fully on board with this approach.

florianl and others added 3 commits February 21, 2026 12:01
Per the proto spec [1], key and key_ref are mutually exclusive on the wire.
Without clearing Key after setting KeyRef, both fields would be
serialized, violating the MUST NOT constraint and wasting bytes.

[1] open-telemetry/opentelemetry-proto#733
@florianl florianl changed the title from "pdata: add reference based attributes support" to "[pdata/pprofile] Add reference based attributes support" on Feb 27, 2026
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
@florianl florianl marked this pull request as ready for review March 10, 2026 10:51
@dmathieu
Member

We should have a collector-contrib PR ready to be merged fixing the breaking changes.

@florianl
Member Author

@dmathieu the respective collector contrib PR is open-telemetry/opentelemetry-collector-contrib#46331

@mx-psi mx-psi added this pull request to the merge queue Mar 11, 2026
Merged via the queue into open-telemetry:main with commit c85224a Mar 11, 2026
85 of 101 checks passed
florianl added a commit to florianl/opentelemetry-collector-contrib that referenced this pull request Mar 11, 2026
This is a complementary change for open-telemetry/opentelemetry-collector#14546.
With the change in the Profiling signal, the data in input.yaml no longer meets the expected format.

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
mx-psi pushed a commit to open-telemetry/opentelemetry-collector-contrib that referenced this pull request Mar 11, 2026
This is a complementary change for
open-telemetry/opentelemetry-collector#14546.
With the change in the Profiling signal, the data in input.yaml no
longer meets the expected format.

Without
open-telemetry/opentelemetry-collector#14546
being merged, this PR is expected to fail tests in
connector/countconnector.

---------

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
avleentwilio pushed a commit to avleentwilio/opentelemetry-collector-contrib that referenced this pull request Apr 1, 2026
#### Description

open-telemetry/opentelemetry-proto#733 and
open-telemetry/opentelemetry-collector#14546 are
about to change how pprofiles work. As the input is changed in memory
during the marshaling operation, it can no longer be used directly to
validate the output.

FYI: @felixge 

---------

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>