Skip to content

profiles: use single Profile.sample_type and clarify use of timestamps#649

Merged
tigrannajaryan merged 12 commits intoopen-telemetry:mainfrom
florianl:issue-633
Jul 2, 2025
Merged

profiles: use single Profile.sample_type and clarify use of timestamps#649
tigrannajaryan merged 12 commits intoopen-telemetry:mainfrom
florianl:issue-633

Conversation

@florianl
Copy link
Copy Markdown
Member

@florianl florianl commented May 2, 2025

Implements #633 (comment).

FYI: @open-telemetry/profiling-approvers

Implements open-telemetry#633 (comment).

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Co-authored-by: Felix Geisendörfer <felix.geisendoerfer@datadoghq.com>
@florianl florianl requested review from a team May 2, 2025 12:47
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated
@aalexand
Copy link
Copy Markdown
Member

aalexand commented May 2, 2025

Should default sample type go to the scope message?

Co-authored-by: Christos Kalkanis <christos.kalkanis@elastic.co>
@christos68k
Copy link
Copy Markdown
Member

christos68k commented May 5, 2025

Should default sample type go to the scope message?

That would be less flexible as it'd force us to have separate instrumentation scopes (ScopeProfiles) for different sample types. With the current proposal that's a possibility but not a requirement. The OTel documentation for instrumentation scope is sufficiently vague to allow both possibilities. For example, with the OTel eBPF profiling agent, we could have a single instrumentation scope (the eBPF profiler receiver/library) produce profiles with different sample types (cpu and memory).

Do you have a use-case in mind where associating different sample types with separate instrumentation scopes would be a benefit?

(I don't have a strong opinion on this, just that if we don't have a good reason to force a separate instrumentation scope per sample type, I'd go for more flexibility).

@aalexand
Copy link
Copy Markdown
Member

aalexand commented May 6, 2025

Should default sample type go to the scope message?

That would be less flexible as it'd force us to have separate instrumentation scopes (ScopeProfiles) for different sample types. With the current proposal that's a possibility but not a requirement. The OTel documentation for instrumentation scope is sufficiently vague to allow both possibilities. For example, with the OTel eBPF profiling agent, we could have a single instrumentation scope (the eBPF profiler receiver/library) produce profiles with different sample types (cpu and memory).

Do you have a use-case in mind where associating different sample types with separate instrumentation scopes would be a benefit?

(I don't have a strong opinion on this, just that if we don't have a good reason to force a separate instrumentation scope per sample type, I'd go for more flexibility).

I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

@florianl
Copy link
Copy Markdown
Member Author

florianl commented May 6, 2025

Should default sample type go to the scope message?
I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

The main purpose of the proposal is to align values with timestamps as laid out in #633 (comment).

In general google/pprof it is possible to have multiple sample_types per Profile with special handling of sample_type.unit=count but does not have detailed timestamps but just a durations. With a slightly more focus on timestamps in OTel Profiling, I think, the concept of having multiple sample_types, like in google/pprof, conflicts. IIRC @felixge and I discussed the situation, where one wants to report profiles with different sample types and we concluded that it should be fine, to have multiple message Profile with their dedicated sample_type. I think, it also reduces complexity by not having a concept of a default sample_type and simplifies also the handling of message Profiles in the backend, if they just want to focus on a certain sample_type.

@christos68k
Copy link
Copy Markdown
Member

Should default sample type go to the scope message?
I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

The main purpose of the proposal is to align values with timestamps as laid out in #633 (comment).

In general google/pprof it is possible to have multiple sample_types per Profile with special handling of sample_type.unit=count but does not have detailed timestamps but just a durations. With a slightly more focus on timestamps in OTel Profiling, I think, the concept of having multiple sample_types, like in google/pprof, conflicts. IIRC @felixge and I discussed the situation, where one wants to report profiles with different sample types and we concluded that it should be fine, to have multiple message Profile with their dedicated sample_type. I think, it also reduces complexity by not having a concept of a default sample_type and simplifies also the handling of message Profiles in the backend, if they just want to focus on a certain sample_type.

@aalexand (I misread your previous comment and thought you wanted to lift sample_type to the scope message). Regarding default_sample_type, are you referring to introducing it in the scope message in order to make sample_type in each profile optional or do you have something else in mind? Lile @florianl mentioned the current proposal splits different sample_types across different profiles.

@aalexand
Copy link
Copy Markdown
Member

aalexand commented May 6, 2025

Should default sample type go to the scope message?
I mostly want to make sure there is some way to capture the default sample type. The current PR removes the field IIUC.

@aalexand (I misread your previous comment and thought you wanted to lift sample_type to the scope message). Regarding default_sample_type, are you referring to introducing it in the scope message in order to make sample_type in each profile optional or do you have something else in mind? Lile @florianl mentioned the current proposal splits different sample_types across different profiles.

As one example, Go heap profiles have four sample types today: alloc_objects, alloc_space, inuse_objects, inuse_space. There are two profile types with this sample type set:

  • The heap profile sets the default sample type to empty string (that is, unsets it) which means that tools visualizing the profile should use the last sample type as the default one (the "use the last one if the default is not set" is a non-intuitive convention but this is what the pprof format has historically).
  • The alloc profile on the other hand sets the default sample type to alloc_space, so that tools like pprof focus the user on that metric by default.

The question is how such a profile with four sample types and one of them being the preferred default as chosen by the profile producer should be encoded in OpenTelemetry profiles. Before this change it would be the array of sample types plus the default sample type. What would it be after this change and how would the default be captured?

@christos68k
Copy link
Copy Markdown
Member

The question is how such a profile with four sample types and one of them being the preferred default as chosen by the profile producer should be encoded in OpenTelemetry profiles. Before this change it would be the array of sample types plus the default sample type. What would it be after this change and how would the default be captured?

The current proposal doesn't have an explicit default sample type (and we probably should steer away from introducing implicit conventions) so assuming four different sample types in a single OTel messsage, samples will be grouped according to sample type to separate profiles. One possible encoding would have four different profiles, each with a different sample type (of course one could further split the data according to other criteria).

Some options to accommodate the notion of a default sample type (e.g. for a consumer visualization tool) include:

  1. Use an InstrumentationScope attribute
  2. Introduce default_sample_type in ScopeProfiles

1 is more restricted than 2 as it could be seen as purely advisory metadata that a visualization tool can use. Adding default_sample_type to ScopeProfiles could also act as a fallback sample_type for profiles that do not declare it.

I think using nanoseconds is the better default unit for CPU profiles
because the weight of a count can be different depending on the sample
rate.

Using the same unit for off_cpu and cpu is more consistent.

The comment above was a little confusing, since it referred only to a
cpu profile, but was then followed by an off_cpu profile sample as well.
Copy link
Copy Markdown
Member

@felixge felixge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM in general. Sorry for the delay. PTAL at my comments. (I still need to find time to reply to some of the comments above, will do soon)

Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated
florianl and others added 4 commits May 27, 2025 14:11
@florianl
Copy link
Copy Markdown
Member Author

florianl commented Jun 3, 2025

I have added default_sample_type as suggested by @christos68k and discussed in the OTel Profiling SIG (2025-06-29):

  1. Introduce default_sample_type in ScopeProfiles

I didn't mark Profile.sample_type as optional, as this would conflict with the effort to avoid optional in the proto (see #659).

Comment thread opentelemetry/proto/profiles/v1development/profiles.proto Outdated
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
…e_type

Signed-off-by: Florian Lehner <florian.lehner@elastic.co>
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
Copy link
Copy Markdown
Member

@felixge felixge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, including the proposed default_sample_type field. Sorry for the delay 😞.

@florianl florianl changed the title profiles: use single Profile.sample_type profiles: use single Profile.sample_type and clarify use of timestamps Jun 26, 2025
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
Comment thread opentelemetry/proto/profiles/v1development/profiles.proto
florianl and others added 3 commits June 30, 2025 16:34
@tigrannajaryan tigrannajaryan merged commit c306100 into open-telemetry:main Jul 2, 2025
15 checks passed
@pellared pellared mentioned this pull request Aug 29, 2025
tigrannajaryan pushed a commit that referenced this pull request Sep 2, 2025
### Changed

- profiles: drop gzip requirement. [#661](#661)
- profiles: avoid `optional` keyword usage. [#659](#659)
- profiles: make `profile_id` optional. [#665](#665)
- profiles: use single `Profile.sample_type` and clarify use of timestamps. [#649](#649)
- all: add notes about the attribute values restrictions. [#683](https://github.com/open-telemetry/opentelemetry-proto/pull/683)<br>⚠️ **IMPORTANT**: These restrictions can be dropped in a future minor release.
- profiles: clarify usage of the zero value as the first element of tables in `ProfilesDictionary`. [#688](#688), [#698](#698)
- profiles: unsigned `time_nanos` and `duration_nanos` in `Profile`. [#692](#692)
- profiles: improve attribute encoding in `ProfilesDictionary`. [#672](#672)
- profiles: simplify profile stack trace representation. [#708](#708)

### Fixed

- examples: fix OTLP JSON Event example body. [#666](#666)
- docs: minor specification fixes around `UNAVAILABLE` and `RetryInfo`. [#669](#669)

### Removed

- profiles: remove `default_sample_type`. [#679](#679)
- profiles: remove `has_*` debug info fields, they are moving to attributes. [#595](#595)
- profiles: remove `Location.is_folded`. [#690](#690)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants