Add device.crash event semantic convention #1576

Closed
bidetofevil wants to merge 4 commits into open-telemetry:main from bidetofevil:crash-semconv

Conversation

@bidetofevil
Contributor

Adding the definition for an Event that models a crash in the mobile app. There are still some open questions to be worked out regarding naming and the specific event attribute definitions, but I want to put the draft changes in a PR to get further feedback from a larger group, having done so already via a doc with some other folks in the mobile community.

The PR is not ready for merging right now. I will make it a non-draft when it is.

Merge requirement checklist

@bidetofevil bidetofevil requested review from a team as code owners November 13, 2024 23:07
@bidetofevil bidetofevil marked this pull request as draft November 13, 2024 23:07
Contributor

@LikeTheSalad LikeTheSalad left a comment

Thank you 🙌 I think it's a great start.

Comment thread model/device/events.yaml
The event body fields MUST be used to describe the state of the application at the time of the crash, not when the event was actually
emitted, which could happen at a much later time (e.g. when the app next starts up).
body:
id: device_crash_state
Contributor

nit: is it needed to wrap all the body fields within this map? Since it seems like this map is the only field at the root level of the body, I was wondering if we could instead define all of its children as independent fields.

Comment thread model/device/events.yaml
Supplements the `source` field that identifies the specific variation of it [2].
note: >
This version is specifically for the `source` field. It can be a well-defined version of some external format (e.g. Android 15
Application Exit Info), or some custom version number associated with the usage in this event (e.g. some custom JSON schema).
Contributor

Based on this note it seems like the source_version data might come from different places depending on the case. Maybe there's something I'm missing, though since we're defining source as an enum, I think it would be better (if possible) to define the source_version format for each type of source to avoid potential ambiguity issues.

Comment thread model/device/events.yaml
brief: >
The format of the `data` field as defined by [RFC 2046](https://datatracker.ietf.org/doc/html/rfc2046).
note: >
This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field.
Contributor

I'd like to check this with an example just to make sure I got it properly.

Considering an event with source = jvm_exception. Let's say that this data_content_type field has application/json as its value. The collector in this case will be able to parse it as json, however, how can it know what fields to look for in that json? My question is mostly about ensuring that the UI can later display this data properly.

@jzwc jzwc Nov 19, 2024

The current formats for crash report bodies seem insufficiently defined, particularly for crashes originating from various sources or those captured by different tools. Leaving the data opaque while concentrating on defining the common metadata in the other attributes strikes the right balance IMHO.

In many instances, the crash report cannot be effectively presented without backend post-processing and supplementary data from the application provider, which falls outside the scope of OTel.
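To make the exchange above concrete, here is a minimal sketch of how a backend might combine `source`, `source_version`, and `data_content_type` to decide whether it can interpret `data` at all. Everything here is an assumption for illustration: the parser registry, the `process_crash` helper, and the JSON field names (`type`, `frames`) are hypothetical and not part of the PR or of any Collector API. Unknown combinations are simply passed through as opaque blobs.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical parser type: takes the raw `data` string, returns structured fields.
Parser = Callable[[str], Dict[str, Any]]


def parse_jvm_exception_v1(data: str) -> Dict[str, Any]:
    """Parse an invented JSON layout for a JVM crash (field names are assumptions)."""
    doc = json.loads(data)
    return {"exception_type": doc["type"], "frames": doc.get("frames", [])}


# The "a priori knowledge" lives here: one parser per known
# (source, source_version, data_content_type) combination.
PARSERS: Dict[Tuple[str, str, str], Parser] = {
    ("jvm_exception", "1.0.0", "application/json"): parse_jvm_exception_v1,
}


def process_crash(body: Dict[str, str]) -> Dict[str, Any]:
    """Return structured fields for known formats; otherwise keep the blob opaque."""
    key = (body["source"], body["source_version"], body["data_content_type"])
    parser = PARSERS.get(key)
    if parser is None:
        return {"opaque": True, "raw": body["data"]}
    return {"opaque": False, **parser(body["data"])}
```

In this sketch the MIME type only says how to decode the bytes; which fields to look for is still determined by `source` and `source_version`, which matches the point that display-ready parsing needs knowledge from outside the event itself.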

@jzwc jzwc left a comment

Thank you for your initiative.

I apologize for not attending the SIG calls where this may have been discussed, but I am curious if there is a proposal to include an additional attribute that would help identify the crash report format beyond just the MIME type. Specifically, on iOS, there are several well-known libraries that generate crash reports in their own, somewhat distinct formats. For instance, knowing that a crash report was generated using PLCrashReporter could enhance interoperability. While this information could be incorporated into the MIME type, it would require parsing it out, which introduces another potential source of incompatibility.

Never mind, this is indeed covered by the source attribute.

Comment thread model/device/events.yaml
brief: >
Crash in the native layer caught by a signal handler
- id: aei
value: 'aei'

I’m curious if the full name (application_exit_info) might be a better approach?

Comment thread model/device/events.yaml
type: string
- id: crashed_service_version
stability: experimental
requirement_level: recommended

The note reads "This is required"...

Comment thread model/device/events.yaml
type: string
- id: crashed_os_version
stability: experimental
requirement_level: recommended

The note reads "This is required"...

Comment thread model/device/events.yaml
This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field.
examples: [ "application/json" ]
type: string
- id: crashed_service_version

I understand the intention to emphasize, but the crashed_ prefix appears unnecessary in this context, as the attribute is clearly defined within device.crash, isn't it? Ditto crashed_os_version.

@lmolkova
Member

/cc @open-telemetry/semconv-mobile-approvers

Comment thread model/device/events.yaml
stability: experimental
requirement_level: required
brief: >
An ID that uniquely identifies the crash instance obtained from a specific `source`.

How would this be generated? From say, an Android device crash?

@tonzhan2

This looks like a great start. I'm curious what some examples of crashes would look like in this format, such as an Android JVM crash. Obviously most of it will be in the body, so I guess that format is still TBD. It could be useful nonetheless to have some examples of crashes that use this formatting.
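As a rough illustration of what such an example might look like, the sketch below assembles a body for an uncaught JVM exception using the field names visible in the diff excerpts (`device_crash_state`, `source`, `source_version`, `data`, `data_content_type`, `crashed_service_version`, `crashed_os_version`). The `id` key name, the `build_jvm_crash_body` helper, and the JSON layout inside `data` are assumptions made for this sketch; the PR deliberately leaves the `data` payload opaque.

```python
import json
from typing import Any, Dict, List


def build_jvm_crash_body(
    crash_id: str,
    exception_type: str,
    message: str,
    frames: List[str],
    service_version: str,
    os_version: str,
) -> Dict[str, Any]:
    """Build a hypothetical device.crash body for an uncaught JVM exception."""
    # Invented payload layout; a real source would define its own format.
    payload = {"type": exception_type, "message": message, "frames": frames}
    return {
        "device_crash_state": {
            "id": crash_id,  # key name assumed; the excerpt only shows its brief
            "source": "jvm_exception",
            "source_version": "1.0.0",
            "data_content_type": "application/json",
            "data": json.dumps(payload),
            # State at the time of the crash, not at the time of emission.
            "crashed_service_version": service_version,
            "crashed_os_version": os_version,
        }
    }


example = build_jvm_crash_body(
    crash_id="example-crash-id",
    exception_type="java.lang.IllegalStateException",
    message="example message",
    frames=["com.example.Main.run(Main.kt:42)"],
    service_version="7.5.0",
    os_version="15",
)
```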

Comment thread model/device/events.yaml
Application Exit Info), or some custom version number associated with the usage in this event (e.g. some custom JSON schema).
examples: [ "1.0.0" ]
type: string
- id: data

Wouldn't this be better placed in the body (earlier payload) of the Event (https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-body)? This would also imply the type any rather than string.

@jzwc

jzwc commented Dec 7, 2024

Just another quick thought. It would be generally beneficial if app state / resource attributes like device.app.lifecycle were allowed as crash event attributes, to pass the respective application and device state at the moment of the crash.

@lmolkova lmolkova moved this from Untriaged to Draft in Semantic Conventions Triage May 4, 2025
Comment thread model/device/events.yaml
This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field.
examples: [ "application/json" ]
type: string
- id: crashed_service_version
Contributor

Use attribute field instead and use the corresponding pre-existing semconv attribute.

Comment thread model/device/events.yaml
This is required so crashes can be aggregated by the version in which it occurred, not the one that emitted the event.
examples: [ "7.5.0" ]
type: string
- id: crashed_os_version
Contributor

Use attribute field instead and use the corresponding pre-existing semconv attribute.
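The note in the diff context above argues for recording the version that crashed rather than the version that later reported it. A small sketch of why the two can differ when grouping crashes is shown below; the event dicts are made up, and `emitting_service_version` is a hypothetical name used only to stand in for whatever version the reporting app instance carries.

```python
from collections import Counter
from typing import Dict, Iterable

# Made-up events: two crashes in 7.5.0 were only reported after an upgrade to 7.6.0.
events = [
    {"crashed_service_version": "7.5.0", "emitting_service_version": "7.5.0"},
    {"crashed_service_version": "7.5.0", "emitting_service_version": "7.6.0"},
    {"crashed_service_version": "7.5.0", "emitting_service_version": "7.6.0"},
]


def count_by(rows: Iterable[Dict[str, str]], key: str) -> Counter:
    """Count events grouped by the value stored under `key`."""
    return Counter(row[key] for row in rows)


by_crashed = count_by(events, "crashed_service_version")
by_emitter = count_by(events, "emitting_service_version")
```

Grouping by the emitting version would attribute two of the three crashes to 7.6.0, even though all three occurred in 7.5.0.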

Comment thread model/device/events.yaml
recorded as it is happening (e.g. through an UncaughtExceptionHandler), or after the fact, when a tombstone is detected
containing information about a previously terminated app instance that was caused by an unhandled error or exception.
note: >
The body fields of this event contain data and metadata about the crash tht can be used to classify and aggregate it with similar

Should be "about the crash that can be used".

@breedx-splk
Contributor

@bidetofevil Do you want to circle back on this? It's kinda stale and needs a rebase at the least. I'd love to see this work out. Thanks!

@bidetofevil
Contributor Author

@breedx-splk Yeah, I'm looking to get back on this soon. I put it aside when I got busy and, well, it's finally time to look at this again. Starting with getting it building and reviewing the comments.

@github-actions

This PR has been labeled as stale due to lack of activity. It will be automatically closed if there is no further activity over the next 14 days.

@github-actions github-actions Bot added the Stale label Nov 29, 2025
@github-actions github-actions Bot closed this Dec 7, 2025
@LikeTheSalad
Contributor

Just for the sake of using semantic conventions in OTel Android instrumentations, I've created this PR to add the device.crash event with minimal requirements (open to future enhancements).

8 participants