Add device.crash event semantic convention#1576
bidetofevil wants to merge 4 commits into open-telemetry:main
LikeTheSalad left a comment
Thank you 🙌 I think it's a great start.
| The event body fields MUST be used to describe the state of the application at the time of the crash, not when the event was actually | ||
| emitted, which could happen at a much later time (e.g. when the app next starts up). | ||
| body: | ||
| id: device_crash_state |
nit: is it needed to wrap all the body fields within this map? Since it seems like this map is the only field at the root level of the body, I was wondering if we could instead define all of its children as independent fields.
| Supplements the `source` field that identifies the specific variation of it [2]. | ||
| note: > | ||
| This version is specifically for the `source` field. It can be a well-defined version of some external format (e.g. Android 15 | ||
| Application Exit Info), or some custom version number associated with the usage in this event (e.g. some custom JSON schema). |
Based on this note, it seems the `source_version` data might come from different places depending on the case. Maybe there's something I'm missing, but since we're defining `source` as an enum, I think it would be better (if possible) to define the `source_version` format for each type of source to avoid potential ambiguity.
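To make the suggestion concrete, here is a minimal sketch of what a per-source `source_version` format could look like on the consuming side. The `source` values `aei` and `jvm_exception` come from this PR; the version patterns themselves are assumptions made purely for illustration and are not defined anywhere in the proposal.

```python
import re

# Hypothetical per-source formats. The PR does not define these; they only
# illustrate binding one version format to each enum member of `source`.
SOURCE_VERSION_PATTERNS = {
    # Assumed: a bare Android release or API level, e.g. "15".
    "aei": re.compile(r"\d+"),
    # Assumed: a semantic version of a custom schema, e.g. "1.0.0".
    "jvm_exception": re.compile(r"\d+\.\d+\.\d+"),
}


def is_valid_source_version(source: str, source_version: str) -> bool:
    """Return True only if the version matches the format bound to its source."""
    pattern = SOURCE_VERSION_PATTERNS.get(source)
    if pattern is None:
        # Unknown source: no format is defined, so nothing can be validated.
        return False
    return pattern.fullmatch(source_version) is not None
```

With one format per enum member, a backend could reject or flag events whose `source_version` cannot be interpreted, instead of guessing.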
| brief: > | ||
| The format of the `data` field as defined by [RFC 2046](https://datatracker.ietf.org/doc/html/rfc2046). | ||
| note: > | ||
| This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field. |
I'd like to check this with an example just to make sure I got it properly.
Considering an event with source = jvm_exception. Let's say that this data_content_type field has application/json as its value. The collector in this case will be able to parse it as json, however, how can it know what fields to look for in that json? My question is mostly about ensuring that the UI can later display this data properly.
The current formats for crash report bodies seem insufficiently defined, particularly for crashes originating from various sources or those captured by different tools. Leaving the data opaque while concentrating on defining the common metadata in the other attributes strikes the right balance IMHO.
In many instances, the crash report cannot be effectively presented without backend post-processing and supplementary data from the application provider, which falls outside the scope of OTel.
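As a rough illustration of the "opaque data plus common metadata" approach discussed here, the sketch below shows how a consumer might pick a parser from `source` and `data_content_type` and leave `data` untouched when it has no prior knowledge of the format. The field names are taken from this PR; the registry, the function name, and the use of a plain dict for the event body are assumptions, not part of the proposal or of any Collector API.

```python
import json
from typing import Any, Callable, Dict, Optional, Tuple

Parser = Callable[[str], Any]

# Hypothetical registry: the "a priori knowledge" lives here, keyed by the
# (source, data_content_type) pair carried in the event body.
PARSERS: Dict[Tuple[str, str], Parser] = {
    ("jvm_exception", "application/json"): json.loads,
}


def parse_crash_data(body: Dict[str, Any]) -> Optional[Any]:
    """Parse `data` if a parser is registered; otherwise keep it opaque."""
    key = (body.get("source"), body.get("data_content_type"))
    parser = PARSERS.get(key)
    if parser is None:
        # No known structure for this combination: return nothing and let the
        # backend store or forward the raw blob unchanged.
        return None
    return parser(body["data"])
```

The MIME type alone only says how to decode the blob; which fields to look for still has to come from knowledge tied to `source` (and possibly `source_version`), which is the gap the question above points at.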
Thank you for your initiative.
I apologize for not attending the SIG calls where this may have been discussed, but I am curious if there is a proposal to include an additional attribute that would help identify the crash report format beyond just the MIME type. Specifically, on iOS, there are several well-known libraries that generate crash reports in their own, somewhat distinct formats. For instance, knowing that a crash report was generated using PLCrashReporter could enhance interoperability. While this information could be incorporated into the MIME type, it would require parsing it out, which introduces another potential source of incompatibility.
Never mind, this is indeed covered by the source attribute.
| brief: > | ||
| Crash in the native layer caught by a signal handler | ||
| - id: aei | ||
| value: 'aei' |
I’m curious if the full name (application_exit_info) might be a better approach?
| type: string | ||
| - id: crashed_service_version | ||
| stability: experimental | ||
| requirement_level: recommended |
| type: string | ||
| - id: crashed_os_version | ||
| stability: experimental | ||
| requirement_level: recommended |
| This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field. | ||
| examples: [ "application/json" ] | ||
| type: string | ||
| - id: crashed_service_version |
I understand the intention to emphasize, but the crashed_ prefix appears unnecessary in this context, as the attribute is clearly defined within device.crash, isn't it? Ditto crashed_os_version.
/cc @open-telemetry/semconv-mobile-approvers
| stability: experimental | ||
| requirement_level: required | ||
| brief: > | ||
| An ID that uniquely identifies the crash instance obtained from a specific `source`. |
How would this be generated? From say, an Android device crash?
This looks like a great start. I'm curious what some examples of crashes could look like in this format, such as an Android JVM crash. Obviously most of it will be in the body, so I guess that format is still TBD. It could be useful nonetheless to have some examples of crashes that use this formatting.
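For discussion purposes, a JVM crash body under the current draft might be assembled roughly as follows. The field names `source`, `source_version`, `data`, `data_content_type`, `crashed_service_version`, and `crashed_os_version` appear in this PR; the `crash_id` key, the helper name, and the JSON layout inside `data` are hypothetical, since the draft leaves the payload format open.

```python
import json
from typing import Any, Dict


def build_jvm_crash_body(
    crash_id: str, stacktrace: str, service_version: str, os_version: str
) -> Dict[str, Any]:
    """Assemble an illustrative `device.crash` body for a JVM exception."""
    return {
        # Hypothetical key name for the required unique crash ID.
        "crash_id": crash_id,
        "source": "jvm_exception",
        # Assumed to version a custom JSON schema used for `data`.
        "source_version": "1.0.0",
        "data_content_type": "application/json",
        # The payload layout is an assumption; the draft treats it as opaque.
        "data": json.dumps({"stacktrace": stacktrace}),
        # State at crash time, not at the time the event is emitted.
        "crashed_service_version": service_version,
        "crashed_os_version": os_version,
    }
```

A caller would fill in the versions recorded when the crash happened, even if the event is only emitted on the next app start.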
| Application Exit Info), or some custom version number associated with the usage in this event (e.g. some custom JSON schema). | ||
| examples: [ "1.0.0" ] | ||
| type: string | ||
| - id: data |
Shouldn't this rather be in the body (earlier payload) of the Event (https://opentelemetry.io/docs/specs/otel/logs/data-model/#field-body)? This would also imply the type `any` rather than `string`.
Just another quick thought. It would be generally beneficial if app state / resource attributes like
| This, combined with a priori knowledge of the structure of the blob, will allow Collectors to parse and process the `data` field. | ||
| examples: [ "application/json" ] | ||
| type: string | ||
| - id: crashed_service_version |
Use attribute field instead and use the corresponding pre-existing semconv attribute.
| This is required so crashes can be aggregated by the version in which it occurred, not the one that emitted the event. | ||
| examples: [ "7.5.0" ] | ||
| type: string | ||
| - id: crashed_os_version |
Use attribute field instead and use the corresponding pre-existing semconv attribute.
| recorded as it is happening (e.g. through an UncaughtExceptionHandler), or after the fact, when a tombstone is detected | ||
| containing information about a previously terminated app instance that was caused by an unhandled error or exception. | ||
| note: > | ||
| The body fields of this event contain data and metadata about the crash tht can be used to classify and aggregate it with similar |
Should be "about the crash that can be used".
@bidetofevil Do you want to circle back on this? It's kinda stale and needs a rebase at the least. I'd love to see this work out. Thanks!
@breedx-splk Yeah, I'm looking to get back on this soon. I put it aside when I got busy and, well, it's finally time to look at this again. Starting with getting it building and reviewing the comments.
This PR has been labeled as stale due to lack of activity. It will be automatically closed if there is no further activity over the next 14 days.
Just for the sake of using semantic conventions in OTel Android instrumentations, I've created this PR to add the
Adding the definition for an `Event` that models a crash in the mobile app. There are still some open questions to be worked out regarding naming and the specific event attribute definitions, but I want to put the draft changes in a PR to get further feedback from a larger group, having done so already via a doc with some other folks in the mobile community.
The PR is not ready for merging right now. I will make it a non-draft when it is.
Merge requirement checklist
[chore]