Please add the following metrics for alerting purposes:
1. The current raw value (unfiltered). This is needed to detect the situation where prev-value has mistakenly been set far ahead, the value stops updating, and it is unclear whether the counter is actually moving. Exporting the raw value makes it possible to see that the counter is moving, so a raw value that keeps changing while the value stays unchanged for some time is a good target for alerting.
2. A counter of failed (or, conversely, successful) rounds. Together with ai_on_the_edge_device_rounds_total, such a counter makes it possible to single out failed rounds and to build an alert that fires when a failure threshold is exceeded.
3. The status of the last round, as it appears in the log ([9d07h48m01s] 2024-10-06T11:23:19 <INF> [POSTPROC] main: Raw: 10473.9835, Value: 10473.9835, **Status: no error**). I suggest exporting it as a label on a metric with a constant value of "1".
4. The time the entire last round took (as a gauge-type metric), or possibly the cumulative total time spent by all rounds (as a counter-type metric). Compared against ai_on_the_edge_device_uptime_seconds, this metric makes it possible to assess the load on the system and to notify when a threshold is exceeded.
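To make the request more concrete, here is a minimal Python sketch of how the four proposed metrics might look in the Prometheus text exposition format, plus the "raw value moves while value is frozen" check described above. It is only an illustration, not the firmware's implementation: the `RoundSample` class, both function names, and the metric names `_raw_value`, `_failed_rounds_total`, `_round_status` and `_round_duration_seconds` are all assumptions made up for this example. Only the `ai_on_the_edge_device` prefix is taken from the existing metrics.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RoundSample:
    """One processed round. All field names are hypothetical."""
    raw_value: float         # reading before any filtering
    value: float             # accepted (filtered) reading
    status: str              # e.g. "no error", as printed in the log line
    duration_seconds: float  # how long the whole round took


def escape_label(text: str) -> str:
    # The Prometheus text format requires escaping backslash,
    # double quote and newline inside label values.
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metrics(last: RoundSample, failed_rounds: int,
                   prefix: str = "ai_on_the_edge_device") -> str:
    """Render the proposed metrics; the names after the prefix are assumptions."""
    lines = [
        f"# TYPE {prefix}_raw_value gauge",
        f"{prefix}_raw_value {last.raw_value}",
        f"# TYPE {prefix}_failed_rounds_total counter",
        f"{prefix}_failed_rounds_total {failed_rounds}",
        # Status is carried in a label; the sample value is a constant 1.
        f"# TYPE {prefix}_round_status gauge",
        f'{prefix}_round_status{{status="{escape_label(last.status)}"}} 1',
        f"# TYPE {prefix}_round_duration_seconds gauge",
        f"{prefix}_round_duration_seconds {last.duration_seconds}",
    ]
    return "\n".join(lines) + "\n"


def value_is_stalled(samples: list[RoundSample]) -> bool:
    """True when the raw reading changed within the window but the value did not."""
    raw_moving = len({s.raw_value for s in samples}) > 1
    value_frozen = len({s.value for s in samples}) == 1
    return raw_moving and value_frozen
```

In a real setup the stall check would more likely live in a Prometheus alerting rule than in application code; the function above only spells out the condition that such a rule would need to express.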
The Feature
Thanks for the great Prometheus exporter feature!