@mergify mergify bot commented Nov 19, 2024

Proposed commit message

Add configurable failure threshold before reporting streams as degraded

With this change, it is possible to configure how many consecutive errors may occur while fetching metrics for a given stream before the stream is marked as DEGRADED.
To configure the threshold, add "failure_threshold": <n> to a module configuration block.
The value of <n> determines the behavior:

  • n == 0: status reporting for the stream is disabled; the stream never becomes DEGRADED, no matter how many errors are encountered while fetching metrics
  • n == 1 or failure_threshold not specified: backward-compatible behavior; the stream becomes DEGRADED at the first error encountered
  • n > 1: the stream becomes DEGRADED once at least n consecutive errors have been encountered

When a fetch operation completes without errors, the consecutive-error counter is reset and the stream is set to HEALTHY.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • [ ] I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

No disruptive user impact, since omitting the new configuration key preserves the previous behavior.

This is an automatic backport of pull request #41570 done by [Mergify](https://mergify.com).

…ms as degraded (#41570)

* Metricbeat: add configurable failure threshold before reporting streams as degraded

(cherry picked from commit f84c05b)
@mergify mergify bot requested a review from a team as a code owner November 19, 2024 13:34
@mergify mergify bot added the backport label Nov 19, 2024
@mergify mergify bot requested review from VihasMakwana and belimawr and removed request for a team November 19, 2024 13:34
@mergify mergify bot assigned pchila Nov 19, 2024
@botelastic botelastic bot added the needs_team label Nov 19, 2024
@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Data-Plane label Nov 19, 2024
@botelastic botelastic bot removed the needs_team label Nov 19, 2024
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@pierrehilbert pierrehilbert added the Team:Elastic-Agent-Control-Plane label and removed the Team:Elastic-Agent-Data-Plane label Nov 19, 2024
@elasticmachine

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@pierrehilbert pierrehilbert requested review from pchila and removed request for VihasMakwana and belimawr November 19, 2024 17:06
@pchila pchila merged commit db727d0 into 8.x Nov 20, 2024
6 checks passed
@pchila pchila deleted the mergify/bp/8.x/pr-41570 branch November 20, 2024 17:17
@khushijain21 khushijain21 mentioned this pull request Jun 23, 2025
Labels

backport, Team:Elastic-Agent-Control-Plane
