Expose the cache health as a prometheus metric #54776

Merged
rosstimothy merged 1 commit into master from tross/cache_health_metric on May 16, 2025

Conversation

rosstimothy (Contributor) commented May 13, 2025

Adds two new gauges to track cache health.

  • `teleport_cache_health`: labeled by component, it reports whether that component's cache is healthy. A value of 1 means healthy; a value of 0 means unhealthy.

  • `teleport_cache_last_reset_seconds`: labeled by component, it reports the Unix time, in seconds, at which the cache was last reset.

Changelog: Expose the Teleport service cache health via Prometheus metrics.

rosstimothy force-pushed the tross/cache_health_metric branch 2 times, most recently from 08abcf4 to 73bd3c5, May 14, 2025 13:07
rosstimothy marked this pull request as ready for review May 14, 2025 13:30
github-actions Bot requested a review from rudream May 14, 2025 13:30
Comment thread lib/cache/cache.go (outdated)
rosstimothy force-pushed the tross/cache_health_metric branch 3 times, most recently from 5c01ebc to 2955ec6, May 15, 2025 01:14
rosstimothy (Contributor, Author) commented

Friendly ping @rudream @fspmarshall

Adds two new gauges to track cache health.

- `teleport_cache_health`: labeled by component, it reflects if the
cache is healthy and populated. A value of 1 means healthy,
a value of 0 means unhealthy.

- `teleport_cache_last_reset_seconds`: labeled by component, it
reflects the last Unix time in seconds that the cache was reset.
rosstimothy force-pushed the tross/cache_health_metric branch from 2955ec6 to 651cab3, May 16, 2025 20:33
rosstimothy enabled auto-merge May 16, 2025 20:33
rosstimothy added this pull request to the merge queue May 16, 2025
Merged via the queue into master with commit 075ef0e May 16, 2025
40 checks passed
rosstimothy deleted the tross/cache_health_metric branch May 16, 2025 21:14
backport-bot-workflows (Contributor) commented

@rosstimothy See the table below for backport results.

| Branch | Result |
| --- | --- |
| branch/v15 | Failed |
| branch/v16 | Failed |
| branch/v17 | Failed |

rosstimothy added a commit that referenced this pull request May 16, 2025
rosstimothy added a commit that referenced this pull request May 16, 2025
rosstimothy added a commit that referenced this pull request May 16, 2025
rosstimothy added a commit that referenced this pull request May 16, 2025
rosstimothy added a commit that referenced this pull request May 16, 2025
rosstimothy added a commit that referenced this pull request May 16, 2025
github-merge-queue Bot pushed a commit that referenced this pull request May 19, 2025
github-merge-queue Bot pushed a commit that referenced this pull request May 19, 2025
github-merge-queue Bot pushed a commit that referenced this pull request May 19, 2025