Add generic reconciler metrics #60581

Merged
hugoShaka merged 1 commit into master from hugo/add-reconciler-metrics
Oct 24, 2025

@hugoShaka
Contributor

This PR adds optional metrics to the generic reconciler.
The metrics won't be registered if no prometheus.Registerer is passed. This avoids throwing random metrics into the global registry and provides backward compatibility.
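
A minimal sketch of the optional-registration idea, using only the standard library. `Counter`, `Registerer`, `MapRegistry`, `Reconciler`, and `NewReconciler` are hypothetical stand-ins made up for illustration; they are not the Prometheus or Teleport types, and the real `prometheus.Registerer` interface has a different shape.

```go
package main

import "fmt"

// Counter is a toy counter standing in for a Prometheus counter.
type Counter struct {
	Name  string
	value int
}

// Inc adds one to the counter.
func (c *Counter) Inc() { c.value++ }

// Value returns the current count.
func (c *Counter) Value() int { return c.value }

// Registerer is a simplified, hypothetical stand-in for prometheus.Registerer.
type Registerer interface {
	Register(c *Counter) error
}

// MapRegistry is a toy registry that rejects duplicate metric names.
type MapRegistry struct {
	counters map[string]*Counter
}

// NewMapRegistry returns an empty registry.
func NewMapRegistry() *MapRegistry {
	return &MapRegistry{counters: make(map[string]*Counter)}
}

// Register stores the counter, or fails if the name is already taken.
func (r *MapRegistry) Register(c *Counter) error {
	if _, ok := r.counters[c.Name]; ok {
		return fmt.Errorf("metric %q is already registered", c.Name)
	}
	r.counters[c.Name] = c
	return nil
}

// Reconciler is a hypothetical reconciler that counts its runs.
type Reconciler struct {
	runs *Counter
}

// NewReconciler always creates the counter, but registers it only when the
// caller passes a non-nil Registerer. Nothing touches a global registry.
func NewReconciler(reg Registerer) (*Reconciler, error) {
	r := &Reconciler{runs: &Counter{Name: "reconciliations_total"}}
	if reg == nil {
		return r, nil // metrics stay private and unregistered
	}
	if err := reg.Register(r.runs); err != nil {
		return nil, err
	}
	return r, nil
}

// Reconcile records one reconciliation run.
func (r *Reconciler) Reconcile() { r.runs.Inc() }
```

Under these assumptions, callers that pass `nil` keep working unchanged, while callers that pass a registry can read the counter back from it.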

@hugoShaka added the no-changelog (indicates that a PR does not require a changelog entry) and backport/branch/v18 labels on Oct 24, 2025
return nil
}

type metrics struct {
Contributor Author

This should be renamed to reconcilerMetrics to avoid conflicts in the bloated services package.

metricLabelKind = "kind"
)

func newMetrics(subsystem string) *metrics {
Contributor Author

Same here.

@hugoShaka hugoShaka force-pushed the hugo/add-reconciler-metrics branch from 1b03154 to 31fa7bd Compare October 24, 2025 18:15
@hugoShaka hugoShaka enabled auto-merge October 24, 2025 18:24
@hugoShaka hugoShaka added this pull request to the merge queue Oct 24, 2025
Merged via the queue into master with commit 5a13879 Oct 24, 2025
41 checks passed
@hugoShaka hugoShaka deleted the hugo/add-reconciler-metrics branch October 24, 2025 18:58
@backport-bot-workflows
Contributor

@hugoShaka See the table below for backport results.

Branch Result
branch/v18 Failed

hugoShaka added a commit that referenced this pull request Oct 30, 2025
When implementing reconciler metrics in #60581
I did not realize that some GenericReconciler usages, including the one I
wanted to observe, were short-lived. The implementation had 2 blatant
issues:
- metrics were lost on each invocation
- creating a new reconciler would attempt to register the metrics a second
  time and cause a conflict

This PR changes the reconciler metrics API so the caller is responsible
for creating and registering the metrics beforehand. This allows the
caller to create the metrics struct once and pass it to successive
`NewGenericReconciler` calls.
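
A rough sketch of the caller-owned metrics pattern described in this commit message, written against the standard library only. Every name here (`Counter`, `Registry`, `ReconcilerMetrics`, `NewReconciler`, `ReconcileOnce`) is a hypothetical illustration; the actual `NewGenericReconciler` signature and metrics struct are not shown in this conversation and may differ.

```go
package main

import "fmt"

// Counter is a toy counter standing in for a Prometheus counter.
type Counter struct {
	Name  string
	value int
}

// Inc adds one to the counter.
func (c *Counter) Inc() { c.value++ }

// Value returns the current count.
func (c *Counter) Value() int { return c.value }

// Registry is a toy registry that rejects duplicate metric names.
type Registry struct {
	names map[string]bool
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry { return &Registry{names: make(map[string]bool)} }

// Register records the counter name, or fails on a duplicate.
func (r *Registry) Register(c *Counter) error {
	if r.names[c.Name] {
		return fmt.Errorf("metric %q is already registered", c.Name)
	}
	r.names[c.Name] = true
	return nil
}

// ReconcilerMetrics (hypothetical) holds metrics that outlive any single reconciler.
type ReconcilerMetrics struct {
	Runs *Counter
}

// NewReconcilerMetrics builds the metrics without registering them.
func NewReconcilerMetrics(subsystem string) *ReconcilerMetrics {
	return &ReconcilerMetrics{Runs: &Counter{Name: subsystem + "_reconciliations_total"}}
}

// Register is called once by the owner of the metrics.
func (m *ReconcilerMetrics) Register(reg *Registry) error { return reg.Register(m.Runs) }

// Reconciler is a short-lived worker that reports into shared metrics.
type Reconciler struct {
	metrics *ReconcilerMetrics
}

// NewReconciler never registers anything; nil metrics fall back to a
// throwaway, unregistered struct so instrumentation stays optional.
func NewReconciler(m *ReconcilerMetrics) *Reconciler {
	if m == nil {
		m = NewReconcilerMetrics("unobserved")
	}
	return &Reconciler{metrics: m}
}

// Reconcile records one reconciliation run.
func (r *Reconciler) Reconcile() { r.metrics.Runs.Inc() }

// ReconcileOnce mimics a short-lived usage: a fresh reconciler per call.
func ReconcileOnce(m *ReconcilerMetrics) { NewReconciler(m).Reconcile() }
```

In this sketch, three calls to `ReconcileOnce` with the same metrics struct accumulate into one counter, and the only registration happens up front. Registering a second struct with the same subsystem fails, which mirrors the conflict the earlier design ran into.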
hugoShaka added a commit that referenced this pull request Nov 3, 2025
github-merge-queue bot pushed a commit that referenced this pull request Nov 4, 2025
mmcallister pushed a commit that referenced this pull request Nov 6, 2025
mmcallister pushed a commit that referenced this pull request Nov 19, 2025
mmcallister pushed a commit that referenced this pull request Nov 19, 2025
mmcallister pushed a commit that referenced this pull request Nov 20, 2025
mmcallister pushed a commit that referenced this pull request Nov 20, 2025
hugoShaka added a commit that referenced this pull request Nov 21, 2025
hugoShaka added a commit that referenced this pull request Nov 21, 2025
github-merge-queue bot pushed a commit that referenced this pull request Nov 24, 2025
* Add entra ID metrics (#60537)

* Add entra ID metrics

This commit adds metrics for entra ID sync. This is the OSS part, it
contains the msgraph client metrics.

As many different parts of Teleport are using the msgraph client and
might not have access to a metric registerer yet, the client gracefully
handles not being given a metric registry. In this case it won't
register its metrics, we don't want to continue polluting the global
metrics registry.

* lint

* add optional reconciler metrics (#60581)

* expose TeleportProcess metrics registry (#60654)

* test setting a non-nil registry in config

* expose teleport process metric registry

* remove metric config

* fixup! remove metric config

* Add support in process for additional metrics gatherers (#60852)

* Add support in process for additional metrics gatherers

Before this change, we were gathering from 2 metrics gatherers:
- the process registry
- the global registry

There are cases where we must add and remove metrics (e.g. plugins).
We could throw them into the global registry but:
- this would pollute the global registry and cause duplicates/conflicts
  in tests
- this would conflate all metrics from the same plugin kind. We support
  several instances of the same hosted plugin and we might want to
  keep distinct metrics.

This change makes the gatherers a list, and adds a function so teleport.e
can add its own gatherer. A teleport.e PR using this mechanism will
follow.

* Protect gatherer slice with a mutex

* Fix the generic reconciler metric API (#60853)

When implementing reconciler metrics in #60581
I did not realize that some GenericReconciler usages, including the one I
wanted to observe, were short-lived. The implementation had 2 blatant
issues:
- metrics were lost on each invocation
- creating a new reconciler would attempt to register the metrics a second
  time and cause a conflict

This PR changes the reconciler metrics API so the caller is responsible
for creating and registering the metrics beforehand. This allows the
caller to create the metrics struct once and pass it to successive
`NewGenericReconciler` calls.

* Introduce metrics.Registry to pass down registries (#61239)

* Introduce metrics.Registry and use it

* Update lib/metrics/registry.go

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>

* BlackHole -> BlackHoleRegistry

* merge lib/metrics and lib/observability/metrics

* lint

* address noah's feedback

---------

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>

* metrics.Registry.Wrap() handle empty subsystems properly (#61392)

* handle empty subsystems properly

* appeasing our italian engineering team

* Fix build after rebase

---------

Co-authored-by: rosstimothy <39066650+rosstimothy@users.noreply.github.com>
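
The gatherer change from #60852 in the commit message above (a list of gatherers, a function to add one, and a mutex protecting the slice) could look roughly like the sketch below. `Gatherer`, `GathererFunc`, `Process`, `AddGatherer`, and `GatherAll` are hypothetical names, and `Gather` returns plain metric names here to stay standard-library only; the real Prometheus gatherer returns metric families.

```go
package main

import "sync"

// Gatherer is a simplified, hypothetical stand-in for a metrics gatherer.
type Gatherer interface {
	Gather() ([]string, error)
}

// GathererFunc adapts a plain function to the Gatherer interface.
type GathererFunc func() ([]string, error)

// Gather calls the wrapped function.
func (f GathererFunc) Gather() ([]string, error) { return f() }

// Process is a hypothetical holder of every gatherer that gets scraped.
type Process struct {
	mu        sync.Mutex
	gatherers []Gatherer
}

// AddGatherer appends a gatherer; safe to call from several goroutines.
func (p *Process) AddGatherer(g Gatherer) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.gatherers = append(p.gatherers, g)
}

// GatherAll collects from every gatherer in insertion order. It copies the
// slice under the lock so slow gatherers never block AddGatherer.
func (p *Process) GatherAll() ([]string, error) {
	p.mu.Lock()
	snapshot := make([]Gatherer, len(p.gatherers))
	copy(snapshot, p.gatherers)
	p.mu.Unlock()

	var out []string
	for _, g := range snapshot {
		names, err := g.Gather()
		if err != nil {
			return nil, err
		}
		out = append(out, names...)
	}
	return out, nil
}
```

Keeping one gatherer per plugin instance, as this sketch allows, is one way to keep metrics from several instances of the same plugin kind distinct instead of merging them in a global registry.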
Labels

backport/branch/v18 no-changelog Indicates that a PR does not require a changelog entry size/md

3 participants