Add metrics support #217
@@ -195,11 +208,15 @@ func (r *ReconcileStack) Reconcile(ctx context.Context, request reconcile.Reques

// Step 2. If there are extra environment variables, read them in now and use them for subsequent commands.
if err = sess.SetEnvs(stack.Envs, request.Namespace); err != nil {
	reqLogger.Error(err, "Could not find ConfigMap for Envs")
I realized we weren't marking the stack failed in sufficient places. This improves things a bit. I am happy to move this to a separate PR if necessary.
Opened #218 to track adding docs.
It would be nice to check the expected metrics in a test, but we can defer that in the interest of time if you've checked it manually.
@@ -17,6 +17,10 @@ import (
"strings"
"time"

"github.com/operator-framework/operator-lib/handler"
nit: import sorting
Fixed.
Proposed changes
Fixes: #123
The current implementation explicitly tracks the following metrics:
- `stacks_active` - `gauge` that tracks the number of currently registered stacks managed by the system
- `stacks_failing` - `gaugevec` that provides information about stacks that are currently failing

The metrics are served on `:8383` on the pod and can be inspected with curl.

In addition to the above, users have access to the following metrics emitted by controller runtime:
- `controller_runtime_active_workers{controller="stack-controller"}` - `gauge` that tracks the number of stacks being processed concurrently
- `controller_runtime_max_concurrent_reconciles{controller="stack-controller"}` - `gauge` that tracks the configured maximum number of concurrent stack reconciles. This defaults to 10 but can be controlled through the `MAX_CONCURRENT_RECONCILES` environment variable added in #213 (Make max reconciles configurable)
- `controller_runtime_reconcile_errors_total{controller="stack-controller"}` - `counter` of errored reconciles
- `controller_runtime_reconcile_time_seconds_*{controller="stack-controller"}` - `histogram` providing latency information for reconciles
- `controller_runtime_reconcile_total{controller="stack-controller",result="error"}` - `counter` for errored reconciles
- `controller_runtime_reconcile_total{controller="stack-controller",result="requeue"}` - `counter` for requeued reconciles
- `controller_runtime_reconcile_total{controller="stack-controller",result="success"}` - `counter` for successful reconciles

Together these should provide sufficient coverage for monitoring basic operation of the controller. Additional metrics can be added as necessary.
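As a rough illustration of how these metrics could be checked from a scrape, the sketch below parses Prometheus text exposition output using only the standard library and looks up individual series. It is not part of this PR: `parseSamples` is a hypothetical helper, and the sample text and its values are made up for the example rather than taken from a real operator pod.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample is ILLUSTRATIVE input only; the values are invented, not real output.
const sample = `# TYPE stacks_active gauge
stacks_active 2
# TYPE controller_runtime_reconcile_total counter
controller_runtime_reconcile_total{controller="stack-controller",result="success"} 5
`

// parseSamples (hypothetical helper) reads Prometheus text exposition format
// and maps each series, including its label set, to its value. Blank lines
// and "#" comment lines are skipped; an optional trailing timestamp is ignored.
func parseSamples(text string) (map[string]float64, error) {
	samples := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The series ends just after the closing brace when labels are
		// present, otherwise at the first whitespace character.
		end := strings.LastIndex(line, "}") + 1
		if end == 0 {
			end = strings.IndexAny(line, " \t")
		}
		if end <= 0 {
			return nil, fmt.Errorf("malformed sample line: %q", line)
		}
		fields := strings.Fields(line[end:])
		if len(fields) == 0 {
			return nil, fmt.Errorf("missing value in line: %q", line)
		}
		value, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return nil, fmt.Errorf("bad value in line %q: %w", line, err)
		}
		samples[line[:end]] = value
	}
	return samples, sc.Err()
}

func main() {
	samples, err := parseSamples(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("stacks_active:", samples["stacks_active"])
	fmt.Println("successful reconciles:",
		samples[`controller_runtime_reconcile_total{controller="stack-controller",result="success"}`])
}
```

In a real test the input would come from an HTTP GET against the metrics endpoint instead of a constant, and an established exposition-format parser would likely be a better choice than this hand-rolled one.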
Related issues (optional)
#123