OCPBUGS-79467: Change default log level from DEBUG to INFO#1402
OCPBUGS-79467: Change default log level from DEBUG to INFO#1402openshift-merge-bot[bot] merged 2 commits intoopenshift:masterfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
@gcs278: This pull request references Jira Issue OCPBUGS-79467, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (8)
✅ Files skipped from review due to trivial changes (4)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughLogger construction was changed to use ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
2691c56 to
0b27052
Compare
|
/assign |
It is interesting that /approve Please do make sure take a look at the CI artifacts once they are available, and compare the ingress-operator log output with the same in the CI artifacts for some pre-#1354 PR. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Miciah The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
Thanks @Miciah - Updated the PR description to also include that. I'll took a look at the artifacts did a quick comparison - it looks like the debug logs are gone. And I tested locally with Sail Library to make sure it stops logging at the Debug level. We do get some event recorder logs removed from controller runtime DEBUG logs too I noticed: Let me know if you think that's an issue, but my assumption is that it's DEBUG, and not intended for production visibility. |
|
Infrastructure flake in creating a test pod...I wonder if we need to add a retry if creates are timing out: /test e2e-aws-operator |
Losing the "created canary certificate" log message isn't ideal. We should still get the event, but it is useful to get the log message as well. From a quick look at the code, it appears that we are using the deprecated event broadcaster implementation, and it logs events with In contrast, the new event broadcaster implementation just uses So I think updating to the new event recorder might get us back the log message for the "created canary certificate" event. The following patch should suffice. Detailsdiff --git a/pkg/operator/controller/canary-certificate/controller.go b/pkg/operator/controller/canary-certificate/controller.go
index 0f20eac15..7593d0f88 100644
--- a/pkg/operator/controller/canary-certificate/controller.go
+++ b/pkg/operator/controller/canary-certificate/controller.go
@@ -18,7 +18,7 @@ import (
"k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
- "k8s.io/client-go/tools/record"
+ "k8s.io/client-go/tools/events"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller"
@@ -48,7 +48,7 @@ type Config struct {
type reconciler struct {
config Config
client client.Client
- recorder record.EventRecorder
+ recorder events.EventRecorder
}
// New creates the canary certificate controller
@@ -61,7 +61,7 @@ func New(mgr manager.Manager, config Config) (controller.Controller, error) {
reconciler := &reconciler{
config: config,
client: mgr.GetClient(),
- recorder: mgr.GetEventRecorderFor(canaryCertControllerName),
+ recorder: mgr.GetEventRecorder(canaryCertControllerName),
}
c, err := controller.New(canaryCertControllerName, mgr, controller.Options{Reconciler: reconciler})
if err != nil {
@@ -206,7 +206,7 @@ func (r *reconciler) createCanaryCertificate(ctx context.Context, certificate *c
return err
}
- r.recorder.Event(certificate, "Normal", "CreatedCanaryCertificate", "created canary certificate")
+ r.recorder.Eventf(certificate, nil, corev1.EventTypeNormal, "Created", "CreatedCanaryCertificate", "Created canary certificate")
return nil
}
@@ -220,7 +220,7 @@ func (r *reconciler) updateCanaryCertificate(ctx context.Context, current, desir
if err := r.client.Update(ctx, updated); err != nil {
return false, err
}
- r.recorder.Event(updated, "Normal", "UpdatedCanaryCertificate", "updated canary certificate")
+ r.recorder.Eventf(updated, nil, corev1.EventTypeNormal, "Updated", "UpdatedCanaryCertificate", "updated canary certificate")
return true, nil
}@coderabbitai, is the above analysis correct? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 9763 🏁 Script executed: Length of output: 5182 🏁 Script executed: Length of output: 9148
The new broadcaster in controller-runtime's Provider also logs at Looking at // init deprecated broadcaster
p.deprecatedBroadcaster.StartEventWatcher(
func(e *corev1.Event) {
p.logger.V(1).Info(e.Message, "type", e.Type, "object", e.InvolvedObject, "reason", e.Reason)
})
// init NEW broadcaster — also V(1)!
stopWatcher, err := p.broadcaster.StartEventWatcher(func(event runtime.Object) {
e, isEvt := event.(*eventsv1.Event)
if isEvt {
p.logger.V(1).Info(e.Note, "type", e.Type, "object", e.Related, "action", e.Action, "reason", e.Reason)
}
})The So switching from
There is also a potential compilation issue to verify: To actually restore the log message at INFO level, the simplest approach would be to add an explicit func (r *reconciler) createCanaryCertificate(ctx context.Context, certificate *corev1.Secret) error {
if err := r.client.Create(ctx, certificate); err != nil {
return err
}
r.recorder.Event(certificate, "Normal", "CreatedCanaryCertificate", "created canary certificate")
log.Info("created canary certificate", "secret", certificate.Name)
return nil
}This keeps the Kubernetes event (for |
|
/hold |
I'm fine with @coderabbitai's suggestion to add a |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
🏁 Script executed: Length of output: 9281 🏁 Script executed: Length of output: 446
diff --git a/pkg/operator/controller/canary-certificate/controller.go b/pkg/operator/controller/canary-certificate/controller.go
--- a/pkg/operator/controller/canary-certificate/controller.go
+++ b/pkg/operator/controller/canary-certificate/controller.go
@@ -207,6 +207,7 @@ func (r *reconciler) createCanaryCertificate(ctx context.Context, certificate *c
}
r.recorder.Event(certificate, "Normal", "CreatedCanaryCertificate", "created canary certificate")
+ log.Info("created canary certificate", "secret", certificate.Name)
return nil
}
@@ -220,6 +221,7 @@ func (r *reconciler) updateCanaryCertificate(ctx context.Context, current, desir
}
r.recorder.Event(updated, "Normal", "UpdatedCanaryCertificate", "updated canary certificate")
+ log.Info("updated canary certificate", "secret", updated.Name)
return true, nil
}This uses the existing package-level
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/operator/controller/ingress/controller.go (1)
341-342: Consider adding IngressController identity to the log message.The log lacks context about which IngressController was admitted. Adding the name would improve log traceability, consistent with other log statements in this file (e.g., line 391, 414).
♻️ Suggested improvement
- log.Info("ingresscontroller admitted") + log.Info("ingresscontroller admitted", "namespace", ingress.Namespace, "name", ingress.Name)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/controller/ingress/controller.go` around lines 341 - 342, The log message "ingresscontroller admitted" should include the IngressController identity for traceability: update the log.Info call that currently logs admission (and keep r.recorder.Event as-is) to include the ingress name (e.g., use ingress.Name or a formatted identifier from the ingress object) so the message matches other contextual logs in this file (see existing logs around the admission path such as those at lines referencing the same ingress variable).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/operator/controller/ingress/controller.go`:
- Around line 341-342: The log message "ingresscontroller admitted" should
include the IngressController identity for traceability: update the log.Info
call that currently logs admission (and keep r.recorder.Event as-is) to include
the ingress name (e.g., use ingress.Name or a formatted identifier from the
ingress object) so the message matches other contextual logs in this file (see
existing logs around the admission path such as those at lines referencing the
same ingress variable).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 6521cccd-b77e-4373-9016-81fb31b76596
📒 Files selected for processing (8)
pkg/operator/controller/canary-certificate/controller.gopkg/operator/controller/canary/daemonset.gopkg/operator/controller/certificate-publisher/publish_ca.gopkg/operator/controller/certificate-publisher/publish_certs.gopkg/operator/controller/certificate/ca.gopkg/operator/controller/certificate/default_cert.gopkg/operator/controller/dns/controller.gopkg/operator/controller/ingress/controller.go
✅ Files skipped from review due to trivial changes (2)
- pkg/operator/controller/canary-certificate/controller.go
- pkg/operator/controller/dns/controller.go
|
Hypershift failures |
|
/retest-required |
|
Looks like we had one success recently for hypershift-e2e-aks. I think it's throttling related, so before things get crazy again: |
|
Infra failures |
|
/hold Revision 9fe57a5 was retested 3 times: holding |
|
e2e-aws-operator failed with: Not related - it's a DNS resolution timeout. I'll keep an eye on these failures. /test e2e-aws-operator |
|
/unhold |
|
install/infra failure |
|
The e2e-aws-ovn-hypershift-conformance pod resize failure is expected, and we have the OK to override: |
|
@gcs278: Overrode contexts on behalf of gcs278: ci/prow/e2e-aws-ovn-hypershift-conformance DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
/hold cancel |
|
teardown failure |
|
/retest-required |
|
/test e2e-aws-ovn-upgrade |
|
/retest-required |
|
/test e2e-aws-ovn
|
|
/test e2e-aws-ovn-upgrade |
|
@gcs278: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@gcs278: Jira Issue Verification Checks: Jira Issue OCPBUGS-79467 Jira Issue OCPBUGS-79467 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Fix included in accepted release 4.22.0-0.nightly-2026-04-01-092906 |
Summary
Fixes excessive log volume by changing the cluster-ingress-operator's default log level from DEBUG to INFO.
Problem
The operator was using
zap.NewDevelopment()which defaults to DEBUG level, causing ~2,000 debug logs per hour with Sail Library integration. This created significant noise in production log aggregation systems.Changes
Changed logger configuration (
pkg/log/log.go):NewDevelopmentConfig()with explicit INFO levelPromoted important debug logs to INFO (
pkg/operator/controller/gateway-status/controller.go):Added INFO logs alongside event recorder calls (7 controller files):
recorder.Event()calls to DEBUG logslog.Info()calls for 16 events including certificate lifecycle, IngressController admission, and DNS warningsoc get events)Implementation Approach
We chose to keep
NewDevelopmentConfig()(with INFO level) instead ofNewProduction()to maintain the human-readable console logging format.NewProduction would use JSON encoding:
{"level":"info","ts":1774378939.054826,"logger":"operator.gatewayclass_controller","msg":"..."}Our approach keeps console format:
Console format is more appropriate for OpenShift operators where cluster admins often debug via
oc logsrather than log aggregation systems. This preserves familiar formatting while reducing log volume.Impact
Testing
make buildpassedmake testpassedmake verifypassedRelated
Fixes: OCPBUGS-79467
🤖 Generated with Claude Code via
/jira:solve OCPBUGS-79467 origin