
OCPBUGS-79467: Change default log level from DEBUG to INFO #1402

Merged
openshift-merge-bot[bot] merged 2 commits into openshift:master from gcs278:fix-OCPBUGS-79467
Mar 31, 2026

Conversation

@gcs278
Contributor

@gcs278 gcs278 commented Mar 24, 2026

Summary

Fixes excessive log volume by changing the cluster-ingress-operator's default log level from DEBUG to INFO.

Problem

The operator was using zap.NewDevelopment() which defaults to DEBUG level, causing ~2,000 debug logs per hour with Sail Library integration. This created significant noise in production log aggregation systems.

Changes

  1. Changed logger configuration (pkg/log/log.go):

    • Use NewDevelopmentConfig() with explicit INFO level
    • Keeps console encoding for human-readable logs
    config := zap.NewDevelopmentConfig()
    config.Level = zap.NewAtomicLevelAt(zapcore.InfoLevel)
  2. Promoted important debug logs to INFO (pkg/operator/controller/gateway-status/controller.go):

    • "no service was found for gateway" (V(1) → INFO)
    • "no dnsrecords found for gateway" (V(1) → INFO)
    • These messages are important for troubleshooting Gateway deployment issues
  3. Added INFO logs alongside event recorder calls (7 controller files):

    • Controller-runtime echoes recorder.Event() calls to DEBUG logs
    • After switching to INFO level, these operational events became invisible in logs
    • Added explicit log.Info() calls for 16 events including certificate lifecycle, IngressController admission, and DNS warnings
    • Note: Events still exist as Kubernetes Events (visible via oc get events)
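
Why the V(1) messages disappear at the new default can be sketched with plain integers. The sketch assumes the usual zapr convention, where a logr verbosity V(n) maps to zap level -n, and zap's numeric levels, where DebugLevel is -1 and InfoLevel is 0. The names zapLevelForV and visible are hypothetical helpers for illustration only; they are not part of the operator or of zap.

```go
package main

import "fmt"

// Numeric values mirror zapcore.DebugLevel and zapcore.InfoLevel.
const (
	debugLevel = -1
	infoLevel  = 0
)

// zapLevelForV models the zapr convention: logr V(n) maps to zap level -n.
// Hypothetical helper, written only for this illustration.
func zapLevelForV(v int) int {
	return -v
}

// visible reports whether a message at msgLevel passes a logger whose
// minimum enabled level is threshold.
func visible(threshold, msgLevel int) bool {
	return msgLevel >= threshold
}

func main() {
	// Old default: DEBUG threshold, so V(1) messages are printed.
	fmt.Println(visible(debugLevel, zapLevelForV(1)))
	// New default: INFO threshold, so V(1) messages are dropped.
	fmt.Println(visible(infoLevel, zapLevelForV(1)))
	// Plain log.Info is V(0), which still passes the INFO threshold.
	fmt.Println(visible(infoLevel, zapLevelForV(0)))
}
```

Under these assumptions, promoting a message from V(1) to plain Info is what keeps it visible after the default changes.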

Implementation Approach

We chose to keep NewDevelopmentConfig() (with INFO level) instead of NewProduction() to maintain the human-readable console logging format.

NewProduction would use JSON encoding:

{"level":"info","ts":1774378939.054826,"logger":"operator.gatewayclass_controller","msg":"..."}

Our approach keeps console format:

2026-03-24T19:04:06.963Z  INFO  operator.gatewayclass_controller  msg...

Console format is more appropriate for OpenShift operators where cluster admins often debug via oc logs rather than log aggregation systems. This preserves familiar formatting while reducing log volume.
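
The separation between "which level is enabled" and "which encoding is used" can be tried out without zap. The sketch below uses the standard library's log/slog purely as a stand-in (the operator itself uses zap, and slog's output format differs from zap's); render and lineCount are hypothetical helpers, and the message texts are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// render emits one DEBUG and one INFO message through a handler whose
// minimum level is INFO, and returns everything that was written.
func render(asJSON bool) string {
	var buf bytes.Buffer
	opts := &slog.HandlerOptions{Level: slog.LevelInfo}

	var h slog.Handler
	if asJSON {
		h = slog.NewJSONHandler(&buf, opts) // machine-oriented encoding
	} else {
		h = slog.NewTextHandler(&buf, opts) // human-oriented encoding
	}

	logger := slog.New(h)
	logger.Debug("drift detection ran") // below the threshold: dropped
	logger.Info("gateway reconciled")   // at the threshold: kept
	return buf.String()
}

// lineCount reports how many log lines were written.
func lineCount(out string) int {
	return strings.Count(out, "\n")
}

func main() {
	for _, asJSON := range []bool{false, true} {
		out := render(asJSON)
		fmt.Printf("lines=%d %s", lineCount(out), out)
	}
}
```

Both encodings drop the DEBUG message; only the shape of the surviving INFO line changes, which is the same trade-off described above for NewDevelopmentConfig() versus NewProduction().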

Impact

  • Before: ~2,000 debug logs/hour with Sail Library
  • After: Estimated <200 INFO+ logs/hour
  • Sail Library debug logs (drift detection, ownership checks) will no longer appear
  • Important operational messages remain visible
  • Log format unchanged - same human-readable console format as before

Testing

  • ✅ Build verification: make build passed
  • ✅ Unit tests: make test passed
  • ✅ Code verification: make verify passed

Related

Fixes: OCPBUGS-79467

🤖 Generated with Claude Code via /jira:solve OCPBUGS-79467 origin

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 24, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot openshift-ci-robot added jira/severity-low Referenced Jira bug's severity is low for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels Mar 24, 2026
@openshift-ci-robot
Contributor

@gcs278: This pull request references Jira Issue OCPBUGS-79467, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Summary

Fixes excessive log volume by changing the cluster-ingress-operator's default log level from DEBUG to INFO.

Problem

The operator was using zap.NewDevelopment() which defaults to DEBUG level, causing ~2,000 debug logs per hour with Sail Library integration. This created significant noise in production log aggregation systems.

Changes

  1. Changed logger configuration (pkg/log/log.go):
  • Switched from zap.NewDevelopment() to zap.NewProduction()
  • Sets default log level to INFO instead of DEBUG
  2. Promoted important debug logs to INFO (pkg/operator/controller/gateway-status/controller.go):
  • "no service was found for gateway" (V(1) → INFO)
  • "no dnsrecords found for gateway" (V(1) → INFO)
  • These messages are important for troubleshooting Gateway deployment issues

Impact

  • Before: ~2,000 debug logs/hour with Sail Library
  • After: Estimated <200 INFO+ logs/hour
  • Sail Library debug logs (drift detection, ownership checks) will no longer appear
  • Important operational messages remain visible

Testing

  • ✅ Build verification: make build passed
  • ✅ Unit tests: make test passed
  • ✅ Code verification: make verify passed

Related

Fixes: OCPBUGS-79467

🤖 Generated with Claude Code via /jira:solve OCPBUGS-79467 origin

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Mar 24, 2026
@coderabbitai

coderabbitai bot commented Mar 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: aabf73cb-cc60-4dba-a342-a49947bfa8be

📥 Commits

Reviewing files that changed from the base of the PR and between 08e4271 and 9fe57a5.

📒 Files selected for processing (8)
  • pkg/operator/controller/canary-certificate/controller.go
  • pkg/operator/controller/canary/daemonset.go
  • pkg/operator/controller/certificate-publisher/publish_ca.go
  • pkg/operator/controller/certificate-publisher/publish_certs.go
  • pkg/operator/controller/certificate/ca.go
  • pkg/operator/controller/certificate/default_cert.go
  • pkg/operator/controller/dns/controller.go
  • pkg/operator/controller/ingress/controller.go
✅ Files skipped from review due to trivial changes (4)
  • pkg/operator/controller/canary-certificate/controller.go
  • pkg/operator/controller/dns/controller.go
  • pkg/operator/controller/certificate/default_cert.go
  • pkg/operator/controller/ingress/controller.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/operator/controller/canary/daemonset.go
  • pkg/operator/controller/certificate/ca.go
  • pkg/operator/controller/certificate-publisher/publish_ca.go

📝 Walkthrough

Walkthrough

Logger construction was changed to use zap.NewDevelopmentConfig() with config.Level set to zap.NewAtomicLevelAt(zapcore.InfoLevel) before config.Build(...). Multiple controllers had informational logging added or elevated: gateway-status, canary (daemonset and certificate), canary-certificate, certificate-publisher (CA and certs), certificate (CA and default cert), DNS, and ingress. Several messages were moved from verbose to unconditional log.Info. No exported function or type signatures were changed.


@gcs278 gcs278 force-pushed the fix-OCPBUGS-79467 branch 3 times, most recently from 2691c56 to 0b27052 Compare March 24, 2026 19:13
@gcs278 gcs278 marked this pull request as ready for review March 24, 2026 19:21
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2026
@openshift-ci openshift-ci bot requested review from frobware and rikatz March 24, 2026 19:21
@Miciah
Contributor

Miciah commented Mar 24, 2026

/assign

@Miciah
Contributor

Miciah commented Mar 24, 2026

We chose to keep NewDevelopmentConfig() (with INFO level) instead of NewProduction() to maintain the human-readable console logging format. While NewProduction() optimizes for machine parsing with JSON encoding and includes log sampling, console format is more appropriate [...]

This preserves the familiar log format (ISO8601 timestamps, uppercase levels, console encoding) while reducing log volume by filtering DEBUG messages.

It is interesting that NewProduction() uses a different log format. Out of curiosity, do you have a sample of the log output with NewProduction()? In any case, I agree with the choice to preserve the log format and just suppress the excessive log messages.

/approve
/lgtm

Please do make sure to take a look at the CI artifacts once they are available, and compare the ingress-operator log output with the same in the CI artifacts for some pre-#1354 PR.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 24, 2026
@openshift-ci
Contributor

openshift-ci bot commented Mar 24, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2026
@gcs278
Contributor Author

gcs278 commented Mar 24, 2026

Thanks @Miciah - Updated the PR description to also include that.

I took a look at the artifacts and did a quick comparison; it looks like the debug logs are gone. I also tested locally with Sail Library to make sure it stops logging at the Debug level.

We do get some event recorder logs removed from controller runtime DEBUG logs too I noticed:

2026-03-24T16:24:37.196Z      DEBUG   operator.init.events    record/event.go:418     created canary certificate      {"type": "Normal", "object":
  {"kind":"Secret","namespace":"openshift-ingress-canary","name":"canary-serving-cert","uid":"5761d0e3-61c9-47aa-a561-9d1128c6b1a5","apiVersion":"v1","resourceVersion":"22909"}, "reason":
  "CreatedCanaryCertificate"}

Let me know if you think that's an issue, but my assumption is that it's DEBUG, and not intended for production visibility.

@gcs278
Contributor Author

gcs278 commented Mar 24, 2026

Infrastructure flake while creating a test pod. I wonder if we need to add a retry when creates time out:

=== RUN   TestAll/serial/TestCanaryRoute
    canary_test.go:47: Checking that the default ingresscontroller is ready...
    canary_test.go:57: Getting the default ingresscontroller deployment...
    canary_test.go:63: Getting the canary route...
    canary_test.go:86: Failed to create pod openshift-ingress-canary/canary-route-check: Internal error occurred: admission plugin "image.openshift.io/ImagePolicy" failed to complete

/test e2e-aws-operator

@Miciah
Contributor

Miciah commented Mar 25, 2026

We do get some event recorder logs removed from controller runtime DEBUG logs too I noticed:

2026-03-24T16:24:37.196Z      DEBUG   operator.init.events    record/event.go:418     created canary certificate      {"type": "Normal", "object":
  {"kind":"Secret","namespace":"openshift-ingress-canary","name":"canary-serving-cert","uid":"5761d0e3-61c9-47aa-a561-9d1128c6b1a5","apiVersion":"v1","resourceVersion":"22909"}, "reason":
  "CreatedCanaryCertificate"}

Let me know if you think that's an issue, but my assumption is that it's DEBUG, and not intended for production visibility.

Losing the "created canary certificate" log message isn't ideal. We should still get the event, but it is useful to get the log message as well.

From a quick look at the code, it appears that we are using the deprecated event broadcaster implementation, and it logs events with V(1).Info():

p.logger.V(1).Info(e.Message, "type", e.Type, "object", e.InvolvedObject, "reason", e.Reason)

In contrast, the new event broadcaster implementation just uses Info():

logger.Info("Event occurred", "object", klog.KRef(event.Regarding.Namespace, event.Regarding.Name), "kind", event.Regarding.Kind, "apiVersion", event.Regarding.APIVersion, "type", event.Type, "reason", event.Reason, "action", event.Action, "note", event.Note)

So I think updating to the new event recorder might get us back the log message for the "created canary certificate" event. The following patch should suffice.

Details
diff --git a/pkg/operator/controller/canary-certificate/controller.go b/pkg/operator/controller/canary-certificate/controller.go
index 0f20eac15..7593d0f88 100644
--- a/pkg/operator/controller/canary-certificate/controller.go
+++ b/pkg/operator/controller/canary-certificate/controller.go
@@ -18,7 +18,7 @@ import (
 	"k8s.io/apimachinery/pkg/api/errors"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 	"k8s.io/apimachinery/pkg/types"
-	"k8s.io/client-go/tools/record"
+	"k8s.io/client-go/tools/events"
 
 	"sigs.k8s.io/controller-runtime/pkg/client"
 	"sigs.k8s.io/controller-runtime/pkg/controller"
@@ -48,7 +48,7 @@ type Config struct {
 type reconciler struct {
 	config   Config
 	client   client.Client
-	recorder record.EventRecorder
+	recorder events.EventRecorder
 }
 
 // New creates the canary certificate controller
@@ -61,7 +61,7 @@ func New(mgr manager.Manager, config Config) (controller.Controller, error) {
 	reconciler := &reconciler{
 		config:   config,
 		client:   mgr.GetClient(),
-		recorder: mgr.GetEventRecorderFor(canaryCertControllerName),
+		recorder: mgr.GetEventRecorder(canaryCertControllerName),
 	}
 	c, err := controller.New(canaryCertControllerName, mgr, controller.Options{Reconciler: reconciler})
 	if err != nil {
@@ -206,7 +206,7 @@ func (r *reconciler) createCanaryCertificate(ctx context.Context, certificate *c
 		return err
 	}
 
-	r.recorder.Event(certificate, "Normal", "CreatedCanaryCertificate", "created canary certificate")
+	r.recorder.Eventf(certificate, nil, corev1.EventTypeNormal, "Created", "CreatedCanaryCertificate", "Created canary certificate")
 	return nil
 }
 
@@ -220,7 +220,7 @@ func (r *reconciler) updateCanaryCertificate(ctx context.Context, current, desir
 	if err := r.client.Update(ctx, updated); err != nil {
 		return false, err
 	}
-	r.recorder.Event(updated, "Normal", "UpdatedCanaryCertificate", "updated canary certificate")
+	r.recorder.Eventf(updated, nil, corev1.EventTypeNormal, "Updated", "UpdatedCanaryCertificate", "updated canary certificate")
 	return true, nil
 }

@coderabbitai, is the above analysis correct?

@coderabbitai

coderabbitai bot commented Mar 25, 2026


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the deprecated recorder implementation
cat vendor/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go | grep -n "V(1)\|Info\|func.*Event" | head -40

# Check the new event broadcaster
cat vendor/k8s.io/client-go/tools/events/event_broadcaster.go | grep -n "Info\|func.*Event\|log\." | head -40

# Check the canary-certificate controller
cat pkg/operator/controller/canary-certificate/controller.go

# Check if mgr.GetEventRecorder exists (new API)
grep -rn "GetEventRecorder\b" vendor/sigs.k8s.io/controller-runtime/ | head -20

# Check if mgr.GetEventRecorderFor exists (old API)
grep -rn "GetEventRecorderFor" vendor/sigs.k8s.io/controller-runtime/ | head -20

# Check events.EventRecorder interface Eventf signature
grep -n "Eventf\|EventRecorder" vendor/k8s.io/client-go/tools/events/interfaces.go 2>/dev/null || \
grep -rn "type EventRecorder interface" vendor/k8s.io/client-go/tools/events/ | head -10

# Look at the EventRecorder interface definition
find vendor/k8s.io/client-go/tools/events/ -name "*.go" | xargs grep -n "type EventRecorder\|Eventf(" | head -20

Length of output: 9763


🏁 Script executed:

#!/bin/bash
# Fetch the deprecated recorder implementation from GitHub
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go" | grep -n "V(1)\|\.Info\|func.*Event\|log\." | head -40

echo "---NEW BROADCASTER---"
# Fetch the new event broadcaster from GitHub
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/k8s.io/client-go/tools/events/event_broadcaster.go" | grep -n "\.Info\|log\.\|func.*Event" | head -40

echo "---MANAGER INTERFACE---"
# Check if GetEventRecorder (new) exists on the manager interface
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go" | grep -n "GetEventRecorder\|EventRecorder" | head -20

echo "---EVENTS EVENTRECORDER INTERFACE---"
# Check events.EventRecorder interface definition
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/k8s.io/client-go/tools/events/interfaces.go" | grep -n "Eventf\|EventRecorder\|type " | head -30

Length of output: 5182


🏁 Script executed:

#!/bin/bash
# Check manager interface more thoroughly
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go" | grep -n "GetEvent\|EventRecorder\|Recorder" | head -30

echo "---MANAGER INTERFACE FULL---"
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/sigs.k8s.io/controller-runtime/pkg/manager/manager.go" | grep -n "func\|interface\|Manager" | head -50

echo "---RECORDER PROVIDER FULL CONTEXT around line 115-135---"
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go" | sed -n '100,175p'

echo "---EVENTS EVENTRECORDER INTERNAL INTERFACE---"
# Check the internal events package that events.EventRecorder aliases
curl -s "https://raw.githubusercontent.com/openshift/cluster-ingress-operator/0b27052d94b042d1f492788406a530087e943590/vendor/k8s.io/client-go/tools/events/interfaces.go" | head -50

Length of output: 9148


@Miciah: The root cause analysis is correct — the deprecated record.EventRecorder path logs events at V(1), which is now suppressed under the new INFO default. However, the proposed fix has a subtle issue.

The new broadcaster in controller-runtime's Provider also logs at V(1), not plain Info().

Looking at vendor/sigs.k8s.io/controller-runtime/pkg/internal/recorder/recorder.go, the getBroadcaster() function sets up event watchers for both broadcasters:

// init deprecated broadcaster
p.deprecatedBroadcaster.StartEventWatcher(
    func(e *corev1.Event) {
        p.logger.V(1).Info(e.Message, "type", e.Type, "object", e.InvolvedObject, "reason", e.Reason)
    })

// init NEW broadcaster — also V(1)!
stopWatcher, err := p.broadcaster.StartEventWatcher(func(event runtime.Object) {
    e, isEvt := event.(*eventsv1.Event)
    if isEvt {
        p.logger.V(1).Info(e.Note, "type", e.Type, "object", e.Related, "action", e.Action, "reason", e.Reason)
    }
})

The event_broadcaster.go#L338 line you referenced (logger.Info(...)) is inside the StartLogging() method, but controller-runtime's Provider does not call StartLogging() — it uses StartEventWatcher() with its own V(1) callback for both the old and new broadcasters.

So switching from GetEventRecorderFor() to GetEventRecorder() would:

  • ✅ Use the non-deprecated API (beneficial in itself)
  • ❌ Still log the "created canary certificate" event message at V(1) — it would remain suppressed at the INFO default level

There is also a potential compilation issue to verify: mgr.GetEventRecorder() may not be exposed on the manager.Manager interface (the interface definition in manager.go showed no EventRecorder-related methods), whereas GetEventRecorderFor() is definitely available.

To actually restore the log message at INFO level, the simplest approach would be to add an explicit log.Info() call alongside the recorder.Event() call:

func (r *reconciler) createCanaryCertificate(ctx context.Context, certificate *corev1.Secret) error {
    if err := r.client.Create(ctx, certificate); err != nil {
        return err
    }
    r.recorder.Event(certificate, "Normal", "CreatedCanaryCertificate", "created canary certificate")
    log.Info("created canary certificate", "secret", certificate.Name)
    return nil
}

This keeps the Kubernetes event (for kubectl get events) and also logs the message at INFO level regardless of the broadcaster implementation.
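
One way to keep each event and its log line from drifting apart is to pair the two calls behind a small helper. The sketch below is only an illustration under stated assumptions: eventRecorder and infoLogger are hypothetical local interfaces that stand in for record.EventRecorder and the operator's logger, with the object reduced to a plain name, and recordAndLog and fakeSink are invented for this example. The PR itself simply adds log.Info calls next to the existing recorder.Event calls.

```go
package main

import "fmt"

// eventRecorder is a hypothetical stand-in for the Event method of
// client-go's record.EventRecorder, with the object reduced to a name.
type eventRecorder interface {
	Event(objectName, eventType, reason, message string)
}

// infoLogger is a hypothetical stand-in for a logr-style Info method.
type infoLogger interface {
	Info(msg string, keysAndValues ...any)
}

// recordAndLog emits the event and then logs the same message at INFO, so
// the message stays visible even if the event is only echoed at V(1).
func recordAndLog(rec eventRecorder, log infoLogger, objectName, reason, message string) {
	rec.Event(objectName, "Normal", reason, message)
	log.Info(message, "name", objectName, "reason", reason)
}

// fakeSink records calls in memory so the behavior can be inspected.
type fakeSink struct {
	events []string
	logs   []string
}

func (f *fakeSink) Event(objectName, eventType, reason, message string) {
	f.events = append(f.events, fmt.Sprintf("%s %s %s: %s", eventType, reason, objectName, message))
}

func (f *fakeSink) Info(msg string, keysAndValues ...any) {
	f.logs = append(f.logs, fmt.Sprintf("%s %v", msg, keysAndValues))
}

func main() {
	sink := &fakeSink{}
	recordAndLog(sink, sink, "canary-serving-cert", "CreatedCanaryCertificate", "created canary certificate")
	fmt.Println(sink.events)
	fmt.Println(sink.logs)
}
```

Whether a helper like this is worth having over two adjacent calls is a judgment call; with only a handful of call sites, the explicit pair shown above is arguably easier to read.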

@gcs278
Contributor Author

gcs278 commented Mar 25, 2026

/hold
While I try to understand how to preserve event logging.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 25, 2026
@Miciah
Contributor

Miciah commented Mar 25, 2026

While I try to understand how to preserve event logging.

I'm fine with @coderabbitai's suggestion to add a log.Info accompanying the r.recorder.Event.

@coderabbitai

coderabbitai bot commented Mar 25, 2026


🧩 Analysis chain
  • Add log.Info(...) after each r.recorder.Event(...) call in pkg/operator/controller/canary-certificate/controller.go to preserve visibility of certificate events at INFO level.

🏁 Script executed:

#!/bin/bash
# Read the canary-certificate controller to confirm logger usage and exact context
cat pkg/operator/controller/canary-certificate/controller.go

Length of output: 9281


🏁 Script executed:

#!/bin/bash
# Get the exact lines around recorder.Event calls
grep -n "recorder.Event\|log\.\|import\|\"log\"\|logf\|logr" pkg/operator/controller/canary-certificate/controller.go | head -40

Length of output: 446


@Miciah, @gcs278: Here's the concrete change to add log.Info alongside both r.recorder.Event calls:

diff --git a/pkg/operator/controller/canary-certificate/controller.go b/pkg/operator/controller/canary-certificate/controller.go
--- a/pkg/operator/controller/canary-certificate/controller.go
+++ b/pkg/operator/controller/canary-certificate/controller.go
@@ -207,6 +207,7 @@ func (r *reconciler) createCanaryCertificate(ctx context.Context, certificate *c
 	}
 
 	r.recorder.Event(certificate, "Normal", "CreatedCanaryCertificate", "created canary certificate")
+	log.Info("created canary certificate", "secret", certificate.Name)
 	return nil
 }
 
@@ -220,6 +221,7 @@ func (r *reconciler) updateCanaryCertificate(ctx context.Context, current, desir
 	}
 	r.recorder.Event(updated, "Normal", "UpdatedCanaryCertificate", "updated canary certificate")
+	log.Info("updated canary certificate", "secret", updated.Name)
 	return true, nil
 }

This uses the existing package-level log variable (already scoped to canaryCertControllerName) and follows the same structured-logging convention used elsewhere in the file. No new imports are needed.



@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 25, 2026

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
pkg/operator/controller/ingress/controller.go (1)

341-342: Consider adding IngressController identity to the log message.

The log lacks context about which IngressController was admitted. Adding the name would improve log traceability, consistent with other log statements in this file (e.g., line 391, 414).

♻️ Suggested improvement
-		log.Info("ingresscontroller admitted")
+		log.Info("ingresscontroller admitted", "namespace", ingress.Namespace, "name", ingress.Name)
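
The effect of attaching identity as key/value pairs can be seen in a small stand-alone sketch. It uses the standard library's log/slog as a stand-in for the operator's logger (an assumption made so the example runs without dependencies; the rendered format is slog's, not the operator's), and the namespace and name passed in main are illustrative values only. dropTime and admitted are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// dropTime removes the top-level timestamp so the output is stable.
func dropTime(groups []string, a slog.Attr) slog.Attr {
	if len(groups) == 0 && a.Key == slog.TimeKey {
		return slog.Attr{}
	}
	return a
}

// admitted logs the admission message with the object's identity attached
// as key/value pairs and returns the rendered line.
func admitted(namespace, name string) string {
	var buf bytes.Buffer
	h := slog.NewTextHandler(&buf, &slog.HandlerOptions{ReplaceAttr: dropTime})
	slog.New(h).Info("ingresscontroller admitted", "namespace", namespace, "name", name)
	return strings.TrimSpace(buf.String())
}

func main() {
	// Illustrative identity values, not taken from the PR.
	fmt.Println(admitted("example-namespace", "default"))
}
```

With the pairs attached, a reader scanning the log can tell which object each line refers to without correlating timestamps against events.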
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/operator/controller/ingress/controller.go` around lines 341 - 342, The
log message "ingresscontroller admitted" should include the IngressController
identity for traceability: update the log.Info call that currently logs
admission (and keep r.recorder.Event as-is) to include the ingress name (e.g.,
use ingress.Name or a formatted identifier from the ingress object) so the
message matches other contextual logs in this file (see existing logs around the
admission path such as those at lines referencing the same ingress variable).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 6521cccd-b77e-4373-9016-81fb31b76596

📥 Commits

Reviewing files that changed from the base of the PR and between 0b27052 and 34e6c81.

📒 Files selected for processing (8)
  • pkg/operator/controller/canary-certificate/controller.go
  • pkg/operator/controller/canary/daemonset.go
  • pkg/operator/controller/certificate-publisher/publish_ca.go
  • pkg/operator/controller/certificate-publisher/publish_certs.go
  • pkg/operator/controller/certificate/ca.go
  • pkg/operator/controller/certificate/default_cert.go
  • pkg/operator/controller/dns/controller.go
  • pkg/operator/controller/ingress/controller.go
✅ Files skipped from review due to trivial changes (2)
  • pkg/operator/controller/canary-certificate/controller.go
  • pkg/operator/controller/dns/controller.go

@gcs278
Contributor Author

gcs278 commented Mar 27, 2026

Hypershift failures
/retest

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD d16dd45 and 2 for PR HEAD 9fe57a5 in total

@rhamini3
Contributor

/retest-required

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD dde7a93 and 1 for PR HEAD 9fe57a5 in total

@gcs278
Contributor Author

gcs278 commented Mar 29, 2026

Looks like we had one success recently for hypershift-e2e-aks. I think it's throttling related, so before things get crazy again:
/test hypershift-e2e-aks

@gcs278
Contributor Author

gcs278 commented Mar 30, 2026

Infra failures
/retest

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 019903c and 0 for PR HEAD 9fe57a5 in total

@openshift-ci-robot
Contributor

/hold

Revision 9fe57a5 was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2026
@gcs278
Contributor Author

gcs278 commented Mar 30, 2026

e2e-aws-operator failed with:

=== NAME  TestAll/parallel/TestManagedDNSToUnmanagedDNSIngressController
    util_test.go:860: waiting for loadbalancer domain a95a49f4b86944c668c1ad577eabfe57-1701475338.us-east-1.elb.amazonaws.com to resolve...
    util_test.go:860: waiting for loadbalancer domain a95a49f4b86944c668c1ad577eabfe57-1701475338.us-east-1.elb.amazonaws.com to resolve...
    unmanaged_dns_test.go:146: loadbalancer domain a95a49f4b86944c668c1ad577eabfe57-1701475338.us-east-1.elb.amazonaws.com was unable to resolve: timed out waiting for the condition

Not related - it's a DNS resolution timeout. I'll keep an eye on these failures.

/test e2e-aws-operator

@gcs278
Contributor Author

gcs278 commented Mar 30, 2026

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2026
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD d212fd1 and 2 for PR HEAD 9fe57a5 in total

@gcs278
Contributor Author

gcs278 commented Mar 30, 2026

install/infra failure
/test e2e-gcp-operator

@gcs278
Contributor Author

gcs278 commented Mar 30, 2026

The e2e-aws-ovn-hypershift-conformance pod resize failure is expected, and we have the OK to override:
/override ci/prow/e2e-aws-ovn-hypershift-conformance

@openshift-ci
Contributor

openshift-ci bot commented Mar 30, 2026

@gcs278: Overrode contexts on behalf of gcs278: ci/prow/e2e-aws-ovn-hypershift-conformance

Details

In response to this:

The e2e-aws-ovn-hypershift-conformance pod resize failure is expected, and we have the OK to override:
/override ci/prow/e2e-aws-ovn-hypershift-conformance

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@gcs278
Contributor Author

gcs278 commented Mar 30, 2026

/hold

Going to wait for #1404 to merge. There's no dependency, but that one has priority, and if this merges first, we have to spin CI in #1404 again 😵

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2026
@rikatz
Member

rikatz commented Mar 30, 2026

/hold cancel
/retest-required

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 30, 2026
@gcs278
Contributor Author

gcs278 commented Mar 31, 2026

teardown failure
/test e2e-hypershift

@lihongan
Contributor

/retest-required

@lihongan
Contributor

/test e2e-aws-ovn-upgrade
/test e2e-gcp-operator
/test hypershift-e2e-aks

@rikatz
Member

rikatz commented Mar 31, 2026

/retest-required

@lihongan
Contributor

/test e2e-aws-ovn

The "HAProxy router should pass the http2 tests" test failed.

@lihongan
Contributor

/test e2e-aws-ovn-upgrade

@openshift-ci
Contributor

openshift-ci bot commented Mar 31, 2026

@gcs278: all tests passed!

Full PR test history. Your PR dashboard.

@openshift-merge-bot openshift-merge-bot bot merged commit 2f0bb7f into openshift:master Mar 31, 2026
19 checks passed
@openshift-ci-robot
Contributor

@gcs278: Jira Issue Verification Checks: Jira Issue OCPBUGS-79467
✔️ This pull request was pre-merge verified.
✔️ All associated pull requests have merged.
✔️ All associated, merged pull requests were pre-merge verified.

Jira Issue OCPBUGS-79467 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓

Details

In response to this:

Summary

Fixes excessive log volume by changing the cluster-ingress-operator's default log level from DEBUG to INFO.

Problem

The operator was using zap.NewDevelopment() which defaults to DEBUG level, causing ~2,000 debug logs per hour with Sail Library integration. This created significant noise in production log aggregation systems.

Changes

  1. Changed logger configuration (pkg/log/log.go):
     • Use NewDevelopmentConfig() with explicit INFO level
     • Keeps console encoding for human-readable logs

         config := zap.NewDevelopmentConfig()
         config.Level = zap.NewAtomicLevelAt(zapcore.InfoLevel)

  2. Promoted important debug logs to INFO (pkg/operator/controller/gateway-status/controller.go):
     • "no service was found for gateway" (V(1) → INFO)
     • "no dnsrecords found for gateway" (V(1) → INFO)
     • These messages are important for troubleshooting Gateway deployment issues
  3. Added INFO logs alongside event recorder calls (7 controller files):
     • Controller-runtime echoes recorder.Event() calls to DEBUG logs
     • After switching to INFO level, these operational events became invisible in logs
     • Added explicit log.Info() calls for 16 events including certificate lifecycle, IngressController admission, and DNS warnings
     • Note: Events still exist as Kubernetes Events (visible via oc get events)
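
The third change above (pairing each recorder call with an explicit INFO log) could look roughly like the sketch below. It is illustrative only: eventRecorder and infoLogger are hypothetical stand-ins for the real event recorder and logger types, and recordAndLog, memorySink, and the "CertificateCreated" reason are made-up names, not identifiers from this PR.

```go
package main

import "fmt"

// eventRecorder is a hypothetical stand-in for a Kubernetes event recorder.
type eventRecorder interface {
	Event(eventType, reason, message string)
}

// infoLogger is a hypothetical stand-in for an INFO-level structured logger.
type infoLogger interface {
	Info(msg string, keysAndValues ...any)
}

// recordAndLog emits the event and mirrors it to the log at INFO, so the
// message stays visible in pod logs even when DEBUG output is suppressed.
func recordAndLog(rec eventRecorder, log infoLogger, eventType, reason, message string) {
	rec.Event(eventType, reason, message)
	log.Info(message, "type", eventType, "reason", reason)
}

// memorySink implements both interfaces and keeps everything in memory,
// which makes the behaviour easy to inspect in this sketch.
type memorySink struct {
	events []string
	lines  []string
}

func (s *memorySink) Event(eventType, reason, message string) {
	s.events = append(s.events, fmt.Sprintf("%s %s: %s", eventType, reason, message))
}

func (s *memorySink) Info(msg string, keysAndValues ...any) {
	s.lines = append(s.lines, fmt.Sprint("INFO ", msg, " ", keysAndValues))
}

func main() {
	sink := &memorySink{}
	recordAndLog(sink, sink, "Normal", "CertificateCreated", "created default certificate")
	fmt.Println(sink.events)
	fmt.Println(sink.lines)
}
```

A single helper keeps the event and the log message from drifting apart; whether the PR used a helper or added the log.Info() calls inline is not stated here.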

Implementation Approach

We chose to keep NewDevelopmentConfig() (with INFO level) instead of NewProduction() to maintain the human-readable console logging format.

NewProduction would use JSON encoding:

{"level":"info","ts":1774378939.054826,"logger":"operator.gatewayclass_controller","msg":"..."}

Our approach keeps console format:

2026-03-24T19:04:06.963Z  INFO  operator.gatewayclass_controller  msg...

Console format is more appropriate for OpenShift operators where cluster admins often debug via oc logs rather than log aggregation systems. This preserves familiar formatting while reducing log volume.

Impact

  • Before: ~2,000 debug logs/hour with Sail Library
  • After: Estimated <200 INFO+ logs/hour
  • Sail Library debug logs (drift detection, ownership checks) will no longer appear
  • Important operational messages remain visible
  • Log format unchanged - same human-readable console format as before

Testing

  • ✅ Build verification: make build passed
  • ✅ Unit tests: make test passed
  • ✅ Code verification: make verify passed

Related

Fixes: OCPBUGS-79467

🤖 Generated with Claude Code via /jira:solve OCPBUGS-79467 origin

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot
Contributor

Fix included in accepted release 4.22.0-0.nightly-2026-04-01-092906

Labels

approved, jira/severity-low, jira/valid-bug, jira/valid-reference, lgtm, verified

7 participants