NETOBSERV-2675: flp-informers deployment and wiring #2650
leandroberetta wants to merge 10 commits into netobserv:main
Conversation
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough
Adds a new FLP informers component: a dedicated reconciler and builder for the flp-informers deployment.
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 6
🧹 Nitpick comments (2)
internal/controller/flp/flp_controller_test.go (1)
173-174: Use the informerName constant instead of a string literal. Replace the hardcoded "flp-informers" in expectations with informerName to avoid drift if naming changes.
Also applies to: 336-337
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_controller_test.go` around lines 173 - 174, Replace the hardcoded string "flp-informers" in the test expectations with the existing informerName constant so the assertions use informerName instead of a literal; update the checks that reference rb3.Subjects[0].Name and any other assertions (e.g., the ones at the other occurrence around lines noted) to compare to informerName, leaving the RoleRef check (rb3.RoleRef.Name) as-is unless it should also be replaced by a corresponding constant.
internal/controller/flp/flp_informer_builder.go (1)
60-69: Consider making informer resources tunable. Hardcoding informer resources may under-size large clusters. Consider deriving them from config/CR defaults so limits can be adjusted without code changes.
As per coding guidelines "Monitor memory usage impact for large scale clusters - watch for memory growth with large flow volumes and optimize cacheMaxFlows, cacheActiveTimeout, writeBatchWait, writeBatchSize settings".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 60 - 69, The ResourceRequirements for the informer are hardcoded (corev1.ResourceRequirements with fixed Requests/Limits); make these values configurable by reading from the controller config or CR defaults instead of literals—e.g., add config/CR fields (or reuse existing controller config) for informer.memoryRequest, informer.cpuRequest, informer.memoryLimit, informer.cpuLimit, and use them when constructing the ResourceRequirements in flp_informer_builder.go (fall back to the current "128Mi"/"50m"/"256Mi"/"200m" defaults if unset); ensure the builder/constructor that creates the informer (the code that sets ResourceRequirements) accepts or fetches the config/CR and applies those values so operators can tune cacheMaxFlows/cacheActiveTimeout/writeBatchWait/writeBatchSize and informer resource limits without code changes.
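A minimal sketch of that direction, assuming a Resources field is added under spec.processor.informers (the field name is hypothetical at this point; corev1 and resource are imported as in the existing builder):

// informerResources returns the informer container resources from the CR when
// set, falling back to the current hardcoded values as defaults.
func informerResources(desired *flowslatest.FlowCollectorSpec) corev1.ResourceRequirements {
	if desired.Processor.Informers != nil && desired.Processor.Informers.Resources != nil {
		// User-provided values from the CR take precedence.
		return *desired.Processor.Informers.Resources
	}
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceMemory: resource.MustParse("128Mi"),
			corev1.ResourceCPU:    resource.MustParse("50m"),
		},
		Limits: corev1.ResourceList{
			corev1.ResourceMemory: resource.MustParse("256Mi"),
			corev1.ResourceCPU:    resource.MustParse("200m"),
		},
	}
}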
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bundle.Dockerfile`:
- Line 10: The LABEL
"operators.operatorframework.io.metrics.builder=operator-sdk-v1.40.0" shows a
downgrade to v1.40.0; verify and make the bundle's operator-sdk version
consistent across files (e.g., update this LABEL or
bundle/metadata/annotations.yaml) so all bundle files reference the same
operator-sdk version (ensure the LABEL value and the entry in
bundle/metadata/annotations.yaml match the intended version).
In `@bundle/manifests/netobserv-operator.clusterserviceversion.yaml`:
- Line 99: The CSV contains the builder annotation
operators.operatorframework.io/builder set to operator-sdk-v1.40.0 which
downgrades from v1.42.0; confirm intended SDK version and update the annotation
value (operators.operatorframework.io/builder) to the correct version (e.g.,
operator-sdk-v1.42.0) to match the rest of the bundle and your intended build
tooling so all CSVs use the same operator-sdk version.
In `@bundle/metadata/annotations.yaml`:
- Line 9: The bundle annotations indicate operator-sdk version
operator-sdk-v1.40.0 but the Makefile uses v1.42.0; update the bundle generation
to match the Makefile or vice versa: regenerate the bundle artifacts
(annotations.yaml, clusterserviceversion.yaml, bundle.Dockerfile) using
operator-sdk v1.42.0 so operators.operatorframework.io.metrics.builder reads
operator-sdk-v1.42.0, or change the Makefile/operator-sdk reference back to
v1.40.0 to match the existing bundle—ensure the versions are identical across
the Makefile and the generated bundle files.
In `@internal/controller/flp/flp_common_objects.go`:
- Around line 131-135: The code hardcodes a "k8scache" ContainerPort at 9090
which can conflict with user-configured FLP/health/metrics/profile ports; add
validation before Pod spec/container args are generated to detect any
user-configured port equal to 9090 (or vice-versa) and return an error instead
of proceeding. Locate the code that builds the ports slice (where ports =
append(... Name: "k8scache", ContainerPort: 9090 ...)) and the routine that
composes process args for the FLP/health/metrics/profile settings, then
implement a check that scans those config fields for equality with 9090 and
rejects the Pod creation with a clear error message if a collision is found.
Ensure the validation runs early (before creating corev1.ContainerPort entries)
and covers the other hardcoded occurrence mentioned (lines ~174-178).
In `@internal/controller/flp/flp_informer_builder.go`:
- Around line 39-70: The informer Container construction (variable "container"
created with Name informerName and Image b.Images[reconcilers.MainImage]) lacks
a hardened SecurityContext; add SecurityContext:
&corev1.SecurityContext{RunAsNonRoot: pointer.BoolPtr(true), RunAsUser:
pointer.Int64Ptr(1000), AllowPrivilegeEscalation: pointer.BoolPtr(false),
ReadOnlyRootFilesystem: pointer.BoolPtr(true), Privileged:
pointer.BoolPtr(false), Capabilities: &corev1.Capabilities{Drop:
[]corev1.Capability{"ALL"}}, SeccompProfile: &corev1.SeccompProfile{Type:
corev1.SeccompProfileTypeRuntimeDefault}} to that container definition to match
other FLP workloads, and apply the same changes to the other informer Container
construction in this file (the second container block).
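For reference, a sketch of the hardened SecurityContext described above, using k8s.io/utils/ptr for the boolean pointers; whether RunAsUser should be pinned depends on the target platform's SCC/PSA policy, so it is omitted here:

container.SecurityContext = &corev1.SecurityContext{
	RunAsNonRoot:             ptr.To(true),
	AllowPrivilegeEscalation: ptr.To(false),
	ReadOnlyRootFilesystem:   ptr.To(true),
	Privileged:               ptr.To(false),
	Capabilities: &corev1.Capabilities{
		// Drop all Linux capabilities, matching the other FLP workloads.
		Drop: []corev1.Capability{"ALL"},
	},
	SeccompProfile: &corev1.SeccompProfile{
		Type: corev1.SeccompProfileTypeRuntimeDefault,
	},
}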
In `@internal/controller/flp/flp_informer_reconciler.go`:
- Around line 54-57: Multiple reconciliation steps in flp_informer_reconciler.go
return bare errors (e.g., returns from r.Managed.FetchAll, r.ensureConfigured,
r.Lister.GetParentByName, r.Managed.Create/Update/Delete) which loses context;
update each return to wrap the underlying error with contextual text using the
standard error-wrapping pattern (e.g., fmt.Errorf("step X failed: %w", err) or
errors.Wrap) inside the Reconcile method (and any helper methods like
ensureConfigured) so callers see which reconcile sub-step failed—replace each
bare "return err" with a descriptive wrapped error referencing the failing call
(e.g., "fetch all managed resources", "ensure configured", "get parent by name",
"create managed resource", "update managed resource", etc.).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 93c5b91f-3f2c-44e2-bed2-f73d938ec6ea
📒 Files selected for processing (21)
- bundle.Dockerfile
- bundle/manifests/flows.netobserv.io_flowcollectors.yaml
- bundle/manifests/flows.netobserv.io_flowcollectorslices.yaml
- bundle/manifests/flows.netobserv.io_flowmetrics.yaml
- bundle/manifests/netobserv-operator.clusterserviceversion.yaml
- bundle/metadata/annotations.yaml
- config/crd/bases/flows.netobserv.io_flowcollectors.yaml
- config/crd/bases/flows.netobserv.io_flowcollectorslices.yaml
- config/crd/bases/flows.netobserv.io_flowmetrics.yaml
- helm/crds/flows.netobserv.io_flowcollectors.yaml
- helm/crds/flows.netobserv.io_flowcollectorslices.yaml
- helm/crds/flows.netobserv.io_flowmetrics.yaml
- internal/controller/flp/flp_common_objects.go
- internal/controller/flp/flp_controller.go
- internal/controller/flp/flp_controller_test.go
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_informer_reconciler.go
- internal/controller/flp/flp_monolith_objects.go
- internal/controller/flp/flp_monolith_reconciler.go
- internal/controller/flp/flp_transfo_reconciler.go
- internal/pkg/manager/status/status_manager.go
💤 Files with no reviewable changes (2)
- internal/controller/flp/flp_transfo_reconciler.go
- internal/controller/flp/flp_monolith_reconciler.go
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
🧹 Nitpick comments (5)
internal/controller/flp/flp_informer_reconciler.go (1)
60-64: Consider handling TryDeleteAll errors.
The TryDeleteAll return value is ignored. While "Try" implies best-effort, logging failures would aid debugging cleanup issues.
Suggested improvement
 if desired.Spec.OnHold() {
 	r.Status.SetUnused("FlowCollector is on hold")
-	r.Managed.TryDeleteAll(ctx)
+	if err := r.Managed.TryDeleteAll(ctx); err != nil {
+		log.FromContext(ctx).Error(err, "failed to delete some managed resources")
+	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_reconciler.go` around lines 60 - 64, When handling the OnHold branch, capture the error returned by r.Managed.TryDeleteAll(ctx) instead of ignoring it; assign it to a variable (e.g., err := r.Managed.TryDeleteAll(ctx)) and if non-nil emit a diagnostic (use the reconciler logger or status updater such as r.Log.Error(err, "failed to cleanup managed resources on hold") or r.Status.SetError/SetDegraded with context) so cleanup failures are recorded, then continue returning nil as before.
internal/controller/flp/flp_informer_builder.go (3)
60-68: Hardcoded resource limits. Resource requirements are fixed at 128Mi/256Mi memory and 50m/200m CPU. Other FLP components use desired.Processor.Resources from the CR spec. Consider exposing informer resources via config, or document why fixed values are appropriate here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 60 - 68, The informer is using hardcoded resource values in the Resources block; replace the fixed corev1.ResourceRequirements with the FLP spec's resource settings (use desired.Processor.Resources) or thread a configurable field through the informer builder so callers can supply resource requirements; update the builder function that constructs the informer pod (look for Resources: corev1.ResourceRequirements in flp_informer_builder.go) to read and apply desired.Processor.Resources (or a new InformerResources field) and ensure fallback/default values are documented or applied when the CR value is nil.
90-96: Pod template missing the part-of label. The Deployment ObjectMeta.Labels includes "part-of": constants.OperatorName but the pod template labels don't. Other FLP builders (monolith, transformer) include this label in their pod templates for consistency.
Suggested fix
 Template: corev1.PodTemplateSpec{
 	ObjectMeta: metav1.ObjectMeta{
 		Labels: map[string]string{
+			"part-of": constants.OperatorName,
 			"app":     informerName,
 			"version": version,
 		},
 	},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 90 - 96, The pod template under Template: corev1.PodTemplateSpec for the FLP informer builder is missing the "part-of" label; update the ObjectMeta.Labels map inside the PodTemplateSpec (the block that currently sets "app": informerName and "version": version) to also include "part-of": constants.OperatorName so the pod template matches the Deployment ObjectMeta.Labels and other FLP builders (monolith, transformer).
29-104: Method signature returns error but never fails.
deployment() returns (*appsv1.Deployment, error) but always returns a nil error. Either simplify the signature to return only *appsv1.Deployment, or add validation that could fail.
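If the validation route is preferred, a small guard along these lines would make the error return meaningful (identifiers follow the snippet quoted in the comment; the message text is illustrative):

// Fail early instead of always returning a nil error.
if b.Images[reconcilers.MainImage] == "" {
	return nil, fmt.Errorf("missing %q image for the %s deployment", reconcilers.MainImage, informerName)
}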
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 29 - 104, The deployment() method on informerBuilder currently returns (*appsv1.Deployment, error) but never returns a non-nil error; either remove the unused error return or add meaningful validation. To fix quickly, change informerBuilder.deployment signature to return *appsv1.Deployment (no error), update its return to only the Deployment value, and update all callers to accept the single return (search for calls to deployment() and adjust unpacking). Alternatively, if you prefer validation, add checks inside deployment (e.g., ensure b.Images[reconcilers.MainImage] is non-empty and b.desired is non-nil) and return an error when invalid instead of always nil; update callers accordingly to handle the error.
internal/controller/flp/flp_monolith_objects.go (1)
183-188: Use a constant and consistent type for the k8scache port. Line 185 hardcodes 9090 and line 187 uses intstr.FromInt() instead of intstr.FromInt32(). Use the k8scachePort constant for maintainability and match the int32 pattern used on line 181.
Suggested fix
 {
 	Name:       "k8scache",
-	Port:       9090,
+	Port:       k8scachePort,
 	Protocol:   corev1.ProtocolTCP,
-	TargetPort: intstr.FromInt(9090),
+	TargetPort: intstr.FromInt32(k8scachePort),
 },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_monolith_objects.go` around lines 183 - 188, Replace the hardcoded 9090 and the intstr.FromInt call with the k8scachePort constant and an int32 conversion so types are consistent: set Port to k8scachePort and TargetPort to intstr.FromInt32(int32(k8scachePort)) for the object with Name "k8scache" (the same pattern used on the earlier port definition).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5085d614-83f7-410f-bbeb-7e640cf41e4c
📒 Files selected for processing (6)
- internal/controller/flp/flp_common_objects.go
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_informer_reconciler.go
- internal/controller/flp/flp_monolith_objects.go
- internal/controller/flp/flp_test.go
- internal/controller/flp/flp_transfo_objects.go
jpinsonneau
left a comment
I feel we should make everything configurable and hide the options we don't want to expose much in the advanced config.
The number of replicas is mandatory here since the cache may fail and we'll lose the whole pipeline.
Also, the cache component should expose health checks and report some metrics in the health dashboard.
builder := newInformerBuilder(r.Instance, &desired.Spec)

// Reconcile ServiceAccount
if err := r.reconcileServiceAccount(ctx, &builder); err != nil {
	return fmt.Errorf("failed to reconcile service account: %w", err)
}

// Reconcile RBAC
if err := r.reconcilePermissions(ctx); err != nil {
	return fmt.Errorf("failed to reconcile permissions: %w", err)
}

// Reconcile Deployment
if err := r.reconcileDeployment(ctx, &builder); err != nil {
	return fmt.Errorf("failed to reconcile deployment: %w", err)
}
I feel we should make this configurable. On a small cluster with few FLPs (maybe even one), it doesn't make sense to have cache separated. WDYT ?
Makes sense, I added:
# Centralized Kubernetes informers (enabled by default for reduced API server load)
# informers:
# enabled: true
# replicas: 2 # Minimum 2 for HA with leader election

Command: []string{"/app/flp-informers"},
Args: []string{
	fmt.Sprintf("--processor-selector=%s", processorSelector),
	"--processor-port=9090",
Should rely on k8scachePort const
| Name: "k8scache", | ||
| Port: 9090, | ||
| Protocol: corev1.ProtocolTCP, | ||
| TargetPort: intstr.FromInt(9090), |
Should rely on k8scachePort const
}

func (b *informerBuilder) deployment() (*appsv1.Deployment, error) {
	var replicas int32 = 1
I have concerns about this. It is not resilient to have a single replica.
It would be interesting to make it configurable and have a leader election mechanism.
I've added a leader election mechanism.
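For context, a rough sketch of what lease-based leader election can look like on the flp-informers side, using client-go's leaderelection helpers. Names such as startInformers, podName, namespace and the timing values are placeholders, not the actual implementation added in this PR:

// Only the elected leader runs the informers and pushes cache updates;
// standby replicas take over when the Lease is lost.
lock := &resourcelock.LeaseLock{
	LeaseMeta:  metav1.ObjectMeta{Name: "flp-informers", Namespace: namespace},
	Client:     clientset.CoordinationV1(),
	LockConfig: resourcelock.ResourceLockConfig{Identity: podName},
}
leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
	Lock:            lock,
	LeaseDuration:   15 * time.Second,
	RenewDeadline:   10 * time.Second,
	RetryPeriod:     2 * time.Second,
	ReleaseOnCancel: true,
	Callbacks: leaderelection.LeaderCallbacks{
		OnStartedLeading: func(ctx context.Context) { startInformers(ctx) },
		OnStoppedLeading: func() { klog.Info("lost leadership, stopping informers") },
	},
})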
Resources: corev1.ResourceRequirements{
	Requests: corev1.ResourceList{
		corev1.ResourceMemory: resource.MustParse("128Mi"),
		corev1.ResourceCPU:    resource.MustParse("50m"),
	},
	Limits: corev1.ResourceList{
		corev1.ResourceMemory: resource.MustParse("256Mi"),
		corev1.ResourceCPU:    resource.MustParse("200m"),
	},
},
This should be configurable in the advanced options
Now it is configurable at:
# Centralized Kubernetes informers (enabled by default for reduced API server load)
# informers:
# enabled: true
# replicas: 2 # Minimum 2 for HA with leader election
# resources:
# requests:
# memory: 128Mi
# cpu: 50m
# limits:
# memory: 256Mi
# cpu: 200m
# advanced:
# resyncInterval: 60 # Seconds to rediscover processors
# batchSize: 100 # Max entries per update batch
# sendTimeout: 10 # Seconds timeout for sending updates
# updateBufferSize: 100 # Update channel buffer size
I can move it to the advanced section if you want.
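For readers mapping this sample to the API, a sketch of what the corresponding Go types could look like (field names mirror the YAML above; the actual types added in this PR may differ):

// FlowCollectorInformers configures the centralized flp-informers deployment.
type FlowCollectorInformers struct {
	Enabled   *bool                        `json:"enabled,omitempty"`
	Replicas  *int32                       `json:"replicas,omitempty"`
	Resources *corev1.ResourceRequirements `json:"resources,omitempty"`
	Advanced  *AdvancedInformersConfig     `json:"advanced,omitempty"`
}

// AdvancedInformersConfig holds the less commonly tuned knobs.
type AdvancedInformersConfig struct {
	ResyncInterval   *int32 `json:"resyncInterval,omitempty"`
	BatchSize        *int32 `json:"batchSize,omitempty"`
	SendTimeout      *int32 `json:"sendTimeout,omitempty"`
	UpdateBufferSize *int32 `json:"updateBufferSize,omitempty"`
}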
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: (none). The full list of commands accepted by this bot can be found here. Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bundle/manifests/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5552-5559: The YAML's `replicas` schema currently sets `minimum:
1` but the comment requires `replicas >= 2` when `enabled` is true; update the
CRD validation to enforce that conditional rule (or relax the comment). Add an
x-kubernetes-validations rule for the relevant `informers`/`flp-informers`
section that asserts "if .enabled == true then .replicas >= 2" (using the CRD
validation expression language), or alternatively change `minimum: 1` to
`minimum: 2` and adjust the description; target the `replicas` and `enabled`
fields in the same schema block so the validation is applied correctly.
In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5121-5127: The schema for the "replicas" field currently sets
minimum: 1 while the description for the flp-informers deployment mandates a
minimum of 2; update either the validation or the description to remove the
conflict: either change the "replicas" property minimum from 1 to 2 to enforce
the described requirement, or modify the description to state that "a minimum of
2 replicas is recommended when `enabled` is `true`" (making it a recommendation
rather than a hard requirement); target the "replicas" field under the
flp-informers deployment definition so the change aligns schema validation with
the description.
In `@config/rbac/component_roles.yaml`:
- Around line 99-107: The informers ClusterRole is missing the "watch" verb for
the coordination.k8s.io/leases permission; update the informers ClusterRole
entry that lists apiGroups: coordination.k8s.io and resources: leases to include
"watch" alongside create, get, list, update so controller-runtime leader
election can observe lease changes (i.e., modify the informers ClusterRole's
leases verbs to add "watch").
In `@docs/FlowCollector.md`:
- Around line 10709-10715: Update the replicas description for the flp-informers
setting to remove the apparent contradiction: make the conditional explicit by
stating that when `enabled` is true a minimum of 2 replicas is
recommended/required for high availability, while the schema default/minimum
remains 1 otherwise; reference the `replicas` field and the `enabled` flag in
the text so readers understand the constraint applies only when `enabled=true`
and not to the base `<i>Minimum</i>: 1` schema value.
In `@internal/controller/flp/flp_informer_builder.go`:
- Around line 63-79: After reading config.Advanced.ProcessorPort into
processorPort, validate it against the informer’s reserved ports before
constructing the container/grpc port: ensure processorPort does not equal 8080
or 9091 (and any other ports used by the informer/process like k8scachePort),
and if it conflicts return an error (or reject the value) from the informer
builder path so the reconciler (flp_informer_reconciler.go) cannot create a pod
with conflicting ports; implement the check right after the assignment from
config.Advanced.ProcessorPort and before using processorPort to populate the
container port fields, referencing processorPort and
config.Advanced.ProcessorPort in the validation logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b8e2942-9cb9-4c51-9617-41a8afdfb570
⛔ Files ignored due to path filters (1)
api/flowcollector/v1beta2/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
📒 Files selected for processing (18)
- api/flowcollector/v1beta2/flowcollector_types.go
- api/flowcollector/v1beta2/flowcollector_validation_webhook.go
- bundle/manifests/flows.netobserv.io_flowcollectors.yaml
- bundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- bundle/manifests/netobserv-operator.clusterserviceversion.yaml
- config/crd/bases/flows.netobserv.io_flowcollectors.yaml
- config/rbac/component_roles.yaml
- config/rbac/role.yaml
- config/samples/flows_v1beta2_flowcollector.yaml
- docs/FlowCollector.md
- helm/crds/flows.netobserv.io_flowcollectors.yaml
- helm/templates/clusterrole.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yaml
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_informer_reconciler.go
- internal/controller/flp/flp_monolith_objects.go
- internal/pkg/manager/manager.go
✅ Files skipped from review due to trivial changes (6)
- config/rbac/role.yaml
- internal/pkg/manager/manager.go
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- config/samples/flows_v1beta2_flowcollector.yaml
- bundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/controller/flp/flp_informer_reconciler.go
replicas:
  default: 2
  description: |-
    `replicas` defines the number of replicas for the flp-informers deployment.
    For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
  format: int32
  minimum: 1
  type: integer
Enforce the HA replica rule in schema (or relax the wording).
Line 5556 says a minimum of 2 replicas is required when enabled, but the schema only enforces minimum: 1 (Line 5558) with no conditional rule. enabled: true + replicas: 1 would still validate.
Suggested CRD fix
informers:
@@
properties:
@@
replicas:
default: 2
description: |-
`replicas` defines the number of replicas for the flp-informers deployment.
For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
format: int32
minimum: 1
type: integer
@@
- type: object
+ type: object
+ x-kubernetes-validations:
+ - message: replicas must be at least 2 when informers are enabled
+     rule: self.enabled != true || !has(self.replicas) || self.replicas >= 2
#!/bin/bash
set -euo pipefail
# Verify whether webhook already enforces "enabled => replicas >= 2"
fd -i 'flowcollector_validation_webhook.go' -x sh -c '
echo "==> $1"
rg -n "informers|replicas|enabled|at least 2|>=\\s*2" "$1"
' sh {}
# Verify current CRD validation presence for informers replicas
fd -i 'flows.netobserv.io_flowcollectors.yaml' -x sh -c '
echo "==> $1"
rg -n "informers:|replicas:|x-kubernetes-validations|enabled:" "$1"
' sh {}
Also applies to: 5628-5628
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bundle/manifests/flows.netobserv.io_flowcollectors.yaml` around lines 5552 -
5559, The YAML's `replicas` schema currently sets `minimum: 1` but the comment
requires `replicas >= 2` when `enabled` is true; update the CRD validation to
enforce that conditional rule (or relax the comment). Add an
x-kubernetes-validations rule for the relevant `informers`/`flp-informers`
section that asserts "if .enabled == true then .replicas >= 2" (using the CRD
validation expression language), or alternatively change `minimum: 1` to
`minimum: 2` and adjust the description; target the `replicas` and `enabled`
fields in the same schema block so the validation is applied correctly.
default: 2
description: |-
  `replicas` defines the number of replicas for the flp-informers deployment.
  For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
format: int32
minimum: 1
type: integer
Align replicas validation with its description.
The description says a minimum of 2 replicas is required when enabled, but Line 5126 allows minimum: 1. Either enforce 2 in schema/webhook or reword this as a recommendation to avoid conflicting API behavior.
Suggested wording fix (if 1 replica is intentionally allowed)
- For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
+ For high availability, 2 or more replicas are recommended when `enabled` is `true`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml` around lines 5121 -
5127, The schema for the "replicas" field currently sets minimum: 1 while the
description for the flp-informers deployment mandates a minimum of 2; update
either the validation or the description to remove the conflict: either change
the "replicas" property minimum from 1 to 2 to enforce the described
requirement, or modify the description to state that "a minimum of 2 replicas is
recommended when `enabled` is `true`" (making it a recommendation rather than a
hard requirement); target the "replicas" field under the flp-informers
deployment definition so the change aligns schema validation with the
description.
processorPort := int32(k8scachePort)
if config.Advanced != nil {
	if config.Advanced.ResyncInterval != nil {
		resyncInterval = *config.Advanced.ResyncInterval
	}
	if config.Advanced.BatchSize != nil {
		batchSize = *config.Advanced.BatchSize
	}
	if config.Advanced.SendTimeout != nil {
		sendTimeout = *config.Advanced.SendTimeout
	}
	if config.Advanced.UpdateBufferSize != nil {
		updateBufferSize = *config.Advanced.UpdateBufferSize
	}
	if config.Advanced.ProcessorPort != nil {
		processorPort = *config.Advanced.ProcessorPort
	}
Add informer-side validation for processorPort conflicts.
processorPort is user-overridable (Line 77-79) and then reused as container grpc port (Line 105), while Line 110 and Line 115 already reserve 8080/9091. The informer reconciliation path (internal/controller/flp/flp_informer_reconciler.go, Line 53-99) does not call shared port-conflict validation, so conflicting values can slip through.
Proposed patch
@@
if config.Advanced != nil {
@@
if config.Advanced.ProcessorPort != nil {
processorPort = *config.Advanced.ProcessorPort
}
}
+
+ // Reserved informer container ports
+ const (
+ healthPort int32 = 8080
+ metricsPort int32 = 9091
+ )
+ if processorPort == healthPort || processorPort == metricsPort {
+ return nil, fmt.Errorf("invalid processor port %d: conflicts with reserved informer ports (%d health, %d metrics)", processorPort, healthPort, metricsPort)
+ }
@@
{
Name: "health",
- ContainerPort: 8080,
+ ContainerPort: healthPort,
Protocol: corev1.ProtocolTCP,
},
{
Name: "metrics",
- ContainerPort: 9091,
+ ContainerPort: metricsPort,
Protocol: corev1.ProtocolTCP,
},
@@
Path: "/healthz",
- Port: intstr.FromInt(8080),
+ Port: intstr.FromInt(int(healthPort)),
@@
Path: "/ready",
- Port: intstr.FromInt(8080),
+ Port: intstr.FromInt(int(healthPort)),
Also applies to: 101-118
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/controller/flp/flp_informer_builder.go` around lines 63 - 79, After
reading config.Advanced.ProcessorPort into processorPort, validate it against
the informer’s reserved ports before constructing the container/grpc port:
ensure processorPort does not equal 8080 or 9091 (and any other ports used by
the informer/process like k8scachePort), and if it conflicts return an error (or
reject the value) from the informer builder path so the reconciler
(flp_informer_reconciler.go) cannot create a pod with conflicting ports;
implement the check right after the assignment from
config.Advanced.ProcessorPort and before using processorPort to populate the
container port fields, referencing processorPort and
config.Advanced.ProcessorPort in the validation logic.
jpinsonneau
left a comment
Thanks for addressing the feedback. That looks good to me now.
Let's just try to fix the tests
Metric netobserv_ingest_flows_processed not found after 180 secs!!!
/test e2e-operator
326ff55 to 9f0160e
New changes are detected. LGTM label has been removed.
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/controller/flp/flp_common_objects.go (1)
160-207: ⚠️ Potential issue | 🟡 Minor — k8scache port & args are added unconditionally; gate on informers.enabled.
The k8scache container port (Line 160-164) and the --k8scache.* args (Line 203-207) are appended regardless of whether spec.processor.informers.enabled is true. When the feature is off, you still bind 9090 and pass flags the user didn't opt into. This also means FLP must always understand the --k8scache.* flags, tightening the version coupling with flowlogs-pipeline#1221.
🔧 Suggested gating
-	ports = append(ports, corev1.ContainerPort{
-		Name:          "k8scache",
-		ContainerPort: k8scachePort,
-		Protocol:      corev1.ProtocolTCP,
-	})
+	informersEnabled := desired.Processor.Informers != nil &&
+		(desired.Processor.Informers.Enabled == nil || *desired.Processor.Informers.Enabled)
+	if informersEnabled {
+		ports = append(ports, corev1.ContainerPort{
+			Name:          "k8scache",
+			ContainerPort: k8scachePort,
+			Protocol:      corev1.ProtocolTCP,
+		})
+	}
@@
-	Args: []string{
-		fmt.Sprintf(`--config=%s/%s`, configPath, configFile),
-		fmt.Sprintf("--k8scache.port=%d", k8scachePort),
-		"--k8scache.address=0.0.0.0",
-	},
+	Args: func() []string {
+		a := []string{fmt.Sprintf(`--config=%s/%s`, configPath, configFile)}
+		if informersEnabled {
+			a = append(a, fmt.Sprintf("--k8scache.port=%d", k8scachePort), "--k8scache.address=0.0.0.0")
+		}
+		return a
+	}(),
Likewise, validatePortConflicts only needs to fire when informers are enabled.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` around lines 160 - 207, The k8scache ContainerPort and the "--k8scache.*" args are being added unconditionally; change the logic to only append the k8scache port (the corev1.ContainerPort with Name "k8scache") and the "--k8scache.port/--k8scache.address" entries in the container.Args for the constants.FLPName container when the informers feature is enabled (i.e., gate on the informers.enabled flag in advancedConfig or the equivalent field). Also ensure any call to validatePortConflicts is only invoked when informers are enabled so we don’t validate 9090 unless the informers are on. Locate these changes around where ports are appended (k8scache port block), where container.Args are built (k8scache args), and where validatePortConflicts is called and wrap them in the same informers-enabled conditional.
♻️ Duplicate comments (1)
config/crd/bases/flows.netobserv.io_flowcollectors.yaml (1)
5150-5157: ⚠️ Potential issue | 🟡 Minor
replicas validation contradicts the documented requirement.
minimum: 1 conflicts with the field description saying 2 is required when enabled=true. Please either enforce the conditional constraint or reword it as a recommendation.
Suggested schema-side enforcement
 informers:
   description: |-
     `informers` configuration for centralized Kubernetes informers that push cache updates to flowlogs-pipeline processors.
     This reduces load on the Kubernetes API server by having a single component (flp-informers) query the API instead of N FLP processors.
     When enabled, a dedicated `flp-informers` deployment is created that watches Kubernetes resources and pushes updates via gRPC.
   properties:
@@
     replicas:
       default: 2
       description: |-
         `replicas` defines the number of replicas for the flp-informers deployment.
         For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
       format: int32
       minimum: 1
       type: integer
@@
+  x-kubernetes-validations:
+    - message: "informers.replicas must be >= 2 when informers.enabled is true."
+      rule: "!has(self.enabled) || !self.enabled || !has(self.replicas) || self.replicas >= 2"
   type: object
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml` around lines 5150 - 5157, The replicas schema for the flp-informers deployment currently declares minimum: 1 but the description mandates "a minimum of 2 replicas when `enabled` is `true` — fix this by either updating the schema to enforce the constraint or rewording the description: either change the `replicas` validation so minimum: 2 and/or add an if/then conditional that checks the `enabled` boolean and requires replicas >= 2 when `enabled: true`, or keep minimum: 1 but rephrase the description to a recommendation (e.g., "recommended minimum of 2 when `enabled` is true"); locate the `replicas` field and the related `enabled` flag in the flowcollectors CRD to apply the chosen fix.
🧹 Nitpick comments (8)
internal/controller/flp/flp_common_objects.go (3)
98-124: Extract the port-name literal and simplify the repetition. Four near-identical branches with the same error template. A small table collapses this nicely and makes future additions (e.g., another reserved port) one-line changes:
♻️ Suggested refactor
-func validatePortConflicts(desired *flowslatest.FlowCollectorSpec) error {
-	advancedConfig := helper.GetAdvancedProcessorConfig(desired)
-
-	// Check FLP port
-	if advancedConfig.Port != nil && *advancedConfig.Port == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline port %d conflicts with reserved k8scache port %d", *advancedConfig.Port, k8scachePort)
-	}
-
-	// Check health port
-	if advancedConfig.HealthPort != nil && *advancedConfig.HealthPort == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline health port %d conflicts with reserved k8scache port %d", *advancedConfig.HealthPort, k8scachePort)
-	}
-
-	// Check metrics port
-	metricsPort := desired.Processor.GetMetricsPort()
-	if metricsPort == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline metrics port %d conflicts with reserved k8scache port %d", metricsPort, k8scachePort)
-	}
-
-	// Check profile port (optional)
-	if advancedConfig.ProfilePort != nil && *advancedConfig.ProfilePort == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline profile port %d conflicts with reserved k8scache port %d", *advancedConfig.ProfilePort, k8scachePort)
-	}
-
-	return nil
-}
+func validatePortConflicts(desired *flowslatest.FlowCollectorSpec) error {
+	adv := helper.GetAdvancedProcessorConfig(desired)
+	checks := []struct {
+		label string
+		port  *int32
+	}{
+		{"port", adv.Port},
+		{"health port", adv.HealthPort},
+		{"metrics port", ptr.To(desired.Processor.GetMetricsPort())},
+		{"profile port", adv.ProfilePort},
+	}
+	for _, c := range checks {
+		if c.port != nil && *c.port == k8scachePort {
+			return fmt.Errorf("flowlogs-pipeline %s %d conflicts with reserved k8scache port %d", c.label, *c.port, k8scachePort)
+		}
+	}
+	return nil
+}
specvalidation and arguably belongs next tovalidateFLP()in the webhook, not inside the builder. Builders today silently swallow the error in tests (b, _ := newMonolithBuilder(...)), so a misconfig would surface as a broken controller reconcile rather than a clean webhook rejection.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` around lines 98 - 124, The validatePortConflicts function repeats the same error template for multiple ports; refactor it by extracting the error message template and building a small map/slice of (portName, portValue) entries using advancedConfig.Port, advancedConfig.HealthPort, desired.Processor.GetMetricsPort(), and advancedConfig.ProfilePort, then iterate and return fmt.Errorf(fmtTemplate, val, k8scachePort) when a conflict is found; update references to the unique symbols validatePortConflicts, advancedConfig, desired.Processor.GetMetricsPort, and k8scachePort to implement the loop-based check and remove duplicated branches — also consider moving this spec validation into the webhook near validateFLP so misconfigs are rejected early instead of being swallowed by builders.
161-161: Promote "k8scache" to a named const.
healthPortName, prometheusPortName, profilePortName all have constants above — this one sticks out as a bare string literal. Add k8scachePortName = "k8scache" alongside k8scachePort for consistency.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` at line 161, Promote the literal "k8scache" to a named constant by adding k8scachePortName = "k8scache" alongside the existing k8scachePort constant, and replace the literal Name: "k8scache" with Name: k8scachePortName (or use the constant wherever that port name is referenced). Update references to the string in the same file so usage is consistent with healthPortName/prometheusPortName/profilePortName and the existing k8scachePort symbol.
30-30: Consider exposing k8scachePort as configurable (e.g., under Advanced). Hardcoding a well-known port like 9090 invites collisions with other sidecars or injected tooling (kube-rbac-proxy, Prometheus node-exporter style setups, etc.). Promoting this to an advanced-config field keeps the reserved default but gives operators an escape hatch — which also neatly avoids the whole validatePortConflicts dance.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` at line 30, Replace the hardcoded k8scachePort = 9090 with a configurable Advanced option: add a field (e.g., Advanced.K8sCachePort int) to the existing configuration struct and supply a default of 9090 when loading/constructing the config, then replace uses of the package-level k8scachePort symbol with reads from that config (falling back to 9090 if unset); also update any config parsing/CLI flags and documentation to expose the new Advanced option and keep existing validation (or simplify validatePortConflicts if desired).
internal/controller/flp/flp_test.go (1)
814-850: Good coverage — consider one small extension.
validatePortConflicts is invoked from both newMonolithBuilder and newTransfoBuilder, but only the monolith path is exercised here. A single mirror case via newTransfoBuilder (or parameterizing the constructor) would guard against someone accidentally dropping the call from one constructor.
validatePortConflictsis invoked from bothnewMonolithBuilderandnewTransfoBuilder, but only the monolith path is exercised here. A single mirror case vianewTransfoBuilder(or parameterizing the constructor) would guard against someone accidentally dropping the call from one constructor.Also an ideal candidate for table-driven tests — the five cases repeat the same three-line pattern:
♻️ Table-driven sketch
cases := []struct {
	name   string
	mutate func(*flowslatest.FlowCollectorSpec)
	errSub string
}{
	{"flp port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Advanced.Port = ptr.To(int32(9090)) }, "flowlogs-pipeline port 9090"},
	{"health port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Advanced.HealthPort = ptr.To(int32(9090)) }, "health port 9090"},
	{"metrics port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Metrics.Server.Port = ptr.To(int32(9090)) }, "metrics port 9090"},
	{"profile port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Advanced.ProfilePort = ptr.To(int32(9090)) }, "profile port 9090"},
	{"ok", func(*flowslatest.FlowCollectorSpec) {}, ""},
}
for _, tc := range cases {
	t.Run(tc.name, func(t *testing.T) { /* ... */ })
}
Based on learnings, testify is an approved framework in this repo, so no need to port to Ginkgo/Gomega just for this.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_test.go` around lines 814 - 850, TestPortConflictValidation only exercises validatePortConflicts via newMonolithBuilder; add a mirror set of assertions that call newTransfoBuilder to ensure validatePortConflicts is enforced there too (or refactor TestPortConflictValidation into a table-driven loop that runs each test case against both newMonolithBuilder and newTransfoBuilder). Keep the same mutation functions and error substring checks (use the existing cases for Processor.Advanced.Port, HealthPort, Metrics.Server.Port, ProfilePort and the no-conflict case) and assert the same error messages when invoking newTransfoBuilder (or both builders) so the coverage is identical for validatePortConflicts, referencing the existing TestPortConflictValidation, newMonolithBuilder, newTransfoBuilder and validatePortConflicts symbols.bundle/manifests/netobserv-operator.clusterserviceversion.yaml (1)
469-486: Missing UI descriptors for consistency. Other toggles/advanced sections in this CSV use booleanSwitch for enabled flags and hidden/advanced markers for advanced subsections (e.g., consolePlugin.enable, processor.advanced). Consider aligning processor.informers.enabled with booleanSwitch and marking processor.informers.advanced fields with advanced (and a fieldDependency: processor.informers.enabled:true) so the OpenShift form matches existing patterns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bundle/manifests/netobserv-operator.clusterserviceversion.yaml` around lines 469 - 486, The CSV UI descriptors should mirror existing patterns: change the descriptor for processor.informers.enabled to use booleanSwitch (like consolePlugin.enable) and mark the processor.informers.advanced section and its child fields (processor.informers.advanced.batchSize, processor.informers.advanced.processorPort, processor.informers.advanced.resyncInterval, processor.informers.advanced.sendTimeout, processor.informers.advanced.updateBufferSize) as advanced/hidden; add a fieldDependency on processor.informers.enabled:true so the advanced fields only appear when the Informers toggle is enabled and the OpenShift form behavior matches other components.internal/controller/flp/flp_controller_test.go (1)
168-174: Consider asserting informer Deployment/SA lifecycle across Kafka transitions. The Kafka context verifies the shared informer CRB but doesn't assert that the flp-informers Deployment/ServiceAccount persist across the mono→Kafka→mono transitions (Contexts "With Kafka" and "Back without Kafka"). Since this is now a shared, independent deployment, a quick Eventually ... Get(flpKeyInformer, &Deployment{}).Should(Succeed()) in both contexts would guard against accidental teardown on model switch.
Also applies to: 330-337
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_controller_test.go` around lines 168 - 174, Add assertions in the "With Kafka" and "Back without Kafka" test contexts to verify the flp-informers Deployment and ServiceAccount persist across the mono→Kafka→mono transitions: use Eventually with k8sClient.Get and the existing flpKeyInformer key to ensure Get(ctx, flpKeyInformer, &appsv1.Deployment{}) and Get(ctx, flpKeyInformer, &corev1.ServiceAccount{}) both succeed (similar to the existing ClusterRoleBinding check using rb3), so the shared informer Deployment/SA is not accidentally torn down when switching models.internal/controller/flp/flp_informer_reconciler.go (2)
66-70: Nit: collapse the nil-check with ptr.Deref. Minor readability tweak — no behavior change.
Proposed diff
-	// Check if informers are enabled (default: true)
-	enabled := true
-	if desired.Spec.Processor.Informers != nil && desired.Spec.Processor.Informers.Enabled != nil {
-		enabled = *desired.Spec.Processor.Informers.Enabled
-	}
-
-	if !enabled {
+	// Check if informers are enabled (default: true)
+	if desired.Spec.Processor.Informers != nil && !ptr.Deref(desired.Spec.Processor.Informers.Enabled, true) {
Requires the k8s.io/utils/ptr import. Skip if the project prefers the explicit form.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_reconciler.go` around lines 66 - 70, Replace the manual nil-check that initializes the local variable enabled from desired.Spec.Processor.Informers.Enabled with a single ptr.Deref call: set enabled := ptr.Deref(desired.Spec.Processor.Informers.Enabled, true) (or the equivalent expression) so the default true is used when the pointer is nil; add the import "k8s.io/utils/ptr". Update the block that references desired.Spec.Processor.Informers.Enabled and remove the explicit if nil { ... } branch.
101-108: SA drift won't be corrected — confirm this is acceptable.
reconcileServiceAccount only creates when missing. If labels/annotations/imagePullSecrets on the SA ever diverge from the builder (e.g., user edit, future builder changes), it won't self-heal until the SA is deleted. The inline comment acknowledges it, so flagging as FYI in case the informers SA gains secrets/annotations later.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_reconciler.go` around lines 101 - 108, reconcileServiceAccount in informerReconciler currently only creates the ServiceAccount via builder.serviceAccount() when r.Managed.Exists(r.serviceAccount) is false, so label/annotation/imagePullSecrets drift won't be corrected; update reconcileServiceAccount to fetch the existing SA, compare labels, annotations and imagePullSecrets with builder.serviceAccount() and perform a patch/update (using r.Update or a strategic merge/patch method) when differences exist while preserving managed fields, or explicitly document/guard that this resource is intentionally immutable; reference reconcileServiceAccount, informerReconciler, builder.serviceAccount(), r.Managed.Exists and r.CreateOwned when making the change.
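If self-healing is wanted later, a sketch of a drift-correcting variant; helper names such as UpdateIfOwned and the DeepDerivative comparison are illustrative and may not match the exact helpers available in this codebase:

// Create the SA when missing, otherwise update it if its metadata drifted
// from what the builder produces.
desiredSA := builder.serviceAccount()
if !r.Managed.Exists(r.serviceAccount) {
	return r.CreateOwned(ctx, desiredSA)
}
if !equality.Semantic.DeepDerivative(desiredSA.Labels, r.serviceAccount.Labels) ||
	!equality.Semantic.DeepDerivative(desiredSA.Annotations, r.serviceAccount.Annotations) {
	return r.UpdateIfOwned(ctx, r.serviceAccount, desiredSA)
}
return nil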
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/flowcollector/v1beta2/flowcollector_validation_webhook.go`:
- Around line 460-481: In validateInformers(), treat a nil Enabled pointer as
true (respect the kubebuilder default) by changing the enabled calculation to
consider nil as enabled instead of disabled (refer to
v.fc.Processor.Informers.Enabled and the validateInformers function); keep the
replicas >=2 enforcement when enabled but update the error message or coordinate
with the CRD (Replicas' +kubebuilder:validation:Minimum) so documentation/CRD
marker and the webhook behavior are aligned (either relax webhook to match CRD
or change CRD/marker to indicate minimum 2).
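A minimal sketch of that webhook adjustment, assuming the validator collects warnings in a slice; receiver and field names follow the prompt above but are otherwise illustrative:

func (v *validator) validateInformers() {
	informers := v.fc.Processor.Informers
	if informers == nil {
		return
	}
	// A nil Enabled pointer means "use the kubebuilder default", which is true.
	enabled := informers.Enabled == nil || *informers.Enabled
	if enabled && informers.Replicas != nil && *informers.Replicas < 2 {
		v.warnings = append(v.warnings, "informers.replicas should be at least 2 when informers are enabled, for high availability")
	}
}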
In `@helm/crds/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5154-5161: The description for the replicas field claims "a
minimum of 2 replicas is required" but the schema still allows minimum: 1;
update the CRD by either changing the schema to enforce minimum: 2 (replace
"minimum: 1" with "minimum: 2" for the replicas property) or, if you don't want
to enforce it, reword the description to say "recommended for HA" instead of
"required"; locate the replicas block (contains "replicas:", "default: 2", and
the description) and apply the chosen change so description and validation are
consistent.
- Around line 5110-5230: The CRD and status population are missing the informers
component; add an Informers *FlowCollectorComponentStatus field to the
FlowCollectorComponentsStatus struct and update populateComponentStatuses() to
handle status.FLPInformers (mirror the existing cases for
Agent/Processor/Plugin) so that when the controller calls
status.ForComponent(status.FLPInformers) and uses SetReady()/SetUnused() the
value is preserved and emitted in status.components.informers; ensure the CRD
description list is updated to include informers as well.
---
Outside diff comments:
In `@internal/controller/flp/flp_common_objects.go`:
- Around line 160-207: The k8scache ContainerPort and the "--k8scache.*" args
are being added unconditionally; change the logic to only append the k8scache
port (the corev1.ContainerPort with Name "k8scache") and the
"--k8scache.port/--k8scache.address" entries in the container.Args for the
constants.FLPName container when the informers feature is enabled (i.e., gate on
the informers.enabled flag in advancedConfig or the equivalent field). Also
ensure any call to validatePortConflicts is only invoked when informers are
enabled so we don’t validate 9090 unless the informers are on. Locate these
changes around where ports are appended (k8scache port block), where
container.Args are built (k8scache args), and where validatePortConflicts is
called and wrap them in the same informers-enabled conditional.
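A rough sketch of that gating; `informersEnabled` is a placeholder for however the flag is resolved, and the `--k8scache.address` value is an assumption rather than something taken from the PR:

```go
// Sketch: only expose the k8s cache gRPC endpoint on the FLP container when
// the informers feature is enabled.
if informersEnabled {
	// Only validate port 9090 when it is actually going to be used.
	if err := validatePortConflicts(desired, advancedConfig); err != nil {
		return err
	}
	ports = append(ports, corev1.ContainerPort{
		Name:          "k8scache",
		ContainerPort: k8scachePort,
		Protocol:      corev1.ProtocolTCP,
	})
	container.Args = append(container.Args,
		fmt.Sprintf("--k8scache.port=%d", k8scachePort),
		"--k8scache.address=0.0.0.0", // address value assumed
	)
}
```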
---
Duplicate comments:
In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5150-5157: The replicas schema for the flp-informers deployment
currently declares minimum: 1 but the description mandates "a minimum of 2
replicas when `enabled` is `true` — fix this by either updating the schema to
enforce the constraint or rewording the description: either change the
`replicas` validation so minimum: 2 and/or add an if/then conditional that
checks the `enabled` boolean and requires replicas >= 2 when `enabled: true`, or
keep minimum: 1 but rephrase the description to a recommendation (e.g.,
"recommended minimum of 2 when `enabled` is true"); locate the `replicas` field
and the related `enabled` flag in the flowcollectors CRD to apply the chosen
fix.
---
Nitpick comments:
In `@bundle/manifests/netobserv-operator.clusterserviceversion.yaml`:
- Around line 469-486: The CSV UI descriptors should mirror existing patterns:
change the descriptor for processor.informers.enabled to use booleanSwitch (like
consolePlugin.enable) and mark the processor.informers.advanced section and its
child fields (processor.informers.advanced.batchSize,
processor.informers.advanced.processorPort,
processor.informers.advanced.resyncInterval,
processor.informers.advanced.sendTimeout,
processor.informers.advanced.updateBufferSize) as advanced/hidden; add a
fieldDependency on processor.informers.enabled:true so the advanced fields only
appear when the Informers toggle is enabled and the OpenShift form behavior
matches other components.
In `@internal/controller/flp/flp_common_objects.go`:
- Around line 98-124: The validatePortConflicts function repeats the same error
template for multiple ports; refactor it by extracting the error message
template and building a small map/slice of (portName, portValue) entries using
advancedConfig.Port, advancedConfig.HealthPort,
desired.Processor.GetMetricsPort(), and advancedConfig.ProfilePort, then iterate
and return fmt.Errorf(fmtTemplate, val, k8scachePort) when a conflict is found;
update references to the unique symbols validatePortConflicts, advancedConfig,
desired.Processor.GetMetricsPort, and k8scachePort to implement the loop-based
check and remove duplicated branches — also consider moving this spec validation
into the webhook near validateFLP so misconfigs are rejected early instead of
being swallowed by builders.
- Line 161: Promote the literal "k8scache" to a named constant by adding
k8scachePortName = "k8scache" alongside the existing k8scachePort constant, and
replace the literal Name: "k8scache" with Name: k8scachePortName (or use the
constant wherever that port name is referenced). Update references to the string
in the same file so usage is consistent with
healthPortName/prometheusPortName/profilePortName and the existing k8scachePort
symbol.
- Line 30: Replace the hardcoded k8scachePort = 9090 with a configurable
Advanced option: add a field (e.g., Advanced.K8sCachePort int) to the existing
configuration struct and supply a default of 9090 when loading/constructing the
config, then replace uses of the package-level k8scachePort symbol with reads
from that config (falling back to 9090 if unset); also update any config
parsing/CLI flags and documentation to expose the new Advanced option and keep
existing validation (or simplify validatePortConflicts if desired).
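To make the loop-based `validatePortConflicts` refactor suggested above for `flp_common_objects.go` concrete, a sketch could look like the following; the parameter types are assumptions and nil handling on the advanced-config pointers is elided:

```go
// Sketch: table-driven port conflict check against the k8s cache gRPC port.
func validatePortConflicts(desired *flowslatest.FlowCollectorSpec, advancedConfig *flowslatest.AdvancedProcessorConfig) error {
	checks := []struct {
		name string
		port int32
	}{
		{"spec.processor.advanced.port", *advancedConfig.Port},
		{"spec.processor.advanced.healthPort", *advancedConfig.HealthPort},
		{"spec.processor.metrics.server.port", desired.Processor.GetMetricsPort()},
		{"spec.processor.advanced.profilePort", *advancedConfig.ProfilePort},
	}
	for _, c := range checks {
		if c.port == k8scachePort {
			return fmt.Errorf("%s is set to %d, which conflicts with the k8s cache gRPC port %d", c.name, c.port, k8scachePort)
		}
	}
	return nil
}
```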
In `@internal/controller/flp/flp_controller_test.go`:
- Around line 168-174: Add assertions in the "With Kafka" and "Back without
Kafka" test contexts to verify the flp-informers Deployment and ServiceAccount
persist across the mono→Kafka→mono transitions: use Eventually with
k8sClient.Get and the existing flpKeyInformer key to ensure Get(ctx,
flpKeyInformer, &appsv1.Deployment{}) and Get(ctx, flpKeyInformer,
&corev1.ServiceAccount{}) both succeed (similar to the existing
ClusterRoleBinding check using rb3), so the shared informer Deployment/SA is not
accidentally torn down when switching models.
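In the Ginkgo/Gomega style the controller suite already uses, the extra checks could look roughly like this; the `timeout`/`interval` constants and the exact spec wording are assumed:

```go
// Sketch: assert that the shared flp-informers Deployment and ServiceAccount
// survive the monolith -> Kafka -> monolith transitions.
It("should keep the flp-informers Deployment and ServiceAccount", func() {
	Eventually(func() error {
		return k8sClient.Get(ctx, flpKeyInformer, &appsv1.Deployment{})
	}, timeout, interval).Should(Succeed())
	Eventually(func() error {
		return k8sClient.Get(ctx, flpKeyInformer, &corev1.ServiceAccount{})
	}, timeout, interval).Should(Succeed())
})
```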
In `@internal/controller/flp/flp_informer_reconciler.go`:
- Around line 66-70: Replace the manual nil-check that initializes the local
variable enabled from desired.Spec.Processor.Informers.Enabled with a single
ptr.Deref call: set enabled :=
ptr.Deref(desired.Spec.Processor.Informers.Enabled, true) (or the equivalent
expression) so the default true is used when the pointer is nil; add the import
"k8s.io/utils/ptr". Update the block that references
desired.Spec.Processor.Informers.Enabled and remove the explicit if nil { ... }
branch.
- Around line 101-108: reconcileServiceAccount in informerReconciler currently
only creates the ServiceAccount via builder.serviceAccount() when
r.Managed.Exists(r.serviceAccount) is false, so
label/annotation/imagePullSecrets drift won't be corrected; update
reconcileServiceAccount to fetch the existing SA, compare labels, annotations
and imagePullSecrets with builder.serviceAccount() and perform a patch/update
(using r.Update or a strategic merge/patch method) when differences exist while
preserving managed fields, or explicitly document/guard that this resource is
intentionally immutable; reference reconcileServiceAccount, informerReconciler,
builder.serviceAccount(), r.Managed.Exists and r.CreateOwned when making the
change.
In `@internal/controller/flp/flp_test.go`:
- Around line 814-850: TestPortConflictValidation only exercises
validatePortConflicts via newMonolithBuilder; add a mirror set of assertions
that call newTransfoBuilder to ensure validatePortConflicts is enforced there
too (or refactor TestPortConflictValidation into a table-driven loop that runs
each test case against both newMonolithBuilder and newTransfoBuilder). Keep the
same mutation functions and error substring checks (use the existing cases for
Processor.Advanced.Port, HealthPort, Metrics.Server.Port, ProfilePort and the
no-conflict case) and assert the same error messages when invoking
newTransfoBuilder (or both builders) so the coverage is identical for
validatePortConflicts, referencing the existing TestPortConflictValidation,
newMonolithBuilder, newTransfoBuilder and validatePortConflicts symbols.
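One possible shape for the table-driven variant; `buildMonolith`, `buildTransfo`, and `getConfig` are placeholders for whatever wrappers the existing test file uses around `newMonolithBuilder`/`newTransfoBuilder` and its default-spec helper:

```go
// Sketch: run every port-conflict case against both builders.
func TestPortConflictValidation(t *testing.T) {
	builders := map[string]func(spec *flowslatest.FlowCollectorSpec) error{
		"monolith":    buildMonolith, // placeholder wrapper around newMonolithBuilder
		"transformer": buildTransfo,  // placeholder wrapper around newTransfoBuilder
	}
	cases := []struct {
		name    string
		mutate  func(spec *flowslatest.FlowCollectorSpec)
		wantErr string
	}{
		{"processor port conflict", func(s *flowslatest.FlowCollectorSpec) { s.Processor.Advanced.Port = ptr.To(int32(9090)) }, "conflict"},
		{"no conflict", func(s *flowslatest.FlowCollectorSpec) {}, ""},
		// ... health, metrics and profile port cases as in the existing test
	}
	for _, c := range cases {
		for builderName, build := range builders {
			t.Run(c.name+"/"+builderName, func(t *testing.T) {
				spec := getConfig() // placeholder for the suite's default-spec helper
				c.mutate(&spec)
				err := build(&spec)
				if c.wantErr == "" {
					assert.NoError(t, err)
				} else {
					assert.ErrorContains(t, err, c.wantErr)
				}
			})
		}
	}
}
```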
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 46903d42-c9f4-4324-9404-64e76dab93f5
⛔ Files ignored due to path filters (1)
api/flowcollector/v1beta2/zz_generated.deepcopy.gois excluded by!**/zz_generated.deepcopy.go
📒 Files selected for processing (26)
api/flowcollector/v1beta2/flowcollector_types.goapi/flowcollector/v1beta2/flowcollector_validation_webhook.gobundle/manifests/flows.netobserv.io_flowcollectors.yamlbundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yamlbundle/manifests/netobserv-operator.clusterserviceversion.yamlconfig/crd/bases/flows.netobserv.io_flowcollectors.yamlconfig/rbac/component_roles.yamlconfig/rbac/role.yamlconfig/samples/flows_v1beta2_flowcollector.yamldocs/FlowCollector.mdhelm/crds/flows.netobserv.io_flowcollectors.yamlhelm/templates/clusterrole.yamlhelm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yamlhelm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yamlinternal/controller/flp/flp_common_objects.gointernal/controller/flp/flp_controller.gointernal/controller/flp/flp_controller_test.gointernal/controller/flp/flp_informer_builder.gointernal/controller/flp/flp_informer_reconciler.gointernal/controller/flp/flp_monolith_objects.gointernal/controller/flp/flp_monolith_reconciler.gointernal/controller/flp/flp_test.gointernal/controller/flp/flp_transfo_objects.gointernal/controller/flp/flp_transfo_reconciler.gointernal/pkg/manager/manager.gointernal/pkg/manager/status/status_manager.go
💤 Files with no reviewable changes (2)
- internal/controller/flp/flp_monolith_reconciler.go
- internal/controller/flp/flp_transfo_reconciler.go
✅ Files skipped from review due to trivial changes (11)
- internal/pkg/manager/manager.go
- internal/controller/flp/flp_transfo_objects.go
- bundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- internal/pkg/manager/status/status_manager.go
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- config/samples/flows_v1beta2_flowcollector.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yaml
- helm/templates/clusterrole.yaml
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_controller.go
- docs/FlowCollector.md
🚧 Files skipped from review as they are similar to previous changes (5)
- config/rbac/role.yaml
- internal/controller/flp/flp_monolith_objects.go
- config/rbac/component_roles.yaml
- api/flowcollector/v1beta2/flowcollector_types.go
- bundle/manifests/flows.netobserv.io_flowcollectors.yaml
```go
func (v *validator) validateInformers() {
	if v.fc.Processor.Informers == nil {
		return
	}

	// Check if enabled
	enabled := v.fc.Processor.Informers.Enabled != nil && *v.fc.Processor.Informers.Enabled

	if enabled {
		// When enabled, replicas must be at least 2 for high availability
		replicas := int32(2) // default
		if v.fc.Processor.Informers.Replicas != nil {
			replicas = *v.fc.Processor.Informers.Replicas
		}
		if replicas < 2 {
			v.errors = append(
				v.errors,
				fmt.Errorf("spec.processor.informers.replicas must be at least 2 when informers are enabled (got %d). Centralized informers require high availability to avoid losing the entire flow collection pipeline in case of failure", replicas),
			)
		}
	}
}
```
Default-enabled nil check may skip validation.
Enabled has kubebuilder default true, but this code treats nil as disabled. If the webhook runs before CRD defaulting (or on an object that bypassed defaulting), a user with informers: {} sneaks past the replica check while effectively running enabled. Consider treating nil as true to match the declared default:
🔧 Suggested tweak
```diff
-	// Check if enabled
-	enabled := v.fc.Processor.Informers.Enabled != nil && *v.fc.Processor.Informers.Enabled
+	// Check if enabled (defaults to true per CRD)
+	enabled := v.fc.Processor.Informers.Enabled == nil || *v.fc.Processor.Informers.Enabled
```

Also note: the CRD declares `+kubebuilder:validation:Minimum=1` on Replicas, but this webhook enforces >=2 when enabled. Worth aligning the docstring on Replicas or the CRD marker so users aren't surprised by a "1 is valid here, but not really" situation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/flowcollector/v1beta2/flowcollector_validation_webhook.go` around lines
460 - 481, In validateInformers(), treat a nil Enabled pointer as true (respect
the kubebuilder default) by changing the enabled calculation to consider nil as
enabled instead of disabled (refer to v.fc.Processor.Informers.Enabled and the
validateInformers function); keep the replicas >=2 enforcement when enabled but
update the error message or coordinate with the CRD (Replicas'
+kubebuilder:validation:Minimum) so documentation/CRD marker and the webhook
behavior are aligned (either relax webhook to match CRD or change CRD/marker to
indicate minimum 2).
```yaml
informers:
  description: |-
    `informers` configuration for centralized Kubernetes informers that push cache updates to flowlogs-pipeline processors.
    This reduces load on the Kubernetes API server by having a single component (flp-informers) query the API instead of N FLP processors.
    When enabled, a dedicated `flp-informers` deployment is created that watches Kubernetes resources and pushes updates via gRPC.
  properties:
    advanced:
      description: '`advanced` allows setting some technical parameters of the informers component.'
      properties:
        batchSize:
          default: 100
          description: '`batchSize` defines the maximum number of cache entries to send in a single update batch.'
          minimum: 1
          type: integer
        processorPort:
          default: 9090
          description: '`processorPort` defines the gRPC port where flowlogs-pipeline processors listen for k8s cache updates.'
          format: int32
          maximum: 65535
          minimum: 1
          type: integer
        resyncInterval:
          default: 60
          description: '`resyncInterval` defines the interval in seconds to rediscover processors and sync state.'
          minimum: 1
          type: integer
        sendTimeout:
          default: 10
          description: '`sendTimeout` defines the timeout in seconds for sending updates to processors.'
          minimum: 1
          type: integer
        updateBufferSize:
          default: 100
          description: '`updateBufferSize` defines the size of the internal update channel buffer.'
          minimum: 1
          type: integer
      type: object
    enabled:
      default: true
      description: |-
        `enabled` controls whether to deploy centralized Kubernetes informers.
        When `true`, a dedicated `flp-informers` deployment watches K8s resources and pushes cache updates via gRPC to FLP processors.
        When `false`, each FLP processor uses local informers (previous behavior).
      type: boolean
    replicas:
      default: 2
      description: |-
        `replicas` defines the number of replicas for the flp-informers deployment.
        For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
      format: int32
      minimum: 1
      type: integer
    resources:
      default:
        limits:
          cpu: 200m
          memory: 256Mi
        requests:
          cpu: 50m
          memory: 128Mi
      description: |-
        `resources` are the compute resources required by the informers container.
        For more information, see https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
      properties:
        claims:
          description: |-
            Claims lists the names of resources, defined in spec.resourceClaims,
            that are used by this container.

            This field depends on the
            DynamicResourceAllocation feature gate.

            This field is immutable. It can only be set for containers.
          items:
            description: ResourceClaim references one entry in PodSpec.ResourceClaims.
            properties:
              name:
                description: |-
                  Name must match the name of one entry in pod.spec.resourceClaims of
                  the Pod where this field is used. It makes that resource available
                  inside a container.
                type: string
              request:
                description: |-
                  Request is the name chosen for a request in the referenced claim.
                  If empty, everything from the claim is made available, otherwise
                  only the result of this request.
                type: string
            required:
            - name
            type: object
          type: array
          x-kubernetes-list-map-keys:
          - name
          x-kubernetes-list-type: map
        limits:
          additionalProperties:
            anyOf:
            - type: integer
            - type: string
            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
            x-kubernetes-int-or-string: true
          description: |-
            Limits describes the maximum amount of compute resources allowed.
            More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
          type: object
        requests:
          additionalProperties:
            anyOf:
            - type: integer
            - type: string
            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
            x-kubernetes-int-or-string: true
          description: |-
            Requests describes the minimum amount of compute resources required.
            If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
            otherwise to an implementation-defined value. Requests cannot exceed Limits.
            More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
          type: object
      type: object
  type: object
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== status.components fields in CRDs =="
rg -n -C2 '^\s{20}(agent|processor|plugin|informers):$' \
helm/crds/flows.netobserv.io_flowcollectors.yaml \
config/crd/bases/flows.netobserv.io_flowcollectors.yaml \
bundle/manifests/flows.netobserv.io_flowcollectors.yaml || true
echo
echo "== controller / API references to informer component status =="
rg -n -C3 'status\.components|components.*informers|informers.*state|Informer.*State|Informers.*State' \
api/flowcollector/v1beta2 \
  internal/controller || true
Repository: netobserv/netobserv-operator
Length of output: 5798
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Go status types for components =="
rg -n -B2 -A10 'type.*ComponentsStatus|type.*Status.*components' api/flowcollector/v1beta2/
echo
echo "== Check for Informers in status types =="
rg -n 'type.*Informers.*Status|Informers.*\*.*Status|StatusInformers' api/flowcollector/v1beta2/
echo
echo "== Controller writes to status.components =="
rg -n -B2 -A2 '\.Status\.Components' internal/controller/ | head -50
echo
echo "== Check if controller references informers in status =="
rg -n 'informers.*status|status.*informers|components\.Informers|Informers.*=|status.*Informers' internal/controller/ || true
Repository: netobserv/netobserv-operator
Length of output: 1355
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Full FlowCollectorComponentsStatus struct =="
sed -n '1736,1760p' api/flowcollector/v1beta2/flowcollector_types.go
echo
echo "== Check controller reconciler for informers status writes =="
rg -n 'components\.' internal/controller/ | grep -i informer || echo "No informers status writes found"
echo
echo "== Check if informers deployment is managed =="
rg -n -B2 -A2 'flp-informers|informersDeployment' internal/controller/ | head -40
echo
echo "== Look for any recent TODO/FIXME about informers status =="
rg -n 'TODO|FIXME' internal/controller/ | grep -i informer || echo "No TODOs found"
Repository: netobserv/netobserv-operator
Length of output: 3528
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check CRD description of status.components =="
rg -n -A2 'components.*reports the status' config/crd/bases/flows.netobserv.io_flowcollectors.yaml helm/crds/flows.netobserv.io_flowcollectors.yaml
echo
echo "== Verify informers deployment is actually created by reconciler =="
rg -n 'flp_informer_reconciler\|informer.*reconciler' internal/controller/ | head -20
Repository: netobserv/netobserv-operator
Length of output: 899
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for any status update code in flp_informer files =="
find internal/controller -name '*informer*' -type f
echo
echo "== Look at flp_informer_reconciler.go for status updates =="
cat internal/controller/flp/flp_informer_reconciler.go | head -100
echo
echo "== Check if status.components is set anywhere for informers =="
rg -n 'Informers.*:=|\.Informers\s*=' internal/controller/ || echo "No Informers status assignments"
Repository: netobserv/netobserv-operator
Length of output: 3585
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check what r.Status.SetReady() actually does =="
rg -n -B5 -A10 'func.*SetReady|SetUnused' internal/pkg/manager/status/ | head -50
echo
echo "== Look for where Components status is actually updated =="
rg -n 'Components\.Agent|Components\.Processor|Components\.Plugin' internal/controller/ | head -20
Repository: netobserv/netobserv-operator
Length of output: 4515
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for status manager component types =="
rg -n 'EBPFAgents|FLPProcessor|ConsolePlugin|Informers' internal/pkg/manager/status/status_manager.go | head -30
echo
echo "== Check status manager initialization =="
cat internal/pkg/manager/status/status_manager.go | head -100
Repository: netobserv/netobserv-operator
Length of output: 3257
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check populateComponentStatuses function for informers =="
rg -n -A50 'func.*populateComponentStatuses' internal/pkg/manager/status/status_manager.go
echo
echo "== Check ForComponent to see if informers can be accessed =="
rg -n -B2 -A10 'func.*ForComponent' internal/pkg/manager/status/status_manager.go
Repository: netobserv/netobserv-operator
Length of output: 2757
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check what status the informer reconciler is using =="
rg -n 'ForComponent\|getStatus' internal/controller/flp/flp_informer_reconciler.go
echo
echo "== Check the full informer reconciler status tracking =="
cat -n internal/controller/flp/flp_informer_reconciler.go | grep -A5 -B5 'Status'
echo
echo "== Verify FLPInformers is actually used anywhere =="
rg -n 'FLPInformers' internal/controller/
Repository: netobserv/netobserv-operator
Length of output: 128
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check Instance type in informer reconciler =="
grep -n 'Instance\|Status' internal/controller/flp/flp_informer_reconciler.go | head -20
echo
echo "== Look at reconcilers.Instance to see Status field =="
rg -n -B2 -A15 'type Instance struct' internal/controller/reconcilers/
echo
echo "== Search for where FLPInformers is used =="
rg -n 'FLPInformers' internal/
Repository: netobserv/netobserv-operator
Length of output: 2746
Add status.components.informers field to track the flp-informers component.
The controller tracks informer status via status.ForComponent(status.FLPInformers) and calls SetReady()/SetUnused() on it, but populateComponentStatuses() has no case for FLPInformers, so the status is silently dropped. The CRD lacks a corresponding Components.Informers field entirely, and the description still lists only (agent, processor, plugin). Update FlowCollectorComponentsStatus to include Informers *FlowCollectorComponentStatus and add the matching case in populateComponentStatuses().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@helm/crds/flows.netobserv.io_flowcollectors.yaml` around lines 5110 - 5230,
The CRD and status population are missing the informers component; add an
Informers *FlowCollectorComponentStatus field to the
FlowCollectorComponentsStatus struct and update populateComponentStatuses() to
handle status.FLPInformers (mirror the existing cases for
Agent/Processor/Plugin) so that when the controller calls
status.ForComponent(status.FLPInformers) and uses SetReady()/SetUnused() the
value is preserved and emitted in status.components.informers; ensure the CRD
description list is updated to include informers as well.
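A rough sketch of the two touch points; apart from `FLPInformers` and the `agent`/`processor`/`plugin` field names visible in the CRD, the identifiers and the helper signature below are guesses for illustration only:

```go
// api/flowcollector/v1beta2: add the informers entry to the components status.
type FlowCollectorComponentsStatus struct {
	Agent     *FlowCollectorComponentStatus `json:"agent,omitempty"`
	Processor *FlowCollectorComponentStatus `json:"processor,omitempty"`
	Plugin    *FlowCollectorComponentStatus `json:"plugin,omitempty"`
	Informers *FlowCollectorComponentStatus `json:"informers,omitempty"` // new field
}

// internal/pkg/manager/status: mirror the existing mapping so the value set via
// status.ForComponent(status.FLPInformers) is no longer dropped.
func populateComponentStatuses(perComponent map[ComponentName]*FlowCollectorComponentStatus, out *FlowCollectorComponentsStatus) {
	out.Agent = perComponent[EBPFAgents]
	out.Processor = perComponent[FLPParent]
	out.Plugin = perComponent[ConsolePlugin]
	out.Informers = perComponent[FLPInformers] // new case for the informers component
}
```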
```yaml
replicas:
  default: 2
  description: |-
    `replicas` defines the number of replicas for the flp-informers deployment.
    For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
  format: int32
  minimum: 1
  type: integer
```
The HA requirement is stronger than the schema.
Line 5158 says 2 replicas are required, but the schema still allows replicas: 1. Either enforce that with validation or reword this to “recommended for HA”.
Suggested doc-only fix
```diff
 replicas:
   default: 2
   description: |-
     `replicas` defines the number of replicas for the flp-informers deployment.
-    For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
+    Set this to `2` or more for high availability.
   format: int32
   minimum: 1
   type: integer
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
replicas:
  default: 2
  description: |-
    `replicas` defines the number of replicas for the flp-informers deployment.
    Set this to `2` or more for high availability.
  format: int32
  minimum: 1
  type: integer
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@helm/crds/flows.netobserv.io_flowcollectors.yaml` around lines 5154 - 5161,
The description for the replicas field claims "a minimum of 2 replicas is
required" but the schema still allows minimum: 1; update the CRD by either
changing the schema to enforce minimum: 2 (replace "minimum: 1" with "minimum:
2" for the replicas property) or, if you don't want to enforce it, reword the
description to say "recommended for HA" instead of "required"; locate the
replicas block (contains "replicas:", "default: 2", and the description) and
apply the chosen change so description and validation are consistent.
|
@leandroberetta: The following test failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Force-pushed from 0d13eee to d1de54c (Compare)
Description
Adds a dedicated flp-informers deployment (and RBAC) managed by the NetObserv operator, and wires flowlogs-pipeline processors to expose the k8scache gRPC endpoint so centralized Kubernetes informers can push cache updates to FLP.
Dependencies
netobserv/flowlogs-pipeline#1221
Checklist
Summary by CodeRabbit
New Features
Configuration Changes
Chores
Tests