NETOBSERV-2675: flp-informers deployment and wiring #2650
leandroberetta wants to merge 10 commits into netobserv:main
Conversation
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough
Adds a new FLP informers component: a dedicated reconciler and builder for the flp-informers deployment.
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 6
🧹 Nitpick comments (2)
internal/controller/flp/flp_controller_test.go (1)
173-174: Use the informerName constant instead of a string literal. Replace the hardcoded "flp-informers" in expectations with informerName to avoid drift if naming changes.
Also applies to: 336-337
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_controller_test.go` around lines 173 - 174, Replace the hardcoded string "flp-informers" in the test expectations with the existing informerName constant so the assertions use informerName instead of a literal; update the checks that reference rb3.Subjects[0].Name and any other assertions (e.g., the ones at the other occurrence around lines noted) to compare to informerName, leaving the RoleRef check (rb3.RoleRef.Name) as-is unless it should also be replaced by a corresponding constant.
internal/controller/flp/flp_informer_builder.go (1)
60-69: Consider making informer resources tunable. Hardcoding informer resources may under-size large clusters. Consider deriving them from config/CR defaults so limits can be adjusted without code changes.
As per coding guidelines "Monitor memory usage impact for large scale clusters - watch for memory growth with large flow volumes and optimize cacheMaxFlows, cacheActiveTimeout, writeBatchWait, writeBatchSize settings".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 60 - 69, The ResourceRequirements for the informer are hardcoded (corev1.ResourceRequirements with fixed Requests/Limits); make these values configurable by reading from the controller config or CR defaults instead of literals—e.g., add config/CR fields (or reuse existing controller config) for informer.memoryRequest, informer.cpuRequest, informer.memoryLimit, informer.cpuLimit, and use them when constructing the ResourceRequirements in flp_informer_builder.go (fall back to the current "128Mi"/"50m"/"256Mi"/"200m" defaults if unset); ensure the builder/constructor that creates the informer (the code that sets ResourceRequirements) accepts or fetches the config/CR and applies those values so operators can tune cacheMaxFlows/cacheActiveTimeout/writeBatchWait/writeBatchSize and informer resource limits without code changes.
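A minimal sketch of that direction, assuming a Resources field is added under spec.processor.informers (the field name is hypothetical at this point; corev1 and resource are imported as in the existing builder):

// informerResources returns the informer container resources from the CR when
// set, falling back to the current hardcoded values as defaults.
func informerResources(desired *flowslatest.FlowCollectorSpec) corev1.ResourceRequirements {
	if desired.Processor.Informers != nil && desired.Processor.Informers.Resources != nil {
		// User-provided values from the CR take precedence.
		return *desired.Processor.Informers.Resources
	}
	return corev1.ResourceRequirements{
		Requests: corev1.ResourceList{
			corev1.ResourceMemory: resource.MustParse("128Mi"),
			corev1.ResourceCPU:    resource.MustParse("50m"),
		},
		Limits: corev1.ResourceList{
			corev1.ResourceMemory: resource.MustParse("256Mi"),
			corev1.ResourceCPU:    resource.MustParse("200m"),
		},
	}
}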
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bundle.Dockerfile`:
- Line 10: The LABEL
"operators.operatorframework.io.metrics.builder=operator-sdk-v1.40.0" shows a
downgrade to v1.40.0; verify and make the bundle's operator-sdk version
consistent across files (e.g., update this LABEL or
bundle/metadata/annotations.yaml) so all bundle files reference the same
operator-sdk version (ensure the LABEL value and the entry in
bundle/metadata/annotations.yaml match the intended version).
In `@bundle/manifests/netobserv-operator.clusterserviceversion.yaml`:
- Line 99: The CSV contains the builder annotation
operators.operatorframework.io/builder set to operator-sdk-v1.40.0 which
downgrades from v1.42.0; confirm intended SDK version and update the annotation
value (operators.operatorframework.io/builder) to the correct version (e.g.,
operator-sdk-v1.42.0) to match the rest of the bundle and your intended build
tooling so all CSVs use the same operator-sdk version.
In `@bundle/metadata/annotations.yaml`:
- Line 9: The bundle annotations indicate operator-sdk version
operator-sdk-v1.40.0 but the Makefile uses v1.42.0; update the bundle generation
to match the Makefile or vice versa: regenerate the bundle artifacts
(annotations.yaml, clusterserviceversion.yaml, bundle.Dockerfile) using
operator-sdk v1.42.0 so operators.operatorframework.io.metrics.builder reads
operator-sdk-v1.42.0, or change the Makefile/operator-sdk reference back to
v1.40.0 to match the existing bundle—ensure the versions are identical across
the Makefile and the generated bundle files.
In `@internal/controller/flp/flp_common_objects.go`:
- Around line 131-135: The code hardcodes a "k8scache" ContainerPort at 9090
which can conflict with user-configured FLP/health/metrics/profile ports; add
validation before Pod spec/container args are generated to detect any
user-configured port equal to 9090 (or vice-versa) and return an error instead
of proceeding. Locate the code that builds the ports slice (where ports =
append(... Name: "k8scache", ContainerPort: 9090 ...)) and the routine that
composes process args for the FLP/health/metrics/profile settings, then
implement a check that scans those config fields for equality with 9090 and
rejects the Pod creation with a clear error message if a collision is found.
Ensure the validation runs early (before creating corev1.ContainerPort entries)
and covers the other hardcoded occurrence mentioned (lines ~174-178).
In `@internal/controller/flp/flp_informer_builder.go`:
- Around line 39-70: The informer Container construction (variable "container"
created with Name informerName and Image b.Images[reconcilers.MainImage]) lacks
a hardened SecurityContext; add SecurityContext:
&corev1.SecurityContext{RunAsNonRoot: pointer.BoolPtr(true), RunAsUser:
pointer.Int64Ptr(1000), AllowPrivilegeEscalation: pointer.BoolPtr(false),
ReadOnlyRootFilesystem: pointer.BoolPtr(true), Privileged:
pointer.BoolPtr(false), Capabilities: &corev1.Capabilities{Drop:
[]corev1.Capability{"ALL"}}, SeccompProfile: &corev1.SeccompProfile{Type:
corev1.SeccompProfileTypeRuntimeDefault}} to that container definition to match
other FLP workloads, and apply the same changes to the other informer Container
construction in this file (the second container block).
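For reference, a sketch of the hardened SecurityContext described above, using k8s.io/utils/ptr for the boolean pointers; whether RunAsUser should be pinned depends on the target platform's SCC/PSA policy, so it is omitted here:

container.SecurityContext = &corev1.SecurityContext{
	RunAsNonRoot:             ptr.To(true),
	AllowPrivilegeEscalation: ptr.To(false),
	ReadOnlyRootFilesystem:   ptr.To(true),
	Privileged:               ptr.To(false),
	Capabilities: &corev1.Capabilities{
		// Drop all Linux capabilities, matching the other FLP workloads.
		Drop: []corev1.Capability{"ALL"},
	},
	SeccompProfile: &corev1.SeccompProfile{
		Type: corev1.SeccompProfileTypeRuntimeDefault,
	},
}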
In `@internal/controller/flp/flp_informer_reconciler.go`:
- Around line 54-57: Multiple reconciliation steps in flp_informer_reconciler.go
return bare errors (e.g., returns from r.Managed.FetchAll, r.ensureConfigured,
r.Lister.GetParentByName, r.Managed.Create/Update/Delete) which loses context;
update each return to wrap the underlying error with contextual text using the
standard error-wrapping pattern (e.g., fmt.Errorf("step X failed: %w", err) or
errors.Wrap) inside the Reconcile method (and any helper methods like
ensureConfigured) so callers see which reconcile sub-step failed—replace each
bare "return err" with a descriptive wrapped error referencing the failing call
(e.g., "fetch all managed resources", "ensure configured", "get parent by name",
"create managed resource", "update managed resource", etc.).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 93c5b91f-3f2c-44e2-bed2-f73d938ec6ea
📒 Files selected for processing (21)
- bundle.Dockerfile
- bundle/manifests/flows.netobserv.io_flowcollectors.yaml
- bundle/manifests/flows.netobserv.io_flowcollectorslices.yaml
- bundle/manifests/flows.netobserv.io_flowmetrics.yaml
- bundle/manifests/netobserv-operator.clusterserviceversion.yaml
- bundle/metadata/annotations.yaml
- config/crd/bases/flows.netobserv.io_flowcollectors.yaml
- config/crd/bases/flows.netobserv.io_flowcollectorslices.yaml
- config/crd/bases/flows.netobserv.io_flowmetrics.yaml
- helm/crds/flows.netobserv.io_flowcollectors.yaml
- helm/crds/flows.netobserv.io_flowcollectorslices.yaml
- helm/crds/flows.netobserv.io_flowmetrics.yaml
- internal/controller/flp/flp_common_objects.go
- internal/controller/flp/flp_controller.go
- internal/controller/flp/flp_controller_test.go
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_informer_reconciler.go
- internal/controller/flp/flp_monolith_objects.go
- internal/controller/flp/flp_monolith_reconciler.go
- internal/controller/flp/flp_transfo_reconciler.go
- internal/pkg/manager/status/status_manager.go
💤 Files with no reviewable changes (2)
- internal/controller/flp/flp_transfo_reconciler.go
- internal/controller/flp/flp_monolith_reconciler.go
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
🧹 Nitpick comments (5)
internal/controller/flp/flp_informer_reconciler.go (1)
60-64: Consider handling TryDeleteAll errors.
The TryDeleteAll return value is ignored. While "Try" implies best-effort, logging failures would aid debugging cleanup issues.
Suggested improvement
 if desired.Spec.OnHold() {
 	r.Status.SetUnused("FlowCollector is on hold")
-	r.Managed.TryDeleteAll(ctx)
+	if err := r.Managed.TryDeleteAll(ctx); err != nil {
+		log.FromContext(ctx).Error(err, "failed to delete some managed resources")
+	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_reconciler.go` around lines 60 - 64, When handling the OnHold branch, capture the error returned by r.Managed.TryDeleteAll(ctx) instead of ignoring it; assign it to a variable (e.g., err := r.Managed.TryDeleteAll(ctx)) and if non-nil emit a diagnostic (use the reconciler logger or status updater such as r.Log.Error(err, "failed to cleanup managed resources on hold") or r.Status.SetError/SetDegraded with context) so cleanup failures are recorded, then continue returning nil as before.
internal/controller/flp/flp_informer_builder.go (3)
60-68: Hardcoded resource limits. Resource requirements are fixed at 128Mi/256Mi memory and 50m/200m CPU. Other FLP components use desired.Processor.Resources from the CR spec. Consider exposing informer resources via config, or document why fixed values are appropriate here.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 60 - 68, The informer is using hardcoded resource values in the Resources block; replace the fixed corev1.ResourceRequirements with the FLP spec's resource settings (use desired.Processor.Resources) or thread a configurable field through the informer builder so callers can supply resource requirements; update the builder function that constructs the informer pod (look for Resources: corev1.ResourceRequirements in flp_informer_builder.go) to read and apply desired.Processor.Resources (or a new InformerResources field) and ensure fallback/default values are documented or applied when the CR value is nil.
90-96: Pod template missing the part-of label. The Deployment ObjectMeta.Labels includes "part-of": constants.OperatorName but the pod template labels don't. Other FLP builders (monolith, transformer) include this label in their pod templates for consistency.
Suggested fix
 Template: corev1.PodTemplateSpec{
 	ObjectMeta: metav1.ObjectMeta{
 		Labels: map[string]string{
+			"part-of": constants.OperatorName,
 			"app":     informerName,
 			"version": version,
 		},
 	},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 90 - 96, The pod template under Template: corev1.PodTemplateSpec for the FLP informer builder is missing the "part-of" label; update the ObjectMeta.Labels map inside the PodTemplateSpec (the block that currently sets "app": informerName and "version": version) to also include "part-of": constants.OperatorName so the pod template matches the Deployment ObjectMeta.Labels and other FLP builders (monolith, transformer).
29-104: Method signature returns error but never fails.
deployment() returns (*appsv1.Deployment, error) but always returns a nil error. Either simplify the signature to return only *appsv1.Deployment, or add validation that could fail.
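If the validation route is preferred, a small guard along these lines would make the error return meaningful (identifiers follow the snippet quoted in the comment; the message text is illustrative):

// Fail early instead of always returning a nil error.
if b.Images[reconcilers.MainImage] == "" {
	return nil, fmt.Errorf("missing %q image for the %s deployment", reconcilers.MainImage, informerName)
}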
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_builder.go` around lines 29 - 104, The deployment() method on informerBuilder currently returns (*appsv1.Deployment, error) but never returns a non-nil error; either remove the unused error return or add meaningful validation. To fix quickly, change informerBuilder.deployment signature to return *appsv1.Deployment (no error), update its return to only the Deployment value, and update all callers to accept the single return (search for calls to deployment() and adjust unpacking). Alternatively, if you prefer validation, add checks inside deployment (e.g., ensure b.Images[reconcilers.MainImage] is non-empty and b.desired is non-nil) and return an error when invalid instead of always nil; update callers accordingly to handle the error.
internal/controller/flp/flp_monolith_objects.go (1)
183-188: Use a constant and consistent type for the k8scache port. Line 185 hardcodes 9090 and line 187 uses intstr.FromInt() instead of intstr.FromInt32(). Use the k8scachePort constant for maintainability and match the int32 pattern used on line 181.
Suggested fix
 {
 	Name:       "k8scache",
-	Port:       9090,
+	Port:       k8scachePort,
 	Protocol:   corev1.ProtocolTCP,
-	TargetPort: intstr.FromInt(9090),
+	TargetPort: intstr.FromInt32(k8scachePort),
 },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_monolith_objects.go` around lines 183 - 188, Replace the hardcoded 9090 and the intstr.FromInt call with the k8scachePort constant and an int32 conversion so types are consistent: set Port to k8scachePort and TargetPort to intstr.FromInt32(int32(k8scachePort)) for the object with Name "k8scache" (the same pattern used on the earlier port definition).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5085d614-83f7-410f-bbeb-7e640cf41e4c
📒 Files selected for processing (6)
- internal/controller/flp/flp_common_objects.go
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_informer_reconciler.go
- internal/controller/flp/flp_monolith_objects.go
- internal/controller/flp/flp_test.go
- internal/controller/flp/flp_transfo_objects.go
jpinsonneau
left a comment
I feel we should make everything configurable and hide the options we don't want to expose much in the advanced config.
The number of replicas is mandatory here since the cache may fail and we'll lose the whole pipeline.
Also, the cache component should expose health checks and report some metrics in the health dashboard.
builder := newInformerBuilder(r.Instance, &desired.Spec)

// Reconcile ServiceAccount
if err := r.reconcileServiceAccount(ctx, &builder); err != nil {
	return fmt.Errorf("failed to reconcile service account: %w", err)
}

// Reconcile RBAC
if err := r.reconcilePermissions(ctx); err != nil {
	return fmt.Errorf("failed to reconcile permissions: %w", err)
}

// Reconcile Deployment
if err := r.reconcileDeployment(ctx, &builder); err != nil {
	return fmt.Errorf("failed to reconcile deployment: %w", err)
}
I feel we should make this configurable. On a small cluster with few FLPs (maybe even one), it doesn't make sense to have cache separated. WDYT ?
Makes sense, I added:
# Centralized Kubernetes informers (enabled by default for reduced API server load)
# informers:
# enabled: true
# replicas: 2 # Minimum 2 for HA with leader election

Command: []string{"/app/flp-informers"},
Args: []string{
	fmt.Sprintf("--processor-selector=%s", processorSelector),
	"--processor-port=9090",
Should rely on k8scachePort const
| Name: "k8scache", | ||
| Port: 9090, | ||
| Protocol: corev1.ProtocolTCP, | ||
| TargetPort: intstr.FromInt(9090), |
Should rely on k8scachePort const
}

func (b *informerBuilder) deployment() (*appsv1.Deployment, error) {
	var replicas int32 = 1
I have concerns about this. It is not resilient to have a single replica.
It would be interesting to make it configurable and have a leader election mechanism.
I've added a leader election mechanism.
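For context, a rough sketch of what lease-based leader election can look like on the flp-informers side, using client-go's leaderelection helpers. Names such as startInformers, podName, namespace and the timing values are placeholders, not the actual implementation added in this PR:

// Only the elected leader runs the informers and pushes cache updates;
// standby replicas take over when the Lease is lost.
lock := &resourcelock.LeaseLock{
	LeaseMeta:  metav1.ObjectMeta{Name: "flp-informers", Namespace: namespace},
	Client:     clientset.CoordinationV1(),
	LockConfig: resourcelock.ResourceLockConfig{Identity: podName},
}
leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
	Lock:            lock,
	LeaseDuration:   15 * time.Second,
	RenewDeadline:   10 * time.Second,
	RetryPeriod:     2 * time.Second,
	ReleaseOnCancel: true,
	Callbacks: leaderelection.LeaderCallbacks{
		OnStartedLeading: func(ctx context.Context) { startInformers(ctx) },
		OnStoppedLeading: func() { klog.Info("lost leadership, stopping informers") },
	},
})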
Resources: corev1.ResourceRequirements{
	Requests: corev1.ResourceList{
		corev1.ResourceMemory: resource.MustParse("128Mi"),
		corev1.ResourceCPU:    resource.MustParse("50m"),
	},
	Limits: corev1.ResourceList{
		corev1.ResourceMemory: resource.MustParse("256Mi"),
		corev1.ResourceCPU:    resource.MustParse("200m"),
	},
},
This should be configurable in the advanced options
Now it is configurable at:
# Centralized Kubernetes informers (enabled by default for reduced API server load)
# informers:
# enabled: true
# replicas: 2 # Minimum 2 for HA with leader election
# resources:
# requests:
# memory: 128Mi
# cpu: 50m
# limits:
# memory: 256Mi
# cpu: 200m
# advanced:
# resyncInterval: 60 # Seconds to rediscover processors
# batchSize: 100 # Max entries per update batch
# sendTimeout: 10 # Seconds timeout for sending updates
# updateBufferSize: 100 # Update channel buffer size
I can move it to the advanced section if you want.
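For readers mapping this sample to the API, a sketch of what the corresponding Go types could look like (field names mirror the YAML above; the actual types added in this PR may differ):

// FlowCollectorInformers configures the centralized flp-informers deployment.
type FlowCollectorInformers struct {
	Enabled   *bool                        `json:"enabled,omitempty"`
	Replicas  *int32                       `json:"replicas,omitempty"`
	Resources *corev1.ResourceRequirements `json:"resources,omitempty"`
	Advanced  *AdvancedInformersConfig     `json:"advanced,omitempty"`
}

// AdvancedInformersConfig holds the less commonly tuned knobs.
type AdvancedInformersConfig struct {
	ResyncInterval   *int32 `json:"resyncInterval,omitempty"`
	BatchSize        *int32 `json:"batchSize,omitempty"`
	SendTimeout      *int32 `json:"sendTimeout,omitempty"`
	UpdateBufferSize *int32 `json:"updateBufferSize,omitempty"`
}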
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: (none). The full list of commands accepted by this bot can be found here. Details: Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bundle/manifests/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5552-5559: The YAML's `replicas` schema currently sets `minimum:
1` but the comment requires `replicas >= 2` when `enabled` is true; update the
CRD validation to enforce that conditional rule (or relax the comment). Add an
x-kubernetes-validations rule for the relevant `informers`/`flp-informers`
section that asserts "if .enabled == true then .replicas >= 2" (using the CRD
validation expression language), or alternatively change `minimum: 1` to
`minimum: 2` and adjust the description; target the `replicas` and `enabled`
fields in the same schema block so the validation is applied correctly.
In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5121-5127: The schema for the "replicas" field currently sets
minimum: 1 while the description for the flp-informers deployment mandates a
minimum of 2; update either the validation or the description to remove the
conflict: either change the "replicas" property minimum from 1 to 2 to enforce
the described requirement, or modify the description to state that "a minimum of
2 replicas is recommended when `enabled` is `true`" (making it a recommendation
rather than a hard requirement); target the "replicas" field under the
flp-informers deployment definition so the change aligns schema validation with
the description.
In `@config/rbac/component_roles.yaml`:
- Around line 99-107: The informers ClusterRole is missing the "watch" verb for
the coordination.k8s.io/leases permission; update the informers ClusterRole
entry that lists apiGroups: coordination.k8s.io and resources: leases to include
"watch" alongside create, get, list, update so controller-runtime leader
election can observe lease changes (i.e., modify the informers ClusterRole's
leases verbs to add "watch").
In `@docs/FlowCollector.md`:
- Around line 10709-10715: Update the replicas description for the flp-informers
setting to remove the apparent contradiction: make the conditional explicit by
stating that when `enabled` is true a minimum of 2 replicas is
recommended/required for high availability, while the schema default/minimum
remains 1 otherwise; reference the `replicas` field and the `enabled` flag in
the text so readers understand the constraint applies only when `enabled=true`
and not to the base `<i>Minimum</i>: 1` schema value.
In `@internal/controller/flp/flp_informer_builder.go`:
- Around line 63-79: After reading config.Advanced.ProcessorPort into
processorPort, validate it against the informer’s reserved ports before
constructing the container/grpc port: ensure processorPort does not equal 8080
or 9091 (and any other ports used by the informer/process like k8scachePort),
and if it conflicts return an error (or reject the value) from the informer
builder path so the reconciler (flp_informer_reconciler.go) cannot create a pod
with conflicting ports; implement the check right after the assignment from
config.Advanced.ProcessorPort and before using processorPort to populate the
container port fields, referencing processorPort and
config.Advanced.ProcessorPort in the validation logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6b8e2942-9cb9-4c51-9617-41a8afdfb570
⛔ Files ignored due to path filters (1)
api/flowcollector/v1beta2/zz_generated.deepcopy.go is excluded by !**/zz_generated.deepcopy.go
📒 Files selected for processing (18)
- api/flowcollector/v1beta2/flowcollector_types.go
- api/flowcollector/v1beta2/flowcollector_validation_webhook.go
- bundle/manifests/flows.netobserv.io_flowcollectors.yaml
- bundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- bundle/manifests/netobserv-operator.clusterserviceversion.yaml
- config/crd/bases/flows.netobserv.io_flowcollectors.yaml
- config/rbac/component_roles.yaml
- config/rbac/role.yaml
- config/samples/flows_v1beta2_flowcollector.yaml
- docs/FlowCollector.md
- helm/crds/flows.netobserv.io_flowcollectors.yaml
- helm/templates/clusterrole.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yaml
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_informer_reconciler.go
- internal/controller/flp/flp_monolith_objects.go
- internal/pkg/manager/manager.go
✅ Files skipped from review due to trivial changes (6)
- config/rbac/role.yaml
- internal/pkg/manager/manager.go
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- config/samples/flows_v1beta2_flowcollector.yaml
- bundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/controller/flp/flp_informer_reconciler.go
replicas:
  default: 2
  description: |-
    `replicas` defines the number of replicas for the flp-informers deployment.
    For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
  format: int32
  minimum: 1
  type: integer
Enforce the HA replica rule in schema (or relax the wording).
Line 5556 says a minimum of 2 replicas is required when enabled, but the schema only enforces minimum: 1 (Line 5558) with no conditional rule. enabled: true + replicas: 1 would still validate.
Suggested CRD fix
informers:
@@
properties:
@@
replicas:
default: 2
description: |-
`replicas` defines the number of replicas for the flp-informers deployment.
For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
format: int32
minimum: 1
type: integer
@@
- type: object
+ type: object
+ x-kubernetes-validations:
+ - message: replicas must be at least 2 when informers are enabled
+     rule: self.enabled != true || !has(self.replicas) || self.replicas >= 2
#!/bin/bash
set -euo pipefail
# Verify whether webhook already enforces "enabled => replicas >= 2"
fd -i 'flowcollector_validation_webhook.go' -x sh -c '
echo "==> $1"
rg -n "informers|replicas|enabled|at least 2|>=\\s*2" "$1"
' sh {}
# Verify current CRD validation presence for informers replicas
fd -i 'flows.netobserv.io_flowcollectors.yaml' -x sh -c '
echo "==> $1"
rg -n "informers:|replicas:|x-kubernetes-validations|enabled:" "$1"
' sh {}
Also applies to: 5628-5628
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bundle/manifests/flows.netobserv.io_flowcollectors.yaml` around lines 5552 -
5559, The YAML's `replicas` schema currently sets `minimum: 1` but the comment
requires `replicas >= 2` when `enabled` is true; update the CRD validation to
enforce that conditional rule (or relax the comment). Add an
x-kubernetes-validations rule for the relevant `informers`/`flp-informers`
section that asserts "if .enabled == true then .replicas >= 2" (using the CRD
validation expression language), or alternatively change `minimum: 1` to
`minimum: 2` and adjust the description; target the `replicas` and `enabled`
fields in the same schema block so the validation is applied correctly.
default: 2
description: |-
  `replicas` defines the number of replicas for the flp-informers deployment.
  For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
format: int32
minimum: 1
type: integer
Align replicas validation with its description.
The description says a minimum of 2 replicas is required when enabled, but Line 5126 allows minimum: 1. Either enforce 2 in schema/webhook or reword this as a recommendation to avoid conflicting API behavior.
Suggested wording fix (if 1 replica is intentionally allowed)
- For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
+ For high availability, 2 or more replicas are recommended when `enabled` is `true`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml` around lines 5121 -
5127, The schema for the "replicas" field currently sets minimum: 1 while the
description for the flp-informers deployment mandates a minimum of 2; update
either the validation or the description to remove the conflict: either change
the "replicas" property minimum from 1 to 2 to enforce the described
requirement, or modify the description to state that "a minimum of 2 replicas is
recommended when `enabled` is `true`" (making it a recommendation rather than a
hard requirement); target the "replicas" field under the flp-informers
deployment definition so the change aligns schema validation with the
description.
processorPort := int32(k8scachePort)
if config.Advanced != nil {
	if config.Advanced.ResyncInterval != nil {
		resyncInterval = *config.Advanced.ResyncInterval
	}
	if config.Advanced.BatchSize != nil {
		batchSize = *config.Advanced.BatchSize
	}
	if config.Advanced.SendTimeout != nil {
		sendTimeout = *config.Advanced.SendTimeout
	}
	if config.Advanced.UpdateBufferSize != nil {
		updateBufferSize = *config.Advanced.UpdateBufferSize
	}
	if config.Advanced.ProcessorPort != nil {
		processorPort = *config.Advanced.ProcessorPort
	}
Add informer-side validation for processorPort conflicts.
processorPort is user-overridable (Line 77-79) and then reused as container grpc port (Line 105), while Line 110 and Line 115 already reserve 8080/9091. The informer reconciliation path (internal/controller/flp/flp_informer_reconciler.go, Line 53-99) does not call shared port-conflict validation, so conflicting values can slip through.
Proposed patch
@@
if config.Advanced != nil {
@@
if config.Advanced.ProcessorPort != nil {
processorPort = *config.Advanced.ProcessorPort
}
}
+
+ // Reserved informer container ports
+ const (
+ healthPort int32 = 8080
+ metricsPort int32 = 9091
+ )
+ if processorPort == healthPort || processorPort == metricsPort {
+ return nil, fmt.Errorf("invalid processor port %d: conflicts with reserved informer ports (%d health, %d metrics)", processorPort, healthPort, metricsPort)
+ }
@@
{
Name: "health",
- ContainerPort: 8080,
+ ContainerPort: healthPort,
Protocol: corev1.ProtocolTCP,
},
{
Name: "metrics",
- ContainerPort: 9091,
+ ContainerPort: metricsPort,
Protocol: corev1.ProtocolTCP,
},
@@
Path: "/healthz",
- Port: intstr.FromInt(8080),
+ Port: intstr.FromInt(int(healthPort)),
@@
Path: "/ready",
- Port: intstr.FromInt(8080),
+ Port: intstr.FromInt(int(healthPort)),
Also applies to: 101-118
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/controller/flp/flp_informer_builder.go` around lines 63 - 79, After
reading config.Advanced.ProcessorPort into processorPort, validate it against
the informer’s reserved ports before constructing the container/grpc port:
ensure processorPort does not equal 8080 or 9091 (and any other ports used by
the informer/process like k8scachePort), and if it conflicts return an error (or
reject the value) from the informer builder path so the reconciler
(flp_informer_reconciler.go) cannot create a pod with conflicting ports;
implement the check right after the assignment from
config.Advanced.ProcessorPort and before using processorPort to populate the
container port fields, referencing processorPort and
config.Advanced.ProcessorPort in the validation logic.
jpinsonneau
left a comment
Thanks for addressing the feedback. That looks good to me now.
Let's just try to fix the tests
Metric netobserv_ingest_flows_processed not found after 180 secs!!!
/test e2e-operator
326ff55 to 9f0160e
New changes are detected. LGTM label has been removed.
@leandroberetta: This pull request references NETOBSERV-2675 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. Details — In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/controller/flp/flp_common_objects.go (1)
160-207: ⚠️ Potential issue | 🟡 Minor — k8scache port & args are added unconditionally; gate on informers.enabled.
The k8scache container port (Line 160-164) and the --k8scache.* args (Line 203-207) are appended regardless of whether spec.processor.informers.enabled is true. When the feature is off, you still bind 9090 and pass flags the user didn't opt into. This also means FLP must always understand the --k8scache.* flags, tightening the version coupling with flowlogs-pipeline#1221.
🔧 Suggested gating
-	ports = append(ports, corev1.ContainerPort{
-		Name:          "k8scache",
-		ContainerPort: k8scachePort,
-		Protocol:      corev1.ProtocolTCP,
-	})
+	informersEnabled := desired.Processor.Informers != nil &&
+		(desired.Processor.Informers.Enabled == nil || *desired.Processor.Informers.Enabled)
+	if informersEnabled {
+		ports = append(ports, corev1.ContainerPort{
+			Name:          "k8scache",
+			ContainerPort: k8scachePort,
+			Protocol:      corev1.ProtocolTCP,
+		})
+	}
@@
-	Args: []string{
-		fmt.Sprintf(`--config=%s/%s`, configPath, configFile),
-		fmt.Sprintf("--k8scache.port=%d", k8scachePort),
-		"--k8scache.address=0.0.0.0",
-	},
+	Args: func() []string {
+		a := []string{fmt.Sprintf(`--config=%s/%s`, configPath, configFile)}
+		if informersEnabled {
+			a = append(a, fmt.Sprintf("--k8scache.port=%d", k8scachePort), "--k8scache.address=0.0.0.0")
+		}
+		return a
+	}(),
Likewise, validatePortConflicts only needs to fire when informers are enabled.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` around lines 160 - 207, The k8scache ContainerPort and the "--k8scache.*" args are being added unconditionally; change the logic to only append the k8scache port (the corev1.ContainerPort with Name "k8scache") and the "--k8scache.port/--k8scache.address" entries in the container.Args for the constants.FLPName container when the informers feature is enabled (i.e., gate on the informers.enabled flag in advancedConfig or the equivalent field). Also ensure any call to validatePortConflicts is only invoked when informers are enabled so we don’t validate 9090 unless the informers are on. Locate these changes around where ports are appended (k8scache port block), where container.Args are built (k8scache args), and where validatePortConflicts is called and wrap them in the same informers-enabled conditional.
♻️ Duplicate comments (1)
config/crd/bases/flows.netobserv.io_flowcollectors.yaml (1)
5150-5157: ⚠️ Potential issue | 🟡 Minor
replicas validation contradicts the documented requirement.
minimum: 1 conflicts with the field description saying 2 is required when enabled=true. Please either enforce the conditional constraint or reword it as a recommendation.
Suggested schema-side enforcement
 informers:
   description: |-
     `informers` configuration for centralized Kubernetes informers that push cache updates to flowlogs-pipeline processors.
     This reduces load on the Kubernetes API server by having a single component (flp-informers) query the API instead of N FLP processors.
     When enabled, a dedicated `flp-informers` deployment is created that watches Kubernetes resources and pushes updates via gRPC.
   properties:
@@
     replicas:
       default: 2
       description: |-
         `replicas` defines the number of replicas for the flp-informers deployment.
         For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
       format: int32
       minimum: 1
       type: integer
@@
+  x-kubernetes-validations:
+    - message: "informers.replicas must be >= 2 when informers.enabled is true."
+      rule: "!has(self.enabled) || !self.enabled || !has(self.replicas) || self.replicas >= 2"
   type: object
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml` around lines 5150 - 5157, The replicas schema for the flp-informers deployment currently declares minimum: 1 but the description mandates "a minimum of 2 replicas when `enabled` is `true` — fix this by either updating the schema to enforce the constraint or rewording the description: either change the `replicas` validation so minimum: 2 and/or add an if/then conditional that checks the `enabled` boolean and requires replicas >= 2 when `enabled: true`, or keep minimum: 1 but rephrase the description to a recommendation (e.g., "recommended minimum of 2 when `enabled` is true"); locate the `replicas` field and the related `enabled` flag in the flowcollectors CRD to apply the chosen fix.
🧹 Nitpick comments (8)
internal/controller/flp/flp_common_objects.go (3)
98-124: Extract the port-name literal and simplify the repetition. Four near-identical branches with the same error template. A small table collapses this nicely and makes future additions (e.g., another reserved port) one-line changes:
♻️ Suggested refactor
-func validatePortConflicts(desired *flowslatest.FlowCollectorSpec) error {
-	advancedConfig := helper.GetAdvancedProcessorConfig(desired)
-
-	// Check FLP port
-	if advancedConfig.Port != nil && *advancedConfig.Port == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline port %d conflicts with reserved k8scache port %d", *advancedConfig.Port, k8scachePort)
-	}
-
-	// Check health port
-	if advancedConfig.HealthPort != nil && *advancedConfig.HealthPort == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline health port %d conflicts with reserved k8scache port %d", *advancedConfig.HealthPort, k8scachePort)
-	}
-
-	// Check metrics port
-	metricsPort := desired.Processor.GetMetricsPort()
-	if metricsPort == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline metrics port %d conflicts with reserved k8scache port %d", metricsPort, k8scachePort)
-	}
-
-	// Check profile port (optional)
-	if advancedConfig.ProfilePort != nil && *advancedConfig.ProfilePort == k8scachePort {
-		return fmt.Errorf("flowlogs-pipeline profile port %d conflicts with reserved k8scache port %d", *advancedConfig.ProfilePort, k8scachePort)
-	}
-
-	return nil
-}
+func validatePortConflicts(desired *flowslatest.FlowCollectorSpec) error {
+	adv := helper.GetAdvancedProcessorConfig(desired)
+	checks := []struct {
+		label string
+		port  *int32
+	}{
+		{"port", adv.Port},
+		{"health port", adv.HealthPort},
+		{"metrics port", ptr.To(desired.Processor.GetMetricsPort())},
+		{"profile port", adv.ProfilePort},
+	}
+	for _, c := range checks {
+		if c.port != nil && *c.port == k8scachePort {
+			return fmt.Errorf("flowlogs-pipeline %s %d conflicts with reserved k8scache port %d", c.label, *c.port, k8scachePort)
+		}
+	}
+	return nil
+}
specvalidation and arguably belongs next tovalidateFLP()in the webhook, not inside the builder. Builders today silently swallow the error in tests (b, _ := newMonolithBuilder(...)), so a misconfig would surface as a broken controller reconcile rather than a clean webhook rejection.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` around lines 98 - 124, The validatePortConflicts function repeats the same error template for multiple ports; refactor it by extracting the error message template and building a small map/slice of (portName, portValue) entries using advancedConfig.Port, advancedConfig.HealthPort, desired.Processor.GetMetricsPort(), and advancedConfig.ProfilePort, then iterate and return fmt.Errorf(fmtTemplate, val, k8scachePort) when a conflict is found; update references to the unique symbols validatePortConflicts, advancedConfig, desired.Processor.GetMetricsPort, and k8scachePort to implement the loop-based check and remove duplicated branches — also consider moving this spec validation into the webhook near validateFLP so misconfigs are rejected early instead of being swallowed by builders.
161-161: Promote "k8scache" to a named const.
healthPortName, prometheusPortName, profilePortName all have constants above — this one sticks out as a bare string literal. Add k8scachePortName = "k8scache" alongside k8scachePort for consistency.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` at line 161, Promote the literal "k8scache" to a named constant by adding k8scachePortName = "k8scache" alongside the existing k8scachePort constant, and replace the literal Name: "k8scache" with Name: k8scachePortName (or use the constant wherever that port name is referenced). Update references to the string in the same file so usage is consistent with healthPortName/prometheusPortName/profilePortName and the existing k8scachePort symbol.
30-30: Consider exposing k8scachePort as configurable (e.g., under Advanced). Hardcoding a well-known port like 9090 invites collisions with other sidecars or injected tooling (kube-rbac-proxy, Prometheus node-exporter style setups, etc.). Promoting this to an advanced-config field keeps the reserved default but gives operators an escape hatch — which also neatly avoids the whole validatePortConflicts dance.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_common_objects.go` at line 30, Replace the hardcoded k8scachePort = 9090 with a configurable Advanced option: add a field (e.g., Advanced.K8sCachePort int) to the existing configuration struct and supply a default of 9090 when loading/constructing the config, then replace uses of the package-level k8scachePort symbol with reads from that config (falling back to 9090 if unset); also update any config parsing/CLI flags and documentation to expose the new Advanced option and keep existing validation (or simplify validatePortConflicts if desired).
internal/controller/flp/flp_test.go (1)
814-850: Good coverage — consider one small extension.
validatePortConflicts is invoked from both newMonolithBuilder and newTransfoBuilder, but only the monolith path is exercised here. A single mirror case via newTransfoBuilder (or parameterizing the constructor) would guard against someone accidentally dropping the call from one constructor.
validatePortConflictsis invoked from bothnewMonolithBuilderandnewTransfoBuilder, but only the monolith path is exercised here. A single mirror case vianewTransfoBuilder(or parameterizing the constructor) would guard against someone accidentally dropping the call from one constructor.Also an ideal candidate for table-driven tests — the five cases repeat the same three-line pattern:
♻️ Table-driven sketch
cases := []struct {
	name   string
	mutate func(*flowslatest.FlowCollectorSpec)
	errSub string
}{
	{"flp port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Advanced.Port = ptr.To(int32(9090)) }, "flowlogs-pipeline port 9090"},
	{"health port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Advanced.HealthPort = ptr.To(int32(9090)) }, "health port 9090"},
	{"metrics port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Metrics.Server.Port = ptr.To(int32(9090)) }, "metrics port 9090"},
	{"profile port", func(c *flowslatest.FlowCollectorSpec) { c.Processor.Advanced.ProfilePort = ptr.To(int32(9090)) }, "profile port 9090"},
	{"ok", func(*flowslatest.FlowCollectorSpec) {}, ""},
}
for _, tc := range cases {
	t.Run(tc.name, func(t *testing.T) { /* ... */ })
}
Based on learnings, testify is an approved framework in this repo, so no need to port to Ginkgo/Gomega just for this.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_test.go` around lines 814 - 850, TestPortConflictValidation only exercises validatePortConflicts via newMonolithBuilder; add a mirror set of assertions that call newTransfoBuilder to ensure validatePortConflicts is enforced there too (or refactor TestPortConflictValidation into a table-driven loop that runs each test case against both newMonolithBuilder and newTransfoBuilder). Keep the same mutation functions and error substring checks (use the existing cases for Processor.Advanced.Port, HealthPort, Metrics.Server.Port, ProfilePort and the no-conflict case) and assert the same error messages when invoking newTransfoBuilder (or both builders) so the coverage is identical for validatePortConflicts, referencing the existing TestPortConflictValidation, newMonolithBuilder, newTransfoBuilder and validatePortConflicts symbols.bundle/manifests/netobserv-operator.clusterserviceversion.yaml (1)
469-486: Missing UI descriptors for consistency. Other toggles/advanced sections in this CSV use booleanSwitch for enabled flags and hidden/advanced markers for advanced subsections (e.g., consolePlugin.enable, processor.advanced). Consider aligning processor.informers.enabled with booleanSwitch and marking processor.informers.advanced fields with advanced (and a fieldDependency: processor.informers.enabled:true) so the OpenShift form matches existing patterns.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bundle/manifests/netobserv-operator.clusterserviceversion.yaml` around lines 469 - 486, The CSV UI descriptors should mirror existing patterns: change the descriptor for processor.informers.enabled to use booleanSwitch (like consolePlugin.enable) and mark the processor.informers.advanced section and its child fields (processor.informers.advanced.batchSize, processor.informers.advanced.processorPort, processor.informers.advanced.resyncInterval, processor.informers.advanced.sendTimeout, processor.informers.advanced.updateBufferSize) as advanced/hidden; add a fieldDependency on processor.informers.enabled:true so the advanced fields only appear when the Informers toggle is enabled and the OpenShift form behavior matches other components.internal/controller/flp/flp_controller_test.go (1)
168-174: Consider asserting informer Deployment/SA lifecycle across Kafka transitions. The Kafka context verifies the shared informer CRB but doesn't assert that the flp-informers Deployment/ServiceAccount persist across the mono→Kafka→mono transitions (Contexts "With Kafka" and "Back without Kafka"). Since this is now a shared, independent deployment, a quick Eventually ... Get(flpKeyInformer, &Deployment{}).Should(Succeed()) in both contexts would guard against accidental teardown on model switch.
Also applies to: 330-337
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_controller_test.go` around lines 168 - 174, Add assertions in the "With Kafka" and "Back without Kafka" test contexts to verify the flp-informers Deployment and ServiceAccount persist across the mono→Kafka→mono transitions: use Eventually with k8sClient.Get and the existing flpKeyInformer key to ensure Get(ctx, flpKeyInformer, &appsv1.Deployment{}) and Get(ctx, flpKeyInformer, &corev1.ServiceAccount{}) both succeed (similar to the existing ClusterRoleBinding check using rb3), so the shared informer Deployment/SA is not accidentally torn down when switching models.internal/controller/flp/flp_informer_reconciler.go (2)
66-70: Nit: collapse the nil-check with ptr.Deref. Minor readability tweak — no behavior change.
Proposed diff
-	// Check if informers are enabled (default: true)
-	enabled := true
-	if desired.Spec.Processor.Informers != nil && desired.Spec.Processor.Informers.Enabled != nil {
-		enabled = *desired.Spec.Processor.Informers.Enabled
-	}
-
-	if !enabled {
+	// Check if informers are enabled (default: true)
+	if desired.Spec.Processor.Informers != nil && !ptr.Deref(desired.Spec.Processor.Informers.Enabled, true) {
Requires the k8s.io/utils/ptr import. Skip if the project prefers the explicit form.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_reconciler.go` around lines 66 - 70, Replace the manual nil-check that initializes the local variable enabled from desired.Spec.Processor.Informers.Enabled with a single ptr.Deref call: set enabled := ptr.Deref(desired.Spec.Processor.Informers.Enabled, true) (or the equivalent expression) so the default true is used when the pointer is nil; add the import "k8s.io/utils/ptr". Update the block that references desired.Spec.Processor.Informers.Enabled and remove the explicit if nil { ... } branch.
101-108: SA drift won't be corrected — confirm this is acceptable.
reconcileServiceAccount only creates when missing. If labels/annotations/imagePullSecrets on the SA ever diverge from the builder (e.g., user edit, future builder changes), it won't self-heal until the SA is deleted. The inline comment acknowledges it, so flagging as FYI in case the informers SA gains secrets/annotations later.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/controller/flp/flp_informer_reconciler.go` around lines 101 - 108, reconcileServiceAccount in informerReconciler currently only creates the ServiceAccount via builder.serviceAccount() when r.Managed.Exists(r.serviceAccount) is false, so label/annotation/imagePullSecrets drift won't be corrected; update reconcileServiceAccount to fetch the existing SA, compare labels, annotations and imagePullSecrets with builder.serviceAccount() and perform a patch/update (using r.Update or a strategic merge/patch method) when differences exist while preserving managed fields, or explicitly document/guard that this resource is intentionally immutable; reference reconcileServiceAccount, informerReconciler, builder.serviceAccount(), r.Managed.Exists and r.CreateOwned when making the change.
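If self-healing is wanted later, a sketch of a drift-correcting variant; helper names such as UpdateIfOwned and the DeepDerivative comparison are illustrative and may not match the exact helpers available in this codebase:

// Create the SA when missing, otherwise update it if its metadata drifted
// from what the builder produces.
desiredSA := builder.serviceAccount()
if !r.Managed.Exists(r.serviceAccount) {
	return r.CreateOwned(ctx, desiredSA)
}
if !equality.Semantic.DeepDerivative(desiredSA.Labels, r.serviceAccount.Labels) ||
	!equality.Semantic.DeepDerivative(desiredSA.Annotations, r.serviceAccount.Annotations) {
	return r.UpdateIfOwned(ctx, r.serviceAccount, desiredSA)
}
return nil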
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@api/flowcollector/v1beta2/flowcollector_validation_webhook.go`:
- Around line 460-481: In validateInformers(), treat a nil Enabled pointer as
true (respect the kubebuilder default) by changing the enabled calculation to
consider nil as enabled instead of disabled (refer to
v.fc.Processor.Informers.Enabled and the validateInformers function); keep the
replicas >=2 enforcement when enabled but update the error message or coordinate
with the CRD (Replicas' +kubebuilder:validation:Minimum) so documentation/CRD
marker and the webhook behavior are aligned (either relax webhook to match CRD
or change CRD/marker to indicate minimum 2).
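A minimal sketch of that webhook adjustment, assuming the validator collects warnings in a slice; receiver and field names follow the prompt above but are otherwise illustrative:

func (v *validator) validateInformers() {
	informers := v.fc.Processor.Informers
	if informers == nil {
		return
	}
	// A nil Enabled pointer means "use the kubebuilder default", which is true.
	enabled := informers.Enabled == nil || *informers.Enabled
	if enabled && informers.Replicas != nil && *informers.Replicas < 2 {
		v.warnings = append(v.warnings, "informers.replicas should be at least 2 when informers are enabled, for high availability")
	}
}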
In `@helm/crds/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5154-5161: The description for the replicas field claims "a
minimum of 2 replicas is required" but the schema still allows minimum: 1;
update the CRD by either changing the schema to enforce minimum: 2 (replace
"minimum: 1" with "minimum: 2" for the replicas property) or, if you don't want
to enforce it, reword the description to say "recommended for HA" instead of
"required"; locate the replicas block (contains "replicas:", "default: 2", and
the description) and apply the chosen change so description and validation are
consistent.
- Around line 5110-5230: The CRD and status population are missing the informers
component; add an Informers *FlowCollectorComponentStatus field to the
FlowCollectorComponentsStatus struct and update populateComponentStatuses() to
handle status.FLPInformers (mirror the existing cases for
Agent/Processor/Plugin) so that when the controller calls
status.ForComponent(status.FLPInformers) and uses SetReady()/SetUnused() the
value is preserved and emitted in status.components.informers; ensure the CRD
description list is updated to include informers as well.
---
Outside diff comments:
In `@internal/controller/flp/flp_common_objects.go`:
- Around line 160-207: The k8scache ContainerPort and the "--k8scache.*" args
are being added unconditionally; change the logic to only append the k8scache
port (the corev1.ContainerPort with Name "k8scache") and the
"--k8scache.port/--k8scache.address" entries in the container.Args for the
constants.FLPName container when the informers feature is enabled (i.e., gate on
the informers.enabled flag in advancedConfig or the equivalent field). Also
ensure any call to validatePortConflicts is only invoked when informers are
enabled so we don’t validate 9090 unless the informers are on. Locate these
changes around where ports are appended (k8scache port block), where
container.Args are built (k8scache args), and where validatePortConflicts is
called and wrap them in the same informers-enabled conditional.
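A rough sketch of that gating; `informersEnabled` is a placeholder for however the flag is resolved, and the `--k8scache.address` value is an assumption rather than something taken from the PR:

```go
// Sketch: only expose the k8s cache gRPC endpoint on the FLP container when
// the informers feature is enabled.
if informersEnabled {
	// Only validate port 9090 when it is actually going to be used.
	if err := validatePortConflicts(desired, advancedConfig); err != nil {
		return err
	}
	ports = append(ports, corev1.ContainerPort{
		Name:          "k8scache",
		ContainerPort: k8scachePort,
		Protocol:      corev1.ProtocolTCP,
	})
	container.Args = append(container.Args,
		fmt.Sprintf("--k8scache.port=%d", k8scachePort),
		"--k8scache.address=0.0.0.0", // address value assumed
	)
}
```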
---
Duplicate comments:
In `@config/crd/bases/flows.netobserv.io_flowcollectors.yaml`:
- Around line 5150-5157: The replicas schema for the flp-informers deployment
currently declares minimum: 1 but the description mandates "a minimum of 2
replicas when `enabled` is `true` — fix this by either updating the schema to
enforce the constraint or rewording the description: either change the
`replicas` validation so minimum: 2 and/or add an if/then conditional that
checks the `enabled` boolean and requires replicas >= 2 when `enabled: true`, or
keep minimum: 1 but rephrase the description to a recommendation (e.g.,
"recommended minimum of 2 when `enabled` is true"); locate the `replicas` field
and the related `enabled` flag in the flowcollectors CRD to apply the chosen
fix.
---
Nitpick comments:
In `@bundle/manifests/netobserv-operator.clusterserviceversion.yaml`:
- Around line 469-486: The CSV UI descriptors should mirror existing patterns:
change the descriptor for processor.informers.enabled to use booleanSwitch (like
consolePlugin.enable) and mark the processor.informers.advanced section and its
child fields (processor.informers.advanced.batchSize,
processor.informers.advanced.processorPort,
processor.informers.advanced.resyncInterval,
processor.informers.advanced.sendTimeout,
processor.informers.advanced.updateBufferSize) as advanced/hidden; add a
fieldDependency on processor.informers.enabled:true so the advanced fields only
appear when the Informers toggle is enabled and the OpenShift form behavior
matches other components.
In `@internal/controller/flp/flp_common_objects.go`:
- Around line 98-124: The validatePortConflicts function repeats the same error
template for multiple ports; refactor it by extracting the error message
template and building a small map/slice of (portName, portValue) entries using
advancedConfig.Port, advancedConfig.HealthPort,
desired.Processor.GetMetricsPort(), and advancedConfig.ProfilePort, then iterate
and return fmt.Errorf(fmtTemplate, val, k8scachePort) when a conflict is found;
update references to the unique symbols validatePortConflicts, advancedConfig,
desired.Processor.GetMetricsPort, and k8scachePort to implement the loop-based
check and remove duplicated branches — also consider moving this spec validation
into the webhook near validateFLP so misconfigs are rejected early instead of
being swallowed by builders.
- Line 161: Promote the literal "k8scache" to a named constant by adding
k8scachePortName = "k8scache" alongside the existing k8scachePort constant, and
replace the literal Name: "k8scache" with Name: k8scachePortName (or use the
constant wherever that port name is referenced). Update references to the string
in the same file so usage is consistent with
healthPortName/prometheusPortName/profilePortName and the existing k8scachePort
symbol.
- Line 30: Replace the hardcoded k8scachePort = 9090 with a configurable
Advanced option: add a field (e.g., Advanced.K8sCachePort int) to the existing
configuration struct and supply a default of 9090 when loading/constructing the
config, then replace uses of the package-level k8scachePort symbol with reads
from that config (falling back to 9090 if unset); also update any config
parsing/CLI flags and documentation to expose the new Advanced option and keep
existing validation (or simplify validatePortConflicts if desired).
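To make the loop-based `validatePortConflicts` refactor suggested above for `flp_common_objects.go` concrete, a sketch could look like the following; the parameter types are assumptions and nil handling on the advanced-config pointers is elided:

```go
// Sketch: table-driven port conflict check against the k8s cache gRPC port.
func validatePortConflicts(desired *flowslatest.FlowCollectorSpec, advancedConfig *flowslatest.AdvancedProcessorConfig) error {
	checks := []struct {
		name string
		port int32
	}{
		{"spec.processor.advanced.port", *advancedConfig.Port},
		{"spec.processor.advanced.healthPort", *advancedConfig.HealthPort},
		{"spec.processor.metrics.server.port", desired.Processor.GetMetricsPort()},
		{"spec.processor.advanced.profilePort", *advancedConfig.ProfilePort},
	}
	for _, c := range checks {
		if c.port == k8scachePort {
			return fmt.Errorf("%s is set to %d, which conflicts with the k8s cache gRPC port %d", c.name, c.port, k8scachePort)
		}
	}
	return nil
}
```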
In `@internal/controller/flp/flp_controller_test.go`:
- Around line 168-174: Add assertions in the "With Kafka" and "Back without
Kafka" test contexts to verify the flp-informers Deployment and ServiceAccount
persist across the mono→Kafka→mono transitions: use Eventually with
k8sClient.Get and the existing flpKeyInformer key to ensure Get(ctx,
flpKeyInformer, &appsv1.Deployment{}) and Get(ctx, flpKeyInformer,
&corev1.ServiceAccount{}) both succeed (similar to the existing
ClusterRoleBinding check using rb3), so the shared informer Deployment/SA is not
accidentally torn down when switching models.
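In the Ginkgo/Gomega style the controller suite already uses, the extra checks could look roughly like this; the `timeout`/`interval` constants and the exact spec wording are assumed:

```go
// Sketch: assert that the shared flp-informers Deployment and ServiceAccount
// survive the monolith -> Kafka -> monolith transitions.
It("should keep the flp-informers Deployment and ServiceAccount", func() {
	Eventually(func() error {
		return k8sClient.Get(ctx, flpKeyInformer, &appsv1.Deployment{})
	}, timeout, interval).Should(Succeed())
	Eventually(func() error {
		return k8sClient.Get(ctx, flpKeyInformer, &corev1.ServiceAccount{})
	}, timeout, interval).Should(Succeed())
})
```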
In `@internal/controller/flp/flp_informer_reconciler.go`:
- Around line 66-70: Replace the manual nil-check that initializes the local
variable enabled from desired.Spec.Processor.Informers.Enabled with a single
ptr.Deref call: set enabled :=
ptr.Deref(desired.Spec.Processor.Informers.Enabled, true) (or the equivalent
expression) so the default true is used when the pointer is nil; add the import
"k8s.io/utils/ptr". Update the block that references
desired.Spec.Processor.Informers.Enabled and remove the explicit if nil { ... }
branch.
- Around line 101-108: reconcileServiceAccount in informerReconciler currently
only creates the ServiceAccount via builder.serviceAccount() when
r.Managed.Exists(r.serviceAccount) is false, so
label/annotation/imagePullSecrets drift won't be corrected; update
reconcileServiceAccount to fetch the existing SA, compare labels, annotations
and imagePullSecrets with builder.serviceAccount() and perform a patch/update
(using r.Update or a strategic merge/patch method) when differences exist while
preserving managed fields, or explicitly document/guard that this resource is
intentionally immutable; reference reconcileServiceAccount, informerReconciler,
builder.serviceAccount(), r.Managed.Exists and r.CreateOwned when making the
change.
In `@internal/controller/flp/flp_test.go`:
- Around line 814-850: TestPortConflictValidation only exercises
validatePortConflicts via newMonolithBuilder; add a mirror set of assertions
that call newTransfoBuilder to ensure validatePortConflicts is enforced there
too (or refactor TestPortConflictValidation into a table-driven loop that runs
each test case against both newMonolithBuilder and newTransfoBuilder). Keep the
same mutation functions and error substring checks (use the existing cases for
Processor.Advanced.Port, HealthPort, Metrics.Server.Port, ProfilePort and the
no-conflict case) and assert the same error messages when invoking
newTransfoBuilder (or both builders) so the coverage is identical for
validatePortConflicts, referencing the existing TestPortConflictValidation,
newMonolithBuilder, newTransfoBuilder and validatePortConflicts symbols.
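One possible shape for the table-driven variant; `buildMonolith`, `buildTransfo`, and `getConfig` are placeholders for whatever wrappers the existing test file uses around `newMonolithBuilder`/`newTransfoBuilder` and its default-spec helper:

```go
// Sketch: run every port-conflict case against both builders.
func TestPortConflictValidation(t *testing.T) {
	builders := map[string]func(spec *flowslatest.FlowCollectorSpec) error{
		"monolith":    buildMonolith, // placeholder wrapper around newMonolithBuilder
		"transformer": buildTransfo,  // placeholder wrapper around newTransfoBuilder
	}
	cases := []struct {
		name    string
		mutate  func(spec *flowslatest.FlowCollectorSpec)
		wantErr string
	}{
		{"processor port conflict", func(s *flowslatest.FlowCollectorSpec) { s.Processor.Advanced.Port = ptr.To(int32(9090)) }, "conflict"},
		{"no conflict", func(s *flowslatest.FlowCollectorSpec) {}, ""},
		// ... health, metrics and profile port cases as in the existing test
	}
	for _, c := range cases {
		for builderName, build := range builders {
			t.Run(c.name+"/"+builderName, func(t *testing.T) {
				spec := getConfig() // placeholder for the suite's default-spec helper
				c.mutate(&spec)
				err := build(&spec)
				if c.wantErr == "" {
					assert.NoError(t, err)
				} else {
					assert.ErrorContains(t, err, c.wantErr)
				}
			})
		}
	}
}
```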
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 46903d42-c9f4-4324-9404-64e76dab93f5
⛔ Files ignored due to path filters (1)
api/flowcollector/v1beta2/zz_generated.deepcopy.gois excluded by!**/zz_generated.deepcopy.go
📒 Files selected for processing (26)
api/flowcollector/v1beta2/flowcollector_types.goapi/flowcollector/v1beta2/flowcollector_validation_webhook.gobundle/manifests/flows.netobserv.io_flowcollectors.yamlbundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yamlbundle/manifests/netobserv-operator.clusterserviceversion.yamlconfig/crd/bases/flows.netobserv.io_flowcollectors.yamlconfig/rbac/component_roles.yamlconfig/rbac/role.yamlconfig/samples/flows_v1beta2_flowcollector.yamldocs/FlowCollector.mdhelm/crds/flows.netobserv.io_flowcollectors.yamlhelm/templates/clusterrole.yamlhelm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yamlhelm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yamlinternal/controller/flp/flp_common_objects.gointernal/controller/flp/flp_controller.gointernal/controller/flp/flp_controller_test.gointernal/controller/flp/flp_informer_builder.gointernal/controller/flp/flp_informer_reconciler.gointernal/controller/flp/flp_monolith_objects.gointernal/controller/flp/flp_monolith_reconciler.gointernal/controller/flp/flp_test.gointernal/controller/flp/flp_transfo_objects.gointernal/controller/flp/flp_transfo_reconciler.gointernal/pkg/manager/manager.gointernal/pkg/manager/status/status_manager.go
💤 Files with no reviewable changes (2)
- internal/controller/flp/flp_monolith_reconciler.go
- internal/controller/flp/flp_transfo_reconciler.go
✅ Files skipped from review due to trivial changes (11)
- internal/pkg/manager/manager.go
- internal/controller/flp/flp_transfo_objects.go
- bundle/manifests/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- internal/pkg/manager/status/status_manager.go
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrole.yaml
- config/samples/flows_v1beta2_flowcollector.yaml
- helm/templates/netobserv-informers_rbac.authorization.k8s.io_v1_clusterrolebinding.yaml
- helm/templates/clusterrole.yaml
- internal/controller/flp/flp_informer_builder.go
- internal/controller/flp/flp_controller.go
- docs/FlowCollector.md
🚧 Files skipped from review as they are similar to previous changes (5)
- config/rbac/role.yaml
- internal/controller/flp/flp_monolith_objects.go
- config/rbac/component_roles.yaml
- api/flowcollector/v1beta2/flowcollector_types.go
- bundle/manifests/flows.netobserv.io_flowcollectors.yaml
```go
func (v *validator) validateInformers() {
	if v.fc.Processor.Informers == nil {
		return
	}

	// Check if enabled
	enabled := v.fc.Processor.Informers.Enabled != nil && *v.fc.Processor.Informers.Enabled

	if enabled {
		// When enabled, replicas must be at least 2 for high availability
		replicas := int32(2) // default
		if v.fc.Processor.Informers.Replicas != nil {
			replicas = *v.fc.Processor.Informers.Replicas
		}
		if replicas < 2 {
			v.errors = append(
				v.errors,
				fmt.Errorf("spec.processor.informers.replicas must be at least 2 when informers are enabled (got %d). Centralized informers require high availability to avoid losing the entire flow collection pipeline in case of failure", replicas),
			)
		}
	}
}
```
Default-enabled nil check may skip validation.
Enabled has kubebuilder default true, but this code treats nil as disabled. If the webhook runs before CRD defaulting (or on an object that bypassed defaulting), a user with informers: {} sneaks past the replica check while effectively running enabled. Consider treating nil as true to match the declared default:
🔧 Suggested tweak
```diff
-	// Check if enabled
-	enabled := v.fc.Processor.Informers.Enabled != nil && *v.fc.Processor.Informers.Enabled
+	// Check if enabled (defaults to true per CRD)
+	enabled := v.fc.Processor.Informers.Enabled == nil || *v.fc.Processor.Informers.Enabled
```

Also note: the CRD declares `+kubebuilder:validation:Minimum=1` on Replicas, but this webhook enforces >=2 when enabled. Worth aligning the docstring on Replicas or the CRD marker so users aren't surprised by a "1 is valid here, but not really" situation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@api/flowcollector/v1beta2/flowcollector_validation_webhook.go` around lines
460 - 481, In validateInformers(), treat a nil Enabled pointer as true (respect
the kubebuilder default) by changing the enabled calculation to consider nil as
enabled instead of disabled (refer to v.fc.Processor.Informers.Enabled and the
validateInformers function); keep the replicas >=2 enforcement when enabled but
update the error message or coordinate with the CRD (Replicas'
+kubebuilder:validation:Minimum) so documentation/CRD marker and the webhook
behavior are aligned (either relax webhook to match CRD or change CRD/marker to
indicate minimum 2).
```yaml
informers:
  description: |-
    `informers` configuration for centralized Kubernetes informers that push cache updates to flowlogs-pipeline processors.
    This reduces load on the Kubernetes API server by having a single component (flp-informers) query the API instead of N FLP processors.
    When enabled, a dedicated `flp-informers` deployment is created that watches Kubernetes resources and pushes updates via gRPC.
  properties:
    advanced:
      description: '`advanced` allows setting some technical parameters of the informers component.'
      properties:
        batchSize:
          default: 100
          description: '`batchSize` defines the maximum number of cache entries to send in a single update batch.'
          minimum: 1
          type: integer
        processorPort:
          default: 9090
          description: '`processorPort` defines the gRPC port where flowlogs-pipeline processors listen for k8s cache updates.'
          format: int32
          maximum: 65535
          minimum: 1
          type: integer
        resyncInterval:
          default: 60
          description: '`resyncInterval` defines the interval in seconds to rediscover processors and sync state.'
          minimum: 1
          type: integer
        sendTimeout:
          default: 10
          description: '`sendTimeout` defines the timeout in seconds for sending updates to processors.'
          minimum: 1
          type: integer
        updateBufferSize:
          default: 100
          description: '`updateBufferSize` defines the size of the internal update channel buffer.'
          minimum: 1
          type: integer
      type: object
    enabled:
      default: true
      description: |-
        `enabled` controls whether to deploy centralized Kubernetes informers.
        When `true`, a dedicated `flp-informers` deployment watches K8s resources and pushes cache updates via gRPC to FLP processors.
        When `false`, each FLP processor uses local informers (previous behavior).
      type: boolean
    replicas:
      default: 2
      description: |-
        `replicas` defines the number of replicas for the flp-informers deployment.
        For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
      format: int32
      minimum: 1
      type: integer
    resources:
      default:
        limits:
          cpu: 200m
          memory: 256Mi
        requests:
          cpu: 50m
          memory: 128Mi
      description: |-
        `resources` are the compute resources required by the informers container.
        For more information, see https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
      properties:
        claims:
          description: |-
            Claims lists the names of resources, defined in spec.resourceClaims,
            that are used by this container.

            This field depends on the
            DynamicResourceAllocation feature gate.

            This field is immutable. It can only be set for containers.
          items:
            description: ResourceClaim references one entry in PodSpec.ResourceClaims.
            properties:
              name:
                description: |-
                  Name must match the name of one entry in pod.spec.resourceClaims of
                  the Pod where this field is used. It makes that resource available
                  inside a container.
                type: string
              request:
                description: |-
                  Request is the name chosen for a request in the referenced claim.
                  If empty, everything from the claim is made available, otherwise
                  only the result of this request.
                type: string
            required:
            - name
            type: object
          type: array
          x-kubernetes-list-map-keys:
          - name
          x-kubernetes-list-type: map
        limits:
          additionalProperties:
            anyOf:
            - type: integer
            - type: string
            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
            x-kubernetes-int-or-string: true
          description: |-
            Limits describes the maximum amount of compute resources allowed.
            More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
          type: object
        requests:
          additionalProperties:
            anyOf:
            - type: integer
            - type: string
            pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
            x-kubernetes-int-or-string: true
          description: |-
            Requests describes the minimum amount of compute resources required.
            If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
            otherwise to an implementation-defined value. Requests cannot exceed Limits.
            More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
          type: object
      type: object
  type: object
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== status.components fields in CRDs =="
rg -n -C2 '^\s{20}(agent|processor|plugin|informers):$' \
helm/crds/flows.netobserv.io_flowcollectors.yaml \
config/crd/bases/flows.netobserv.io_flowcollectors.yaml \
bundle/manifests/flows.netobserv.io_flowcollectors.yaml || true
echo
echo "== controller / API references to informer component status =="
rg -n -C3 'status\.components|components.*informers|informers.*state|Informer.*State|Informers.*State' \
api/flowcollector/v1beta2 \
  internal/controller || true
Repository: netobserv/netobserv-operator
Length of output: 5798
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Go status types for components =="
rg -n -B2 -A10 'type.*ComponentsStatus|type.*Status.*components' api/flowcollector/v1beta2/
echo
echo "== Check for Informers in status types =="
rg -n 'type.*Informers.*Status|Informers.*\*.*Status|StatusInformers' api/flowcollector/v1beta2/
echo
echo "== Controller writes to status.components =="
rg -n -B2 -A2 '\.Status\.Components' internal/controller/ | head -50
echo
echo "== Check if controller references informers in status =="
rg -n 'informers.*status|status.*informers|components\.Informers|Informers.*=|status.*Informers' internal/controller/ || true
Repository: netobserv/netobserv-operator
Length of output: 1355
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Full FlowCollectorComponentsStatus struct =="
sed -n '1736,1760p' api/flowcollector/v1beta2/flowcollector_types.go
echo
echo "== Check controller reconciler for informers status writes =="
rg -n 'components\.' internal/controller/ | grep -i informer || echo "No informers status writes found"
echo
echo "== Check if informers deployment is managed =="
rg -n -B2 -A2 'flp-informers|informersDeployment' internal/controller/ | head -40
echo
echo "== Look for any recent TODO/FIXME about informers status =="
rg -n 'TODO|FIXME' internal/controller/ | grep -i informer || echo "No TODOs found"
Repository: netobserv/netobserv-operator
Length of output: 3528
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check CRD description of status.components =="
rg -n -A2 'components.*reports the status' config/crd/bases/flows.netobserv.io_flowcollectors.yaml helm/crds/flows.netobserv.io_flowcollectors.yaml
echo
echo "== Verify informers deployment is actually created by reconciler =="
rg -n 'flp_informer_reconciler\|informer.*reconciler' internal/controller/ | head -20
Repository: netobserv/netobserv-operator
Length of output: 899
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for any status update code in flp_informer files =="
find internal/controller -name '*informer*' -type f
echo
echo "== Look at flp_informer_reconciler.go for status updates =="
cat internal/controller/flp/flp_informer_reconciler.go | head -100
echo
echo "== Check if status.components is set anywhere for informers =="
rg -n 'Informers.*:=|\.Informers\s*=' internal/controller/ || echo "No Informers status assignments"
Repository: netobserv/netobserv-operator
Length of output: 3585
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check what r.Status.SetReady() actually does =="
rg -n -B5 -A10 'func.*SetReady|SetUnused' internal/pkg/manager/status/ | head -50
echo
echo "== Look for where Components status is actually updated =="
rg -n 'Components\.Agent|Components\.Processor|Components\.Plugin' internal/controller/ | head -20
Repository: netobserv/netobserv-operator
Length of output: 4515
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for status manager component types =="
rg -n 'EBPFAgents|FLPProcessor|ConsolePlugin|Informers' internal/pkg/manager/status/status_manager.go | head -30
echo
echo "== Check status manager initialization =="
cat internal/pkg/manager/status/status_manager.go | head -100
Repository: netobserv/netobserv-operator
Length of output: 3257
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check populateComponentStatuses function for informers =="
rg -n -A50 'func.*populateComponentStatuses' internal/pkg/manager/status/status_manager.go
echo
echo "== Check ForComponent to see if informers can be accessed =="
rg -n -B2 -A10 'func.*ForComponent' internal/pkg/manager/status/status_manager.go
Repository: netobserv/netobserv-operator
Length of output: 2757
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check what status the informer reconciler is using =="
rg -n 'ForComponent\|getStatus' internal/controller/flp/flp_informer_reconciler.go
echo
echo "== Check the full informer reconciler status tracking =="
cat -n internal/controller/flp/flp_informer_reconciler.go | grep -A5 -B5 'Status'
echo
echo "== Verify FLPInformers is actually used anywhere =="
rg -n 'FLPInformers' internal/controller/
Repository: netobserv/netobserv-operator
Length of output: 128
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Check Instance type in informer reconciler =="
grep -n 'Instance\|Status' internal/controller/flp/flp_informer_reconciler.go | head -20
echo
echo "== Look at reconcilers.Instance to see Status field =="
rg -n -B2 -A15 'type Instance struct' internal/controller/reconcilers/
echo
echo "== Search for where FLPInformers is used =="
rg -n 'FLPInformers' internal/
Repository: netobserv/netobserv-operator
Length of output: 2746
Add status.components.informers field to track the flp-informers component.
The controller tracks informer status via status.ForComponent(status.FLPInformers) and calls SetReady()/SetUnused() on it, but populateComponentStatuses() has no case for FLPInformers, so the status is silently dropped. The CRD lacks a corresponding Components.Informers field entirely, and the description still lists only (agent, processor, plugin). Update FlowCollectorComponentsStatus to include Informers *FlowCollectorComponentStatus and add the matching case in populateComponentStatuses().
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@helm/crds/flows.netobserv.io_flowcollectors.yaml` around lines 5110 - 5230,
The CRD and status population are missing the informers component; add an
Informers *FlowCollectorComponentStatus field to the
FlowCollectorComponentsStatus struct and update populateComponentStatuses() to
handle status.FLPInformers (mirror the existing cases for
Agent/Processor/Plugin) so that when the controller calls
status.ForComponent(status.FLPInformers) and uses SetReady()/SetUnused() the
value is preserved and emitted in status.components.informers; ensure the CRD
description list is updated to include informers as well.
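A rough sketch of the two touch points; apart from `FLPInformers` and the `agent`/`processor`/`plugin` field names visible in the CRD, the identifiers and the helper signature below are guesses for illustration only:

```go
// api/flowcollector/v1beta2: add the informers entry to the components status.
type FlowCollectorComponentsStatus struct {
	Agent     *FlowCollectorComponentStatus `json:"agent,omitempty"`
	Processor *FlowCollectorComponentStatus `json:"processor,omitempty"`
	Plugin    *FlowCollectorComponentStatus `json:"plugin,omitempty"`
	Informers *FlowCollectorComponentStatus `json:"informers,omitempty"` // new field
}

// internal/pkg/manager/status: mirror the existing mapping so the value set via
// status.ForComponent(status.FLPInformers) is no longer dropped.
func populateComponentStatuses(perComponent map[ComponentName]*FlowCollectorComponentStatus, out *FlowCollectorComponentsStatus) {
	out.Agent = perComponent[EBPFAgents]
	out.Processor = perComponent[FLPParent]
	out.Plugin = perComponent[ConsolePlugin]
	out.Informers = perComponent[FLPInformers] // new case for the informers component
}
```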
```yaml
replicas:
  default: 2
  description: |-
    `replicas` defines the number of replicas for the flp-informers deployment.
    For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
  format: int32
  minimum: 1
  type: integer
```
The HA requirement is stronger than the schema.
Line 5158 says 2 replicas are required, but the schema still allows replicas: 1. Either enforce that with validation or reword this to “recommended for HA”.
Suggested doc-only fix
```diff
 replicas:
   default: 2
   description: |-
     `replicas` defines the number of replicas for the flp-informers deployment.
-    For high availability, a minimum of 2 replicas is required when `enabled` is `true`.
+    Set this to `2` or more for high availability.
   format: int32
   minimum: 1
   type: integer
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
replicas:
  default: 2
  description: |-
    `replicas` defines the number of replicas for the flp-informers deployment.
    Set this to `2` or more for high availability.
  format: int32
  minimum: 1
  type: integer
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@helm/crds/flows.netobserv.io_flowcollectors.yaml` around lines 5154 - 5161,
The description for the replicas field claims "a minimum of 2 replicas is
required" but the schema still allows minimum: 1; update the CRD by either
changing the schema to enforce minimum: 2 (replace "minimum: 1" with "minimum:
2" for the replicas property) or, if you don't want to enforce it, reword the
description to say "recommended for HA" instead of "required"; locate the
replicas block (contains "replicas:", "default: 2", and the description) and
apply the chosen change so description and validation are consistent.
|
@leandroberetta: The following test failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Force-pushed from 0d13eee to d1de54c (Compare)
Description
Adds a dedicated flp-informers deployment (and RBAC) managed by the NetObserv operator, and wires flowlogs-pipeline processors to expose the k8scache gRPC endpoint so centralized Kubernetes informers can push cache updates to FLP.
Dependencies
netobserv/flowlogs-pipeline#1221
Checklist
Summary by CodeRabbit
New Features
Configuration Changes
Chores
Tests