
Conversation

@vbotbuildovich
Contributor

This PR auto-updates the acceptance tests we use as examples in our Kubernetes docs.

@vbotbuildovich vbotbuildovich requested a review from a team as a code owner November 5, 2025 18:47
@netlify

netlify bot commented Nov 5, 2025

Deploy Preview for redpanda-docs-preview ready!

Name Link
🔨 Latest commit 90f5d65
🔍 Latest deploy log https://app.netlify.com/projects/redpanda-docs-preview/deploys/690b9bcc4caff90008755c88
😎 Deploy Preview https://deploy-preview-1445--redpanda-docs-preview.netlify.app

@coderabbitai
Contributor

coderabbitai bot commented Nov 5, 2025

📝 Walkthrough

This change introduces a comprehensive suite of Kubernetes-based BDD test features for Redpanda cluster management. Eight new feature files are added covering cluster administration (admin port updates, rack awareness, scaling, decommissioning), operator workflows (Helm chart deployment, operator upgrades, Helm-to-Operator migration), CRD management (Console, Role, ShadowLink), metrics authentication, and node failure tolerance. Three existing feature files are updated with variant tags and authentication configuration. All new features include embedded YAML manifests and Gherkin-style test steps that validate cluster state, perform kubectl operations, and verify expected outputs.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • All changes are test specification files (Gherkin/BDD format) with consistent, repetitive structure across new files
  • No modifications to application code, business logic, or public APIs
  • Primary review concerns are semantic correctness of test scenarios and embedded manifests
  • Tag modifications in existing files are straightforward and isolated

Areas warranting attention during review:

  • Verification that embedded YAML manifests in each scenario are syntactically correct and represent realistic test configurations
  • Consistency of assertions and expected behaviors across similar scenarios (e.g., stability checks, port validations)
  • Correctness of cloud provider skip tags (@skip for gke, aks, eks) where specified
  • The migration scenario logic, which involves StatefulSet generation tracking and OwnerReference verification (see the sketch after this list)
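
On the OwnerReference point, here is a minimal sketch of the metadata the adopted StatefulSet is expected to carry once the Redpanda custom resource takes ownership. The API group/version and the resource name "basic" are assumptions for illustration, not values taken from the diff (the uid field is omitted for brevity):

    # Expected shape of the migrated StatefulSet's ownerReferences (illustrative)
    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: basic
      ownerReferences:
        - apiVersion: cluster.redpanda.com/v1alpha2  # assumed operator CRD group/version
          kind: Redpanda
          name: basic
          controller: true
          blockOwnerDeletion: true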


Suggested reviewers

  • chrisseto
  • sago2k8
  • andrewstucki

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

  • Description check (⚠️ Warning): The description is minimal but explains the PR purpose. However, it lacks required template sections such as the JIRA ticket, review deadline, page previews, and checkboxes. Resolution: complete the template by adding the JIRA ticket reference, review deadline, page previews of the updated Kubernetes documentation, and checking the appropriate change-type boxes.

✅ Passed checks (1 passed)

  • Title check (✅ Passed): The title clearly summarizes the main change: updating Kubernetes acceptance tests for documentation examples.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modules/manage/examples/kubernetes/user-crds.feature (1)

54-93: Authentication config contradicts "authorization-only" scenario intent.

The scenario is named "Manage authorization-only users" and the documentation comment states "This example assumes that you already have a user called 'travis' in your cluster." However, lines 74-80 add authentication configuration, which contradicts the "authorization-only" intent.

Additionally, there's a mechanism mismatch:

  • Line 58: Pre-existing user "travis" is created with SCRAM-SHA-256
  • Line 75: New authentication config specifies scram-sha-512
  • Line 93: Final assertion expects authentication with SCRAM-SHA-256

This could cause the test to fail or confuse users about the intended use case.

Consider one of these solutions:

Option 1: Remove the authentication block to keep this truly authorization-only:

       clusterRef:
         name: sasl
-      authentication:
-        type: scram-sha-512
-        password:
-          valueFrom:
-            secretKeyRef:
-              name: travis-password
-              key: password
       authorization:
         acls:
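
For reference, under Option 1 the resulting authorization-only User would reduce to a shape like this sketch. The API group/version and the specific ACL rule are illustrative assumptions, not values taken from the diff; verify field names against the operator's User CRD schema:

    apiVersion: cluster.redpanda.com/v1alpha2  # assumed CRD group/version
    kind: User
    metadata:
      name: travis
    spec:
      cluster:
        clusterRef:
          name: sasl
      # No authentication block: the pre-existing "travis" credentials are reused.
      authorization:
        acls:
          - type: allow              # illustrative rule
            resource:
              type: topic
              name: example-topic
            operations: [Read, Write]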

Option 2: If authentication must be included, align the mechanism and update the scenario name/description to reflect that this manages both authentication and authorization:

-  Scenario: Manage authorization-only users
+  Scenario: Manage users with authentication and authorization

And change line 75 to match the pre-existing mechanism:

-        type: scram-sha-512
+        type: scram-sha-256
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between eaabee5 and 90f5d65.

📒 Files selected for processing (13)
  • modules/manage/examples/kubernetes/cluster.feature (1 hunks)
  • modules/manage/examples/kubernetes/console.feature (1 hunks)
  • modules/manage/examples/kubernetes/decommissioning.feature (1 hunks)
  • modules/manage/examples/kubernetes/helm-chart.feature (1 hunks)
  • modules/manage/examples/kubernetes/metrics.feature (1 hunks)
  • modules/manage/examples/kubernetes/migration.feature (1 hunks)
  • modules/manage/examples/kubernetes/operator-upgrades.feature (1 hunks)
  • modules/manage/examples/kubernetes/role-crds.feature (1 hunks)
  • modules/manage/examples/kubernetes/scaling.feature (1 hunks)
  • modules/manage/examples/kubernetes/schema-crds.feature (1 hunks)
  • modules/manage/examples/kubernetes/shadow-links.feature (1 hunks)
  • modules/manage/examples/kubernetes/topic-crds.feature (2 hunks)
  • modules/manage/examples/kubernetes/user-crds.feature (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Redirect rules - redpanda-docs-preview
  • GitHub Check: Header rules - redpanda-docs-preview
  • GitHub Check: Pages changed - redpanda-docs-preview
🔇 Additional comments (12)
modules/manage/examples/kubernetes/schema-crds.feature (1)

1-1: LGTM! Tagging update aligns with PR-wide test categorization.

The addition of @variant:vectorized is consistent with similar tagging updates across other CRD feature files in this PR.

modules/manage/examples/kubernetes/topic-crds.feature (1)

1-1: LGTM! Consistent tagging update.

The @variant:vectorized tag addition aligns with the broader test categorization changes in this PR.

modules/manage/examples/kubernetes/decommissioning.feature (1)

1-13: LGTM! Well-structured decommissioning scenario.

The test flow is logical and includes helpful documentation about the required components (decommissioner and PVC unbinder) for the test to pass.

modules/manage/examples/kubernetes/user-crds.feature (1)

1-1: LGTM! Consistent tagging update.

The @variant:vectorized tag addition aligns with similar updates across CRD feature files in this PR.

modules/manage/examples/kubernetes/helm-chart.feature (1)

1-39: LGTM! Comprehensive node failure tolerance test.

The scenario effectively validates Helm chart behavior during node failures, including:

  • Pod state transitions (Pending → Running)
  • Broker decommissioning (broker 2 removed, broker 3 added)
  • Proper verification of both active and decommissioned broker lists

The short timeout values (15s) are appropriate for accelerating test execution.

modules/manage/examples/kubernetes/shadow-links.feature (1)

1-51: LGTM! Well-structured ShadowLink test.

The scenario provides good coverage of ShadowLink CRD functionality:

  • Proper setup of source and shadow clusters with the required feature flag
  • Trace logging enabled for easier debugging (line 9)
  • Clear test flow: create topic → establish link → verify replication
  • Appropriate sync interval (2s) for testing
modules/manage/examples/kubernetes/cluster.feature (3)

5-46: LGTM! Admin port update scenario is well-structured.

The scenario effectively demonstrates:

  • Initial cluster deployment with custom admin port (9645)
  • Verification of port configuration and rpk setup
  • Port update (9645 → 9640)
  • Re-verification after the change

This provides good coverage of the admin port configuration workflow.
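
For orientation, the admin listener override being exercised reduces to something like this sketch (the resource name and exact nesting are assumptions based on the chart's values layout, not copied from the feature file):

    apiVersion: cluster.redpanda.com/v1alpha2
    kind: Redpanda
    metadata:
      name: admin-port        # illustrative name
    spec:
      clusterSpec:
        listeners:
          admin:
            port: 9645        # updated to 9640 later in the scenario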


49-75: LGTM! Rack awareness scenario with helpful explanatory comment.

The scenario clearly demonstrates rack awareness configuration, and the comment (lines 52-54) appropriately explains that kubernetes.io/os is used for testing purposes rather than as a production recommendation.
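
A minimal sketch of the rack awareness block under review (key names follow the Helm chart's rackAwareness values; verify against the feature file):

    spec:
      clusterSpec:
        rackAwareness:
          enabled: true
          # kubernetes.io/os is used purely for testing because every node carries it;
          # a production cluster would use a topology label such as
          # topology.kubernetes.io/zone.
          nodeAnnotation: kubernetes.io/os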


1-3: Review comment is based on incorrect characterization of the file.

This is a Gherkin test file (.feature), not user-facing documentation. The file contains executable test scenarios with Kubernetes manifests and step definitions, located in the examples/kubernetes/ directory alongside 12 other similar test files.

The TODO comment about evaluating whether to delete these tests is internal uncertainty about test maintenance value—not a blocker for inclusion in examples. Test/example code with maintenance questions is a normal part of any codebase and doesn't require documentation review approval.

The reviewer's concern about inclusion in "user-facing documentation" does not apply to test fixtures in an examples directory.

Likely an incorrect or invalid review comment.

modules/manage/examples/kubernetes/role-crds.feature (1)

1-110: LGTM! Well-structured Role CRD test scenarios.

The feature file demonstrates comprehensive testing of Role CRD lifecycle including principal management, authorization ACLs, and authorization-only role handling. The YAML manifests are properly formatted and the test flow is logical.

modules/manage/examples/kubernetes/operator-upgrades.feature (1)

1-40: LGTM! Clear operator upgrade test scenario.

The scenario properly tests the upgrade path from v25.1.3 to the dev version, with appropriate status checks for both old and new operator capabilities. The use of localhost image references is appropriate for local testing.
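
For context, the upgrade step in a scenario like this typically boils down to a Helm values change along these lines (the repository and tag shown are placeholders, not the PR's actual values):

    # Helm values sketch for moving the operator to a locally built image
    image:
      repository: localhost/redpanda-operator   # placeholder local registry path
      tag: dev                                   # placeholder dev build tag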

modules/manage/examples/kubernetes/migration.feature (1)

1-38: LGTM! Well-designed Helm-to-Operator migration test.

The scenario comprehensively tests the migration path from Helm chart to Redpanda custom resource, including validation of OwnerReference adoption and StatefulSet generation tracking. The use of fullnameOverride to maintain resource naming consistency across the migration is appropriate.
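
A sketch of how fullnameOverride keeps resource names stable across the migration: the adopting Redpanda resource carries the same override as the original Helm release (the name "basic" is illustrative, not taken from the diff):

    apiVersion: cluster.redpanda.com/v1alpha2
    kind: Redpanda
    metadata:
      name: basic
    spec:
      clusterSpec:
        fullnameOverride: basic   # matches the Helm release's fullnameOverride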

staticConfiguration:
  kafka:
    brokers:
      - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093

⚠️ Potential issue | 🔴 Critical

Fix DNS syntax error in Kafka broker address.

The DNS name has an extra period before the port: basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093. The correct format should have the colon directly after local without an additional period.

Apply this fix:

             brokers:
-            - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093
+            - basic-0.basic.${NAMESPACE}.svc.cluster.local:9093
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093
+            - basic-0.basic.${NAMESPACE}.svc.cluster.local:9093
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/console.feature around line 47, the Kafka
broker DNS contains an extra period before the port causing an invalid address;
remove the stray dot so the host ends with svc.cluster.local:9093 (i.e., replace
"basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093" with
"basic-0.basic.${NAMESPACE}.svc.cluster.local:9093").

key: "ca.crt"
admin:
urls:
- https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644

⚠️ Potential issue | 🔴 Critical

Fix DNS syntax error in Admin API URL.

The DNS name has an extra period before the port: https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644. The correct format should have the colon directly after local without an additional period.

Apply this fix:

             urls:
-            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644
+            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:9644
📝 Committable suggestion


Suggested change
             urls:
-            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644
+            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:9644
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/console.feature around line 54, the Admin
API URL contains an extra period before the port
("https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644"); remove the stray
dot so the host ends with ".local" and the port follows immediately, i.e. change
it to "https://basic-0.basic.${NAMESPACE}.svc.cluster.local:9644".

key: "ca.crt"
schemaRegistry:
urls:
- https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081

⚠️ Potential issue | 🔴 Critical

Fix DNS syntax error in Schema Registry URL.

The DNS name has an extra period before the port: https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081. The correct format should have the colon directly after local without an additional period.

Apply this fix:

             urls:
-            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081
+            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:8081

Note: Verify the correct Schema Registry port. Typically it's 8081, but ensure it matches your Redpanda configuration.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/console.feature around line 61, the Schema
Registry URL contains an extra dot before the port ("...cluster.local.:8081");
remove the stray period so the host ends with "cluster.local" and the port
follows with a colon (e.g.,
"https://basic-0.basic.${NAMESPACE}.svc.cluster.local:8081"); verify and correct
the port number if your Redpanda configuration uses a different port.

Comment on lines +16 to +23
When I apply Kubernetes manifest:
  """
  apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: testing
  """
And "testing" service account has bounded "redpanda-operator-.*-metrics-reader" regexp cluster role name

⚠️ Potential issue | 🔴 Critical

Missing ClusterRoleBinding in the manifest.

The manifest creates a ServiceAccount named "testing" but line 23 expects this account to be bound to a ClusterRole matching the regex redpanda-operator-.*-metrics-reader. However, no ClusterRoleBinding resource is defined in the manifest to establish this binding. This will cause the test to fail unless the binding is created elsewhere.

Add a ClusterRoleBinding to complete the authorization setup:

 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: testing
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: testing-metrics-reader
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: redpanda-operator-<namespace>-metrics-reader  # Replace <namespace> with actual namespace
+subjects:
+- kind: ServiceAccount
+  name: testing
+  namespace: <namespace>  # Replace with actual namespace

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/metrics.feature around lines 16 to 23, the
manifest creates a ServiceAccount "testing" but lacks a ClusterRoleBinding to
bind it to a ClusterRole matching the regex redpanda-operator-.*-metrics-reader;
add a ClusterRoleBinding resource in the manifest that references the
ServiceAccount "testing" in the appropriate namespace (or cluster-wide subject)
and binds it to the ClusterRole (or to a ClusterRole whose name matches that
regex) so the ServiceAccount has the required cluster-level permissions for
metrics reading.

Comment on lines +11 to +12
Given I create a basic cluster "scaleup" with 1 nodes
And cluster "scaleup" is stable with 1 nodes

⚠️ Potential issue | 🟡 Minor

Fix grammatical inconsistency.

"1 nodes" should be "1 node" (singular) for grammatical correctness.

Apply this diff:

-    Given I create a basic cluster "scaleup" with 1 nodes
-    And cluster "scaleup" is stable with 1 nodes
+    Given I create a basic cluster "scaleup" with 1 node
+    And cluster "scaleup" is stable with 1 node
📝 Committable suggestion


Suggested change
-    Given I create a basic cluster "scaleup" with 1 nodes
-    And cluster "scaleup" is stable with 1 nodes
+    Given I create a basic cluster "scaleup" with 1 node
+    And cluster "scaleup" is stable with 1 node
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/scaling.feature around lines 11 to 12, the
text uses "1 nodes" (plural) which is grammatically incorrect for a singular
count; update both occurrences to read "1 node" to match singular grammar and
keep consistency in the scenario steps.

