auto-docs: Update K8s acceptance tests #1445
Conversation
✅ Deploy Preview for redpanda-docs-preview ready!
📝 Walkthrough

This change introduces a comprehensive suite of Kubernetes-based BDD test features for Redpanda cluster management. Eight new feature files are added covering cluster administration (admin port updates, rack awareness, scaling, decommissioning), operator workflows (Helm chart deployment, operator upgrades, Helm-to-Operator migration), CRD management (Console, Role, ShadowLink), metrics authentication, and node failure tolerance. Three existing feature files are updated with variant tags and authentication configuration. All new features include embedded YAML manifests and Gherkin-style test steps that validate cluster state, perform kubectl operations, and verify expected outputs.

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
modules/manage/examples/kubernetes/user-crds.feature (1)
54-93: Authentication config contradicts "authorization-only" scenario intent. The scenario is named "Manage authorization-only users" and the documentation comment states "This example assumes that you already have a user called 'travis' in your cluster." However, lines 74-80 add authentication configuration, which contradicts the "authorization-only" intent.
Additionally, there's a mechanism mismatch:
- Line 58: the pre-existing user "travis" is created with `SCRAM-SHA-256`
- Line 75: the new authentication config specifies `scram-sha-512`
- Line 93: the final assertion expects authentication with `SCRAM-SHA-256`

This could cause the test to fail or confuse users about the intended use case.
Consider one of these solutions:
Option 1: Remove the authentication block to keep this truly authorization-only:

```diff
   clusterRef:
     name: sasl
-  authentication:
-    type: scram-sha-512
-    password:
-      valueFrom:
-        secretKeyRef:
-          name: travis-password
-          key: password
   authorization:
     acls:
```

Option 2: If authentication must be included, align the mechanism and update the scenario name/description to reflect that this manages both authentication and authorization:

```diff
- Scenario: Manage authorization-only users
+ Scenario: Manage users with authentication and authorization
```

And change line 75 to match the pre-existing mechanism:

```diff
-    type: scram-sha-512
+    type: scram-sha-256
```
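For reference, a minimal sketch of what the full user spec could look like after applying Option 2; the apiVersion, kind, and overall field layout are assumptions inferred from the fragments quoted above rather than taken from the repository:

```yaml
apiVersion: cluster.redpanda.com/v1alpha2  # assumed CRD version
kind: User
metadata:
  name: travis
spec:
  cluster:
    clusterRef:
      name: sasl  # from the excerpt above
  authentication:
    type: scram-sha-256  # aligned with the pre-existing mechanism
    password:
      valueFrom:
        secretKeyRef:
          name: travis-password
          key: password
  authorization:
    acls: []  # the scenario's ACL entries go here
```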
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (13)
- modules/manage/examples/kubernetes/cluster.feature (1 hunks)
- modules/manage/examples/kubernetes/console.feature (1 hunks)
- modules/manage/examples/kubernetes/decommissioning.feature (1 hunks)
- modules/manage/examples/kubernetes/helm-chart.feature (1 hunks)
- modules/manage/examples/kubernetes/metrics.feature (1 hunks)
- modules/manage/examples/kubernetes/migration.feature (1 hunks)
- modules/manage/examples/kubernetes/operator-upgrades.feature (1 hunks)
- modules/manage/examples/kubernetes/role-crds.feature (1 hunks)
- modules/manage/examples/kubernetes/scaling.feature (1 hunks)
- modules/manage/examples/kubernetes/schema-crds.feature (1 hunks)
- modules/manage/examples/kubernetes/shadow-links.feature (1 hunks)
- modules/manage/examples/kubernetes/topic-crds.feature (2 hunks)
- modules/manage/examples/kubernetes/user-crds.feature (3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Redirect rules - redpanda-docs-preview
- GitHub Check: Header rules - redpanda-docs-preview
- GitHub Check: Pages changed - redpanda-docs-preview
🔇 Additional comments (12)
modules/manage/examples/kubernetes/schema-crds.feature (1)
1-1: LGTM! Tagging update aligns with PR-wide test categorization. The addition of `@variant:vectorized` is consistent with similar tagging updates across other CRD feature files in this PR.
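For context, a variant tag like this sits directly above the Feature declaration in Gherkin; a minimal sketch, where the feature name and scenario are hypothetical placeholders:

```gherkin
@variant:vectorized
Feature: Schema CRDs
  Scenario: Manage schemas
    ...
```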
modules/manage/examples/kubernetes/topic-crds.feature (1)

1-1: LGTM! Consistent tagging update. The `@variant:vectorized` tag addition aligns with the broader test categorization changes in this PR.
modules/manage/examples/kubernetes/decommissioning.feature (1)

1-13: LGTM! Well-structured decommissioning scenario. The test flow is logical and includes helpful documentation about the required components (decommissioner and PVC unbinder) for the test to pass.
modules/manage/examples/kubernetes/user-crds.feature (1)
1-1: LGTM! Consistent tagging update. The `@variant:vectorized` tag addition aligns with similar updates across CRD feature files in this PR.
modules/manage/examples/kubernetes/helm-chart.feature (1)

1-39: LGTM! Comprehensive node failure tolerance test. The scenario effectively validates Helm chart behavior during node failures, including:
- Pod state transitions (Pending → Running)
- Broker decommissioning (broker 2 removed, broker 3 added)
- Proper verification of both active and decommissioned broker lists
The short timeout values (15s) are appropriate for accelerating test execution.
modules/manage/examples/kubernetes/shadow-links.feature (1)
1-51: LGTM! Well-structured ShadowLink test. The scenario provides good coverage of ShadowLink CRD functionality:
- Proper setup of source and shadow clusters with the required feature flag
- Trace logging enabled for easier debugging (line 9)
- Clear test flow: create topic → establish link → verify replication
- Appropriate sync interval (2s) for testing
modules/manage/examples/kubernetes/cluster.feature (3)
5-46: LGTM! Admin port update scenario is well-structured. The scenario effectively demonstrates:
- Initial cluster deployment with custom admin port (9645)
- Verification of port configuration and rpk setup
- Port update (9645 → 9640)
- Re-verification after the change
This provides good coverage of the admin port configuration workflow.
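As a rough illustration of the workflow being tested, a hedged sketch of how such an admin port might be set; the Redpanda resource shape and the `listeners.admin.port` field are assumptions based on the Redpanda Helm chart values, and the cluster name `basic` is borrowed from the other examples in this PR:

```yaml
apiVersion: cluster.redpanda.com/v1alpha2  # assumed CRD version
kind: Redpanda
metadata:
  name: basic
spec:
  clusterSpec:
    listeners:
      admin:
        port: 9645  # the scenario later updates this to 9640
```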
49-75: LGTM! Rack awareness scenario with helpful explanatory comment. The scenario clearly demonstrates rack awareness configuration, and the comment (lines 52-54) appropriately explains that `kubernetes.io/os` is used for testing purposes rather than as a production recommendation.
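A hedged sketch of the kind of configuration this scenario exercises, assuming the Helm chart's `rackAwareness` block; `kubernetes.io/os` appears here only because the test uses it, as the comment notes:

```yaml
spec:
  clusterSpec:
    rackAwareness:
      enabled: true
      nodeAnnotation: kubernetes.io/os  # test-only label; production would use a topology label
```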
1-3: Review comment is based on incorrect characterization of the file. This is a Gherkin test file (`.feature`), not user-facing documentation. The file contains executable test scenarios with Kubernetes manifests and step definitions, located in the `examples/kubernetes/` directory alongside 12 other similar test files. The TODO comment about evaluating whether to delete these tests is internal uncertainty about test maintenance value, not a blocker for inclusion in examples. Test/example code with maintenance questions is a normal part of any codebase and doesn't require documentation review approval.
The reviewer's concern about inclusion in "user-facing documentation" does not apply to test fixtures in an examples directory.
Likely an incorrect or invalid review comment.
modules/manage/examples/kubernetes/role-crds.feature (1)
1-110: LGTM! Well-structured Role CRD test scenarios. The feature file demonstrates comprehensive testing of the Role CRD lifecycle, including principal management, authorization ACLs, and authorization-only role handling. The YAML manifests are properly formatted and the test flow is logical.
modules/manage/examples/kubernetes/operator-upgrades.feature (1)
1-40: LGTM! Clear operator upgrade test scenario. The scenario properly tests the upgrade path from v25.1.3 to the dev version, with appropriate status checks for both old and new operator capabilities. The use of localhost image references is appropriate for local testing.
modules/manage/examples/kubernetes/migration.feature (1)
1-38: LGTM! Well-designed Helm-to-Operator migration test. The scenario comprehensively tests the migration path from Helm chart to Redpanda custom resource, including validation of OwnerReference adoption and StatefulSet generation tracking. The use of `fullnameOverride` to maintain resource naming consistency across the migration is appropriate.
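A minimal sketch of the naming trick described here, assuming the chart's `fullnameOverride` value and a Helm release named `redpanda` (both hypothetical for illustration):

```yaml
apiVersion: cluster.redpanda.com/v1alpha2  # assumed CRD version
kind: Redpanda
metadata:
  name: redpanda  # matches the pre-existing Helm release name
spec:
  clusterSpec:
    fullnameOverride: redpanda  # keeps StatefulSet and Service names stable across the migration
```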
Excerpt from modules/manage/examples/kubernetes/console.feature:

```yaml
staticConfiguration:
  kafka:
    brokers:
      - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093
```
Fix DNS syntax error in Kafka broker address.
The DNS name has an extra period before the port: `basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093`. The correct format has the colon directly after `local`, without an additional period.
Apply this fix:

```diff
   brokers:
-    - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093
+    - basic-0.basic.${NAMESPACE}.svc.cluster.local:9093
```
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/console.feature around line 47, the Kafka
broker DNS contains an extra period before the port causing an invalid address;
remove the stray dot so the host ends with svc.cluster.local:9093 (i.e., replace
"basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093" with
"basic-0.basic.${NAMESPACE}.svc.cluster.local:9093").
| key: "ca.crt" | ||
| admin: | ||
| urls: | ||
| - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644 |
Fix DNS syntax error in Admin API URL.
The DNS name has an extra period before the port: `https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644`. The correct format has the colon directly after `local`, without an additional period.
Apply this fix:

```diff
   urls:
-    - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644
+    - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:9644
```
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/console.feature around line 54, the Admin
API URL contains an extra period before the port
("https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644"); remove the stray
dot so the host ends with ".local" and the port follows immediately, i.e. change
it to "https://basic-0.basic.${NAMESPACE}.svc.cluster.local:9644".
| key: "ca.crt" | ||
| schemaRegistry: | ||
| urls: | ||
| - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081 |
Fix DNS syntax error in Schema Registry URL.
The DNS name has an extra period before the port: `https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081`. The correct format has the colon directly after `local`, without an additional period.
Apply this fix:

```diff
   urls:
-    - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081
+    - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:8081
```

Note: Verify the correct Schema Registry port. Typically it's 8081, but ensure it matches your Redpanda configuration.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/console.feature around line 61, the Schema
Registry URL contains an extra dot before the port ("...cluster.local.:8081");
remove the stray period so the host ends with "cluster.local" and the port
follows with a colon (e.g.,
"https://basic-0.basic.${NAMESPACE}.svc.cluster.local:8081"); verify and correct
the port number if your Redpanda configuration uses a different port.
Excerpt from modules/manage/examples/kubernetes/metrics.feature:

```gherkin
When I apply Kubernetes manifest:
  """
  apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: testing
  """
And "testing" service account has bounded "redpanda-operator-.*-metrics-reader" regexp cluster role name
```
Missing ClusterRoleBinding in the manifest.
The manifest creates a ServiceAccount named "testing", but line 23 expects this account to be bound to a ClusterRole matching the regex `redpanda-operator-.*-metrics-reader`. However, no ClusterRoleBinding resource is defined in the manifest to establish this binding. This will cause the test to fail unless the binding is created elsewhere.
Add a ClusterRoleBinding to complete the authorization setup:

```diff
 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: testing
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: testing-metrics-reader
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: redpanda-operator-<namespace>-metrics-reader # Replace <namespace> with actual namespace
+subjects:
+- kind: ServiceAccount
+  name: testing
+  namespace: <namespace> # Replace with actual namespace
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/metrics.feature around lines 16 to 23, the
manifest creates a ServiceAccount "testing" but lacks a ClusterRoleBinding to
bind it to a ClusterRole matching the regex redpanda-operator-.*-metrics-reader;
add a ClusterRoleBinding resource in the manifest that references the
ServiceAccount "testing" in the appropriate namespace (or cluster-wide subject)
and binds it to the ClusterRole (or to a ClusterRole whose name matches that
regex) so the ServiceAccount has the required cluster-level permissions for
metrics reading.
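If the binding is added, one way to spot-check it could be kubectl impersonation; this assumes the metrics-reader ClusterRole grants get on the /metrics non-resource URL, which this review does not verify:

```bash
# Hypothetical check: can the "testing" ServiceAccount read /metrics?
# Substitute the real namespace for <namespace>.
kubectl auth can-i get /metrics --as=system:serviceaccount:<namespace>:testing
```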
Excerpt from modules/manage/examples/kubernetes/scaling.feature:

```gherkin
Given I create a basic cluster "scaleup" with 1 nodes
And cluster "scaleup" is stable with 1 nodes
```
Fix grammatical inconsistency.
"1 nodes" should be "1 node" (singular) for grammatical correctness.
Apply this diff:

```diff
-    Given I create a basic cluster "scaleup" with 1 nodes
-    And cluster "scaleup" is stable with 1 nodes
+    Given I create a basic cluster "scaleup" with 1 node
+    And cluster "scaleup" is stable with 1 node
```
🤖 Prompt for AI Agents
In modules/manage/examples/kubernetes/scaling.feature around lines 11 to 12, the
text uses "1 nodes" (plural) which is grammatically incorrect for a singular
count; update both occurrences to read "1 node" to match singular grammar and
keep consistency in the scenario steps.
This PR auto-updates the acceptance tests we use as examples in our Kubernetes docs.