75 changes: 75 additions & 0 deletions modules/manage/examples/kubernetes/cluster.feature
@@ -0,0 +1,75 @@
# This file contains some tests originally ported from our e2e-v2 tests.
# We should really evaluate whether or not to just delete these.
Feature: Basic cluster tests
  @skip:gke @skip:aks @skip:eks
  Scenario: Updating admin ports
    # replaces e2e-v2 "upgrade-values-check"
    Given I apply Kubernetes manifest:
      """
      ---
      apiVersion: cluster.redpanda.com/v1alpha2
      kind: Redpanda
      metadata:
        name: upgrade
      spec:
        clusterSpec:
          statefulset:
            replicas: 1
          listeners:
            admin:
              external:
                default:
                  port: 9645
      """
    And cluster "upgrade" is stable with 1 nodes
    And service "upgrade-external" has named port "admin-default" with value 9645
    And rpk is configured correctly in "upgrade" cluster
    When I apply Kubernetes manifest:
      """
      ---
      apiVersion: cluster.redpanda.com/v1alpha2
      kind: Redpanda
      metadata:
        name: upgrade
      spec:
        clusterSpec:
          statefulset:
            replicas: 1
          listeners:
            admin:
              external:
                default:
                  port: 9640
      """
    Then cluster "upgrade" is stable with 1 nodes
    And service "upgrade-external" should have named port "admin-default" with value 9640
    And rpk is configured correctly in "upgrade" cluster


  @skip:gke @skip:aks @skip:eks
  Scenario: Rack Awareness
    Given I apply Kubernetes manifest:
      # NB: You wouldn't actually use kubernetes.io/os for the value of rack;
      # it's just a value that we know is both present and deterministic for the
      # purpose of testing. A more realistic sketch follows this scenario.
"""
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Redpanda
metadata:
name: rack-awareness
spec:
clusterSpec:
console:
enabled: false
statefulset:
replicas: 1
rackAwareness:
enabled: true
nodeAnnotation: 'kubernetes.io/os'
"""
And cluster "rack-awareness" is stable with 1 nodes
Then running `cat /etc/redpanda/redpanda.yaml | grep -o 'rack: .*$'` will output:
"""
rack: linux
"""
80 changes: 80 additions & 0 deletions modules/manage/examples/kubernetes/console.feature
@@ -0,0 +1,80 @@
@cluster:basic
Feature: Console CRDs
  Background: Cluster available
    Given cluster "basic" is available

  Scenario: Using clusterRef
    When I apply Kubernetes manifest:
      ```yaml
      ---
      apiVersion: cluster.redpanda.com/v1alpha2
      kind: Console
      metadata:
        name: console
      spec:
        cluster:
          clusterRef:
            name: basic
      ```
    Then Console "console" will be healthy
    # These steps demonstrate that console is correctly connected to Redpanda (Kafka, Schema Registry, and Admin API).
    And I exec "curl localhost:8080/api/schema-registry/mode" in a Pod matching "app.kubernetes.io/instance=console", it will output:
      ```
      {"mode":"READWRITE"}
      ```
    And I exec "curl localhost:8080/api/topics" in a Pod matching "app.kubernetes.io/instance=console", it will output:
      ```
      {"topics":[{"topicName":"_schemas","isInternal":false,"partitionCount":1,"replicationFactor":1,"cleanupPolicy":"compact","documentation":"NOT_CONFIGURED","logDirSummary":{"totalSizeBytes":117}}]}
      ```
    And I exec "curl localhost:8080/api/console/endpoints | grep -o '{[^{}]*DebugBundleService[^{}]*}'" in a Pod matching "app.kubernetes.io/instance=console", it will output:
      ```
      {"endpoint":"redpanda.api.console.v1alpha1.DebugBundleService","method":"POST","isSupported":true}
      ```

  Scenario: Using staticConfig
    When I apply Kubernetes manifest:
      ```yaml
      ---
      apiVersion: cluster.redpanda.com/v1alpha2
      kind: Console
      metadata:
        name: console
      spec:
        cluster:
          staticConfiguration:
            kafka:
              brokers:
                - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093
⚠️ Potential issue | 🔴 Critical

Fix DNS syntax error in Kafka broker address.

The DNS name has an extra period before the port: basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093. The correct format should have the colon directly after local without an additional period.

Apply this fix:

             brokers:
-            - basic-0.basic.${NAMESPACE}.svc.cluster.local.:9093
+            - basic-0.basic.${NAMESPACE}.svc.cluster.local:9093

              tls:
                caCertSecretRef:
                  name: "basic-default-cert"
                  key: "ca.crt"
            admin:
              urls:
                - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644
⚠️ Potential issue | 🔴 Critical

Fix DNS syntax error in Admin API URL.

The DNS name has an extra period before the port: https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644. The correct format should have the colon directly after local without an additional period.

Apply this fix:

             urls:
-            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:9644
+            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:9644

              tls:
                caCertSecretRef:
                  name: "basic-default-cert"
                  key: "ca.crt"
            schemaRegistry:
              urls:
                - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081
⚠️ Potential issue | 🔴 Critical

Fix DNS syntax error in Schema Registry URL.

The DNS name has an extra period before the port: https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081. The correct format should have the colon directly after local without an additional period.

Apply this fix:

             urls:
-            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local.:8081
+            - https://basic-0.basic.${NAMESPACE}.svc.cluster.local:8081

Note: Verify the correct Schema Registry port. Typically it's 8081, but ensure it matches your Redpanda configuration.

              tls:
                caCertSecretRef:
                  name: "basic-default-cert"
                  key: "ca.crt"
      ```
    Then Console "console" will be healthy
    # These steps demonstrate that console is correctly connected to Redpanda (Kafka, Schema Registry, and Admin API).
    And I exec "curl localhost:8080/api/schema-registry/mode" in a Pod matching "app.kubernetes.io/instance=console", it will output:
      ```
      {"mode":"READWRITE"}
      ```
    And I exec "curl localhost:8080/api/topics" in a Pod matching "app.kubernetes.io/instance=console", it will output:
      ```
      {"topics":[{"topicName":"_schemas","isInternal":false,"partitionCount":1,"replicationFactor":1,"cleanupPolicy":"compact","documentation":"NOT_CONFIGURED","logDirSummary":{"totalSizeBytes":117}}]}
      ```
    And I exec "curl localhost:8080/api/console/endpoints | grep -o '{[^{}]*DebugBundleService[^{}]*}'" in a Pod matching "app.kubernetes.io/instance=console", it will output:
      ```
      {"endpoint":"redpanda.api.console.v1alpha1.DebugBundleService","method":"POST","isSupported":true}
      ```
13 changes: 13 additions & 0 deletions modules/manage/examples/kubernetes/decommissioning.feature
@@ -0,0 +1,13 @@
Feature: Decommissioning brokers
  # Note that this test requires both the broker decommissioner and the PVC unbinder
  # to be running in order to pass (see the values sketch after this scenario).
  @skip:gke @skip:aks @skip:eks
  Scenario: Pruning brokers on failed nodes
    Given I create a basic cluster "decommissioning" with 3 nodes
    And cluster "decommissioning" is stable with 3 nodes
    When I physically shutdown a kubernetes node for cluster "decommissioning"
    And cluster "decommissioning" is unhealthy
    And cluster "decommissioning" has only 2 remaining nodes
    And I prune any kubernetes node that is now in a NotReady status
    Then cluster "decommissioning" should recover
    And cluster "decommissioning" should be stable with 3 nodes
39 changes: 39 additions & 0 deletions modules/manage/examples/kubernetes/helm-chart.feature
@@ -0,0 +1,39 @@
@operator:none
Feature: Redpanda Helm Chart

  Scenario: Tolerating Node Failure
    Given I helm install "redpanda" "../charts/redpanda/chart" with values:
      ```yaml
      nameOverride: foobar
      fullnameOverride: bazquux

      statefulset:
        sideCars:
          image:
            tag: dev
            repository: localhost/redpanda-operator
          pvcUnbinder:
            enabled: true
            unbindAfter: 15s
          brokerDecommissioner:
            enabled: true
            decommissionAfter: 15s
      ```
    When I stop the Node running Pod "bazquux-2"
    And Pod "bazquux-2" is eventually Pending
    Then Pod "bazquux-2" will eventually be Running
    And kubectl exec -it "bazquux-0" "rpk redpanda admin brokers list | sed -E 's/\s+/ /gm' | cut -d ' ' -f 1,6" will eventually output:
      ```
      ID MEMBERSHIP
      0 active
      1 active
      3 active
      ```
    And kubectl exec -it "bazquux-0" "rpk redpanda admin brokers list --include-decommissioned | sed -E 's/\s+/ /gm' | cut -d ' ' -f 1,6" will eventually output:
      ```
      ID MEMBERSHIP
      0 active
      1 active
      3 active
      2 -
      ```
24 changes: 24 additions & 0 deletions modules/manage/examples/kubernetes/metrics.feature
@@ -0,0 +1,24 @@
Feature: Metrics endpoint has authentication and authorization

  @skip:gke @skip:aks @skip:eks
  Scenario: Reject request without TLS
    Given the operator is running
    Then its metrics endpoint should reject http request with status code "400"

  @skip:gke @skip:aks @skip:eks
  Scenario: Reject unauthenticated token
    Given the operator is running
    Then its metrics endpoint should reject authorization random token request with status code "500"

  @skip:gke @skip:aks @skip:eks
  Scenario: Accept request
    Given the operator is running
    When I apply Kubernetes manifest:
      """
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: testing
      """
    And "testing" service account has bounded "redpanda-operator-.*-metrics-reader" regexp cluster role name
Comment on lines +16 to +23
⚠️ Potential issue | 🔴 Critical

Missing ClusterRoleBinding in the manifest.

The manifest creates a ServiceAccount named "testing" but line 23 expects this account to be bound to a ClusterRole matching the regex redpanda-operator-.*-metrics-reader. However, no ClusterRoleBinding resource is defined in the manifest to establish this binding. This will cause the test to fail unless the binding is created elsewhere.

Add a ClusterRoleBinding to complete the authorization setup:

 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: testing
+---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+  name: testing-metrics-reader
+roleRef:
+  apiGroup: rbac.authorization.k8s.io
+  kind: ClusterRole
+  name: redpanda-operator-<namespace>-metrics-reader  # Replace <namespace> with actual namespace
+subjects:
+- kind: ServiceAccount
+  name: testing
+  namespace: <namespace>  # Replace with actual namespace

    Then its metrics endpoint should accept https request with "testing" service account token
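
For reference, a self-contained sketch of the manifest the reviewer describes above, combining the ServiceAccount with a ClusterRoleBinding. The namespace and the concrete ClusterRole name are assumptions and must match the operator's actual release; the name only has to satisfy the `redpanda-operator-.*-metrics-reader` regexp used by the step above:

```yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: testing
  namespace: redpanda  # assumed namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: testing-metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: redpanda-operator-redpanda-metrics-reader  # assumed; match your operator release
subjects:
- kind: ServiceAccount
  name: testing
  namespace: redpanda  # assumed namespace
```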
38 changes: 38 additions & 0 deletions modules/manage/examples/kubernetes/migration.feature
@@ -0,0 +1,38 @@
Feature: Helm chart to Redpanda Operator migration

  @skip:gke @skip:aks @skip:eks
  Scenario: Migrate from a Helm chart release to a Redpanda custom resource
    Given I helm install "redpanda-migration-example" "../charts/redpanda/chart" with values:
      """
      # tag::helm-values[]
      fullnameOverride: name-override
      # end::helm-values[]
      # Without the values below, the operator would have to modify the cluster after the migration.
      # Because this block is test-specific (we use a locally built operator image), it is excluded from the helm-values tag above.
      statefulset:
        sideCars:
          image:
            repository: localhost/redpanda-operator
            tag: dev
      """
    And I store "{.metadata.generation}" of Kubernetes object with type "StatefulSet.v1.apps" and name "name-override" as "generation"
    When I apply Kubernetes manifest:
      """
      # tag::redpanda-custom-resource-manifest[]
      ---
      apiVersion: cluster.redpanda.com/v1alpha2
      kind: Redpanda
      metadata:
        name: redpanda-migration-example
      spec:
        # This manifest is a copy of Redpanda release Helm values
        clusterSpec:
          fullnameOverride: name-override
      # end::redpanda-custom-resource-manifest[]
      """
    Then cluster "redpanda-migration-example" is available
    And the Kubernetes object of type "StatefulSet.v1.apps" with name "name-override" has an OwnerReference pointing to the cluster "redpanda-migration-example"
    And the helm release for "redpanda-migration-example" can be deleted by removing its stored secret
    And the cluster "redpanda-migration-example" is healthy
    # This winds up being incremented because we forcibly swap the cluster's StatefulSets to leverage OnDelete semantics.
    And the recorded value "generation" is one less than "{.metadata.generation}" of the Kubernetes object with type "StatefulSet.v1.apps" and name "name-override"
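
For reference, a sketch of what the custom resource would look like if clusterSpec mirrored every value from the Helm install above, including the test-specific sidecar image block that the tagged excerpt omits (assembled from the two value blocks in this scenario; not part of the tagged documentation example):

```yaml
---
apiVersion: cluster.redpanda.com/v1alpha2
kind: Redpanda
metadata:
  name: redpanda-migration-example
spec:
  clusterSpec:
    fullnameOverride: name-override
    statefulset:
      sideCars:
        image:
          repository: localhost/redpanda-operator
          tag: dev
```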
40 changes: 40 additions & 0 deletions modules/manage/examples/kubernetes/operator-upgrades.feature
@@ -0,0 +1,40 @@
@operator:none @vcluster
Feature: Upgrading the operator
  @skip:gke @skip:aks @skip:eks
  Scenario: Operator upgrade from 25.1.3
    Given I helm install "redpanda-operator" "redpanda/operator" --version v25.1.3 with values:
      """
      crds:
        enabled: true
      """
    And I apply Kubernetes manifest:
      """
      ---
      apiVersion: cluster.redpanda.com/v1alpha2
      kind: Redpanda
      metadata:
        name: operator-upgrade
      spec:
        clusterSpec:
          console:
            enabled: false
          statefulset:
            replicas: 1
            sideCars:
              image:
                tag: dev
                repository: localhost/redpanda-operator
      """
    # use just a Ready status check here since that's all the
    # old operator supports
    And cluster "operator-upgrade" is available
    Then I can helm upgrade "redpanda-operator" "../operator/chart" with values:
      """
      image:
        tag: dev
        repository: localhost/redpanda-operator
      crds:
        experimental: true
      """
    # use the new status as this will eventually get set
    And cluster "operator-upgrade" should be stable with 1 nodes