CSI pods restart when network policy for GUI is deleted #1058

Open
saurabhwani5 opened this issue Nov 7, 2023 · 3 comments
Labels
Customer Impact: Localized high impact (3) - Reduction of function. Significant impact to workload.
Customer Probability: Medium (3) - Issue occurs in normal path but specific limited timing window, or other mitigating factor.
Found In: 2.10.0
Severity: 3 - Indicates the issue is on the priority list for next milestone.
Type: Bug - Indicates issue is an undesired behavior, usually caused by code error.

Comments


saurabhwani5 commented Nov 7, 2023

Describe the bug

When the network policy for the GUI is deleted and the ConfigMap is then changed, the CSI pods restart and the sidecar pods go into CrashLoopBackOff.

How to Reproduce?

Please list the steps to help development teams reproduce the behavior

  1. Install CSI with the images from "Fix for tunable network policy" #1050, as follows:
[root@OCP network]# oc get pods
NAME                                                  READY   STATUS    RESTARTS      AGE
csi-scale-fsetdemo-pod-5                              1/1     Running   0             4d20h
ibm-spectrum-scale-csi-attacher-775c787cd7-6tlkv      1/1     Running   0             4d9h
ibm-spectrum-scale-csi-attacher-775c787cd7-w2bhl      1/1     Running   0             4d9h
ibm-spectrum-scale-csi-gp946                          3/3     Running   0             4d9h
ibm-spectrum-scale-csi-mrwbj                          3/3     Running   0             4d9h
ibm-spectrum-scale-csi-operator-7fb8d8f6f9-sls6c      1/1     Running   9 (32h ago)   5d
ibm-spectrum-scale-csi-pctst                          3/3     Running   0             4d9h
ibm-spectrum-scale-csi-provisioner-74dc9dff59-rhdwc   1/1     Running   0             4d9h
ibm-spectrum-scale-csi-resizer-78f7684fff-46zx2       1/1     Running   0             4d9h
ibm-spectrum-scale-csi-snapshotter-5f77874594-pc9zk   1/1     Running   0             4d9h

[root@OCP network]# oc get cso
NAME                     VERSION   SUCCESS
ibm-spectrum-scale-csi   2.10.0    True
[root@OCP network]# oc describe pod | grep quay
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:05d24a16359c9479a7917d8086a94e1ddde8918f86f75c34ba4d1f9498362ea2
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:05d24a16359c9479a7917d8086a94e1ddde8918f86f75c34ba4d1f9498362ea2
    Image:         quay.io/hemalatha_gajendran/tunable_host_network_latest
    Image ID:      quay.io/hemalatha_gajendran/tunable_host_network_latest@sha256:1e25ff135f6b7ac9cf4d845d2341c5403f210f90014ce69fc9bbb2d799812cb5
      CSI_DRIVER_IMAGE:          quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
    Image:         quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver:v2.10.0-011123
    Image ID:      quay.io/ibm-spectrum-scale-dev/ibm-spectrum-scale-csi-driver@sha256:05d24a16359c9479a7917d8086a94e1ddde8918f86f75c34ba4d1f9498362ea2
  1. Apply the ConfigMap as follows:
[root@OCP pr1050]# cat cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi
data:
  HOST_NETWORK: DISABLED
[root@OCP pr1050]# oc apply -f cm.yaml
configmap/ibm-spectrum-scale-csi-config created
  1. Create the network policies as follows:
[root@OCP network]# oc get networkpolicy
NAME                     POD-SELECTOR   AGE
allow-dns-acces          <none>         5d3h
allow-egress-apiserver   <none>         5d3h
allow-gui-route          <none>         4d23h
[root@OCP network]# oc get networkpolicy -oyaml
apiVersion: v1
items:
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"allow-dns-acces","namespace":"ibm-spectrum-scale-csi"},"spec":{"egress":[{"ports":[{"port":5353,"protocol":"UDP"},{"port":5353,"protocol":"TCP"}]}],"ingress":[{"ports":[{"port":5353,"protocol":"UDP"},{"port":5353,"protocol":"TCP"}]}],"policyTypes":["Egress","Ingress"]}}
    creationTimestamp: "2023-11-02T04:13:33Z"
    generation: 1
    name: allow-dns-acces
    namespace: ibm-spectrum-scale-csi
    resourceVersion: "7879033"
    uid: 0059c64d-4f72-4ea7-8383-7b2730c630b1
  spec:
    egress:
    - ports:
      - port: 5353
        protocol: UDP
      - port: 5353
        protocol: TCP
    ingress:
    - ports:
      - port: 5353
        protocol: UDP
      - port: 5353
        protocol: TCP
    podSelector: {}
    policyTypes:
    - Egress
    - Ingress
  status: {}
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"allow-egress-apiserver","namespace":"ibm-spectrum-scale-csi"},"spec":{"egress":[{"ports":[{"port":443,"protocol":"TCP"},{"port":6443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"10.13.19.200/32"}}]},{"ports":[{"port":443,"protocol":"TCP"},{"port":6443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"10.13.25.22/32"}}]},{"ports":[{"port":443,"protocol":"TCP"},{"port":6443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"10.13.26.216/32"}}]}],"podSelector":{},"policyTypes":["Egress"]}}
    creationTimestamp: "2023-11-02T04:30:15Z"
    generation: 1
    name: allow-egress-apiserver
    namespace: ibm-spectrum-scale-csi
    resourceVersion: "7885041"
    uid: dc70a12b-ce0d-4953-b7ee-e865348d36aa
  spec:
    egress:
    - ports:
      - port: 443
        protocol: TCP
      - port: 6443
        protocol: TCP
      to:
      - ipBlock:
          cidr: 10.13.19.200/32
    - ports:
      - port: 443
        protocol: TCP
      - port: 6443
        protocol: TCP
      to:
      - ipBlock:
          cidr: 10.13.25.22/32
    - ports:
      - port: 443
        protocol: TCP
      - port: 6443
        protocol: TCP
      to:
      - ipBlock:
          cidr: 10.13.26.216/32
    podSelector: {}
    policyTypes:
    - Egress
  status: {}
- apiVersion: networking.k8s.io/v1
  kind: NetworkPolicy
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.k8s.io/v1","kind":"NetworkPolicy","metadata":{"annotations":{},"name":"allow-gui-route","namespace":"ibm-spectrum-scale-csi"},"spec":{"egress":[{"ports":[{"port":443,"protocol":"TCP"}],"to":[{"ipBlock":{"cidr":"<OCP-IP>/32"}},{"ipBlock":{"cidr":"10.11.125.246/32"}}]}],"podSelector":{},"policyTypes":["Egress"]}}
    creationTimestamp: "2023-11-02T08:48:02Z"
    generation: 1
    name: allow-gui-route
    namespace: ibm-spectrum-scale-csi
    resourceVersion: "7986486"
    uid: 936c2545-3c35-4eea-88a8-3db85949eb46
  spec:
    egress:
    - ports:
      - port: 443
        protocol: TCP
      to:
      - ipBlock:
          cidr: <OCP-IP>/32
      - ipBlock:
          cidr: 10.11.125.246/32
    podSelector: {}
    policyTypes:
    - Egress
  status: {}
kind: List
metadata:
  resourceVersion: ""
[root@OCP network]#
[root@OCP network]# kubectl get endpoints kubernetes -n default
NAME         ENDPOINTS                                              AGE
kubernetes   10.13.19.200:6443,10.13.25.22:6443,10.13.26.216:6443   18d
  1. Delete the GUI network policy as follows:
[root@OCP network]# oc delete networkpolicy allow-gui-route
networkpolicy.networking.k8s.io "allow-gui-route" deleted
  1. Change the ConfigMap values (here I'm changing the log level to TRACE):
[root@OCP pr1050]# cat cm.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: ibm-spectrum-scale-csi-config
  namespace: ibm-spectrum-scale-csi
data:
  HOST_NETWORK: DISABLED
  VAR_DRIVER_LOGLEVEL: TRACE
[root@OCP pr1050]# oc apply -f cm.yaml
configmap/ibm-spectrum-scale-csi-config configured
  1. Now check the CSI pods:
[root@OCP pr1050]# oc get pods
NAME                                                  READY   STATUS             RESTARTS      AGE
ibm-spectrum-scale-csi-5j98c                          3/3     Running            2 (27s ago)   2m28s
ibm-spectrum-scale-csi-64bfh                          3/3     Running            2 (34s ago)   2m35s
ibm-spectrum-scale-csi-attacher-6f4ddd869b-h2tcx      1/1     Running            5 (14s ago)   2m36s
ibm-spectrum-scale-csi-attacher-6f4ddd869b-rhvtn      1/1     Running            5 (56s ago)   2m36s
ibm-spectrum-scale-csi-operator-7fb8d8f6f9-sls6c      1/1     Running            9 (33h ago)   5d
ibm-spectrum-scale-csi-provisioner-846ddb9cb8-jrjr2   0/1     CrashLoopBackOff   4 (36s ago)   2m36s
ibm-spectrum-scale-csi-q5rbz                          3/3     Running            2 (30s ago)   2m31s
ibm-spectrum-scale-csi-resizer-5fb9fb844b-gwzv9       0/1     CrashLoopBackOff   4 (35s ago)   2m36s
ibm-spectrum-scale-csi-snapshotter-76f97d78fc-w7zjn   0/1     CrashLoopBackOff   4 (35s ago)   2m36s

[root@OCP pr1050]# oc get cso
NAME                     VERSION   SUCCESS
ibm-spectrum-scale-csi   2.10.0    False
  1. Check the CSO description:
[root@OCP pr1050]# oc describe cso
Name:         ibm-spectrum-scale-csi
Namespace:    ibm-spectrum-scale-csi
Labels:       <none>
Annotations:  <none>
API Version:  csi.ibm.com/v1
Kind:         CSIScaleOperator
Metadata:
  Creation Timestamp:  2023-10-27T04:05:14Z
  Finalizers:
    finalizer.csiscaleoperators.csi.ibm.com
  Generation:  2
  Managed Fields:
    API Version:  csi.ibm.com/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          v:"finalizer.csiscaleoperators.csi.ibm.com":
        f:ownerReferences:
          k:{"uid":"33f69ae8-082a-4986-90dc-2e698a078924"}:
      f:spec:
        f:attacherNodeSelector:
        f:clusters:
        f:consistencyGroupPrefix:
        f:nodeMapping:
        f:pluginNodeSelector:
        f:provisionerNodeSelector:
        f:resizerNodeSelector:
        f:snapshotterNodeSelector:
        f:tolerations:
    Manager:      /ibm-spectrum-scale
    Operation:    Apply
    Time:         2023-10-27T04:05:14Z
    API Version:  csi.ibm.com/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"finalizer.csiscaleoperators.csi.ibm.com":
      f:spec:
        f:consistencyGroupPrefix:
    Manager:      manager
    Operation:    Update
    Time:         2023-10-27T04:05:27Z
    API Version:  csi.ibm.com/v1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        .:
        f:conditions:
        f:versions:
    Manager:      CSIScaleOperator
    Operation:    Update
    Subresource:  status
    Time:         2023-11-07T07:56:50Z
  Owner References:
    API Version:           scale.spectrum.ibm.com/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Cluster
    Name:                  ibm-spectrum-scale
    UID:                   33f69ae8-082a-4986-90dc-2e698a078924
  Resource Version:        10810726
  UID:                     34c6b7cc-5148-4562-a9d8-88765ff8f953
Spec:
  Attacher Node Selector:
    Key:    scale
    Value:  true
  Clusters:
    Id:  10383269897192936933
    Primary:
      Primary Fs:      fs0
      Primary Fset:    primary-fileset-fs0-10383269897192936933
      Remote Cluster:  4345204465007136554
    Rest API:
      Gui Host:       ibm-spectrum-scale-gui-ibm-spectrum-scale.apps.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Secrets:          ibm-spectrum-scale-gui-csiadmin
    Secure Ssl Mode:  false
    Id:               4345204465007136554
    Rest API:
      Gui Host:              remote-shrutikanipane148a-2.fyre.ibm.com
      Gui Port:              443
    Secrets:                 csi-remote-mount-storage-cluster-1
    Secure Ssl Mode:         false
  Consistency Group Prefix:  a5ac401f-882a-4246-a03f-55592a7a07d7
  Node Mapping:
    k8sNode:             worker0.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Spectrumscale Node:  worker0
    k8sNode:             worker1.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Spectrumscale Node:  worker1
    k8sNode:             worker2.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Spectrumscale Node:  worker2
    k8sNode:             master0.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Spectrumscale Node:  master0
    k8sNode:             master1.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Spectrumscale Node:  master1
    k8sNode:             master2.cnsa-shrutikanipane148a.cp.fyre.ibm.com
    Spectrumscale Node:  master2
  Plugin Node Selector:
    Key:    scale
    Value:  true
  Provisioner Node Selector:
    Key:    scale
    Value:  true
  Resizer Node Selector:
    Key:    scale
    Value:  true
  Snapshotter Node Selector:
    Key:    scale
    Value:  true
  Tolerations:
    Effect:    NoSchedule
    Operator:  Exists
    Effect:    NoExecute
    Operator:  Exists
    Key:       CriticalAddonsOnly
    Operator:  Exists
Status:
  Conditions:
    Last Transition Time:  2023-11-07T07:56:50Z
    Message:               Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
    Reason:                UpdateFailed
    Status:                False
    Type:                  Success
  Versions:
    Name:     ibm-spectrum-scale-csi
    Version:  2.10.0
Events:
  Type     Reason         Age                 From              Message
  ----     ------         ----                ----              -------
  Normal   CSIConfigured  41s (x13 over 33h)  CSIScaleOperator  The CSI driver resources have been created/updated successfully
  Warning  UpdateFailed   41s (x4 over 11h)   CSIScaleOperator  Failed to set defaults on the instance ibm-spectrum-scale-csi. Please check Operator logs
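
One possible reading of the steps above, not confirmed in this report: every policy uses `podSelector: {}` with `policyTypes: Egress`, so all pods in the namespace are isolated for egress and only the union of the listed rules is allowed. Once `allow-gui-route` is deleted, no remaining rule covers port 443 to the GUI destination, so pods restarted by the ConfigMap change may be unable to reach the GUI. The Python sketch below models that evaluation in a simplified way (it is not how the cluster network plugin implements it). The `egress_allowed` helper and the `POLICIES` structure are hypothetical, and the elided `<OCP-IP>` entry is left out.

```python
import ipaddress

# Egress rules copied from the NetworkPolicy objects listed in the report.
# "to": None means the rule has no "to" section (any destination).
# The elided "<OCP-IP>/32" ipBlock of allow-gui-route is intentionally omitted.
POLICIES = {
    "allow-dns-acces": [
        {"ports": [(5353, "UDP"), (5353, "TCP")], "to": None},
    ],
    "allow-egress-apiserver": [
        {"ports": [(443, "TCP"), (6443, "TCP")], "to": ["10.13.19.200/32"]},
        {"ports": [(443, "TCP"), (6443, "TCP")], "to": ["10.13.25.22/32"]},
        {"ports": [(443, "TCP"), (6443, "TCP")], "to": ["10.13.26.216/32"]},
    ],
    "allow-gui-route": [
        {"ports": [(443, "TCP")], "to": ["10.11.125.246/32"]},
    ],
}


def egress_allowed(policies, dest_ip, port, protocol="TCP"):
    """Return True if the union of egress rules allows dest_ip:port.

    Simplified model: every policy is assumed to select the pod
    (podSelector: {}) and to include the Egress policy type.
    """
    if not policies:
        # No egress policy selects the pod, so it is not isolated.
        return True
    dest = ipaddress.ip_address(dest_ip)
    for rules in policies.values():
        for rule in rules:
            if (port, protocol) not in rule["ports"]:
                continue
            cidrs = rule["to"]
            if cidrs is None or any(
                dest in ipaddress.ip_network(cidr) for cidr in cidrs
            ):
                return True
    return False


# One of the destinations listed in allow-gui-route.
GUI_ROUTE_DEST = "10.11.125.246"

before = egress_allowed(POLICIES, GUI_ROUTE_DEST, 443)

# Model "oc delete networkpolicy allow-gui-route".
remaining = {name: r for name, r in POLICIES.items() if name != "allow-gui-route"}
after = egress_allowed(remaining, GUI_ROUTE_DEST, 443)

print(before, after)  # prints: True False
```

In this model the API server endpoints stay reachable after the deletion, while port 443 to the `allow-gui-route` destination does not.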

Expected behavior

The CSI pods shown above are restarting; they should not restart.

Logs:

/scale-csi/D.1058
mustgather.tar.gz

saurabhwani5 added the Severity: 3, Type: Bug, Customer Probability: Medium (3), Customer Impact: Localized high impact (3), and Found In: 2.10.0 labels on Nov 7, 2023
saurabhwani5 commented

A similar issue also occurs when no network policy is applied and Host Network is enabled, if the GUI of the primary fs is not reachable and we apply or change the ConfigMap.

saurabhwani5 commented

Improvement for the future: the operator should run a prerequisite check before restarting the driver pods when the ConfigMap is applied or changed. That would prevent all of these CSI pod restarts.
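
A minimal sketch of what such a prerequisite check could look like, written in Python for illustration only and not taken from the operator's code. It assumes the check is a plain TCP connection attempt to each GUI host and port before any restart is triggered. The names `gui_reachable` and `should_restart_driver_pods` are hypothetical.

```python
import socket


def gui_reachable(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def should_restart_driver_pods(gui_endpoints, check=gui_reachable):
    """Hypothetical gate: allow a restart only if every GUI endpoint is reachable.

    gui_endpoints is a list of (host, port) pairs.
    Returns (ok, unreachable), where unreachable lists the failing pairs.
    """
    unreachable = [(host, port) for host, port in gui_endpoints if not check(host, port)]
    return (not unreachable, unreachable)
```

With the two GUI hosts from the CSO description above, a call such as `should_restart_driver_pods([("ibm-spectrum-scale-gui-ibm-spectrum-scale.apps.cnsa-shrutikanipane148a.cp.fyre.ibm.com", 443), ("remote-shrutikanipane148a-2.fyre.ibm.com", 443)])` would return `False` plus the list of unreachable endpoints whenever a GUI cannot be reached, and the operator could then report a failed condition instead of restarting the pods. A TCP connect only shows that the port is open; a real check would probably also need to call the GUI REST API with the configured secret.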


deeghuge commented Feb 2, 2024

@hemalathagajendran What is the issue here? Is this expected behaviour? If not, we should document the same.
