@gnufied gnufied commented Aug 11, 2023

Health check should check if registration socket is really responsive.

/healthz doesn't merely check the presence of the registration socket, but also checks whether it responds to gRPC requests.
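
To illustrate the difference between a socket file that merely exists and one that responds, here is a minimal sketch using only the Go standard library. It substitutes a plain Unix-socket connect for the gRPC GetInfo call this PR makes, so it is weaker than the real check. `checkSocket` and `demo` are hypothetical names, not code from this PR.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// checkSocket returns nil only if something accepts connections on the Unix
// socket at path. A leftover socket file with no listener fails the check.
// Assumption: a successful connect stands in for a successful gRPC call.
func checkSocket(path string, timeout time.Duration) error {
	if _, err := os.Stat(path); err != nil {
		return fmt.Errorf("registration socket %s is missing: %w", path, err)
	}
	conn, err := net.DialTimeout("unix", path, timeout)
	if err != nil {
		return fmt.Errorf("registration socket %s is not responding: %w", path, err)
	}
	return conn.Close()
}

// demo checks a live socket, then the same path after its listener has gone
// away while the socket file is still on disk.
func demo() (live, stale error) {
	dir, err := os.MkdirTemp("", "reg")
	if err != nil {
		return err, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "reg.sock")

	l, err := net.Listen("unix", path)
	if err != nil {
		return err, err
	}
	// Keep the socket file on disk when the listener closes.
	l.(*net.UnixListener).SetUnlinkOnClose(false)

	live = checkSocket(path, time.Second)
	l.Close()
	stale = checkSocket(path, time.Second)
	return live, stale
}

func main() {
	live, stale := demo()
	fmt.Println("live socket: ", live)
	fmt.Println("stale socket:", stale)
}
```

In the stale case `os.Stat` still succeeds, which is why a presence-only check would keep reporting healthy.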

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 11, 2023
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 11, 2023

infoRequest := &registerapi.InfoRequest{}

info, err := client.GetInfo(ctx, infoRequest)

@jsafrane jsafrane Aug 11, 2023

Probe() would be more appropriate. Edit: GetInfo is correct, this is the registration socket.

	return fmt.Errorf("error connecting to node-registrar socket %s: %v", socketFile, err)
}

defer closeGrpcConnection(socketFile, csiConn)

Does it make sense to close the connection? All CSI sidecars re-use the same connection for all requests. gRPC does some recovery when the socket is closed, AFAIK.

Contributor Author

For sidecars I guess it makes sense. I am less sure about periodic checks. I think closing the connection here is fine.
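
The close-per-check choice discussed above can be sketched as follows. This is an illustration under assumptions, not the PR's code: `dialFunc`, `checkOnce`, and `countingConn` are hypothetical names, and an `io.Closer` stands in for the gRPC connection.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// dialFunc is a hypothetical stand-in for whatever opens the connection to
// the registration socket.
type dialFunc func() (io.Closer, error)

// checkOnce opens a fresh connection, runs call against it, and always closes
// the connection before returning, whether or not call succeeded.
func checkOnce(dial dialFunc, call func(io.Closer) error) (err error) {
	conn, err := dial()
	if err != nil {
		return fmt.Errorf("error connecting: %w", err)
	}
	defer func() {
		// Report a close error only if the call itself succeeded.
		if cerr := conn.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	return call(conn)
}

// countingConn is a fake connection that records how often it is closed.
type countingConn struct{ closed *int }

func (c countingConn) Close() error { *c.closed++; return nil }

// demo runs three checks; the second one fails, yet its connection is closed too.
func demo() (opened, closed, failed int) {
	dial := func() (io.Closer, error) {
		opened++
		return countingConn{closed: &closed}, nil
	}
	for i := 0; i < 3; i++ {
		call := func(io.Closer) error {
			if i == 1 {
				return errors.New("no response")
			}
			return nil
		}
		if err := checkOnce(dial, call); err != nil {
			failed++
		}
	}
	return opened, closed, failed
}

func main() {
	opened, closed, failed := demo()
	fmt.Printf("opened=%d closed=%d failed=%d\n", opened, closed, failed)
	// prints: opened=3 closed=3 failed=1
}
```

Dialing per check costs a connection setup each time, but every check starts from a clean state and nothing is left open between periodic runs.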

Comment on lines +162 to +163
if info.Name == csiDriverName {
	return nil
}
return fmt.Errorf("invalid driver name %s", info.Name)

Will this be useful? Did we ever see CSI drivers changing their name?

Contributor Author

I do not think the driver name should change without requiring re-registration, right?
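
For reference, the comparison in this thread can be exercised in isolation. This is a minimal sketch, assuming a hypothetical `pluginInfo` struct in place of the GetInfo response and made-up driver names.

```go
package main

import "fmt"

// pluginInfo is a hypothetical stand-in for the part of the registration
// response used here; only the driver name matters for this check.
type pluginInfo struct {
	Name string
}

// validateDriverName mirrors the comparison from the diff: the name reported
// over the registration socket must match the name the registrar started with.
func validateDriverName(info pluginInfo, csiDriverName string) error {
	if info.Name == csiDriverName {
		return nil
	}
	return fmt.Errorf("invalid driver name %s", info.Name)
}

func main() {
	// Both driver names below are made up for the example.
	fmt.Println(validateDriverName(pluginInfo{Name: "a.csi.example"}, "a.csi.example"))
	fmt.Println(validateDriverName(pluginInfo{Name: "b.csi.example"}, "a.csi.example"))
}
```

A mismatch would then fail the health check, which fits the expectation that a name change requires re-registration.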

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Aug 11, 2023
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 11, 2023
	os.Exit(1)
}

klog.V(2).Infof("CSI driver name: %q", csiDriverName)

nit: this log could have been useful


defer closeGrpcConnection(socketFile, grpcConn)

klog.V(1).Infof("Calling node registrar to check if it still responds")

nit: V(2)

@jsafrane

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 14, 2023
@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, jsafrane


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 14, 2023
@k8s-ci-robot k8s-ci-robot merged commit c9e19f2 into kubernetes-csi:master Aug 14, 2023
@mowangdk

We have a similar problem in our environment: the csi-node-driver-registrar process still exists, but the socket has been deactivated. What is the root cause of the problem? Is it an OS issue?

@mauriciopoppe

I changed the release note to "/healthz doesn't merely check the presence of the registration socket, but also checks whether it responds to gRPC requests". Unfortunately, I did it after I saw that this was already released 😅.
