DOC-967 Add more details to the node upgrade doc for Kubernetes #960
✅ Deploy Preview for redpanda-docs-preview ready!
Please have a look at comment re diagram. Otherwise, well-written!
Co-authored-by: Joyce Fee <[email protected]>
Sorry for taking so long on this one. IIRC this ask was kicked off as we were trying to determine how best to migrate an Azure cluster, as we had written up some instructions. That operation ended up going the worst way imaginable, which got me thinking more deeply about this topic, especially because the landscape of self-hosted is much broader than cloud. Here's a heavily annotated flowchart that should capture most cases. It doesn't touch on how to perform the operation manually, just whether or not it needs to be manual 😓
Reviewed with Jake on a hangout. We came away with a few changes to make, but I'm going to mark myself as an approver so I don't accidentally become a blocker.
Loose notes:
- Use `kubectl drain` instead of `kubectl delete <pod>` as drain will use eviction and cordon the node.
- Updating the STS strategy to OnDelete is generally useful but should be required if taints/tolerations/node selectors need to be upgraded.
- Guides for NodePool Upgrades and Kubernetes Upgrades can be consolidated as it's largely the same operation.
- Adding a buffer node isn't required for network backed storage but having replicas >= 3 is.
- For local volumes the flow is: `kubectl drain`, `kubectl delete pvc <pvc name>`, `kubectl delete pod <pod name>`
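For reference, the local-volume flow in the last note could be sketched roughly as below. This is only an illustration, not the procedure the doc will ship: the helper name `print_local_volume_steps`, the default namespace `redpanda`, and the extra flags (`--ignore-daemonsets`, `--delete-emptydir-data`, `--wait=false`) are assumptions that do not come from the notes. The helper only prints the commands so they can be reviewed before anyone runs them.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): prints the three-step
# local-volume sequence from the notes. It only echoes commands and
# never talks to a cluster.
print_local_volume_steps() {
  node="$1"
  pvc="$2"
  pod="$3"
  ns="${4:-redpanda}"   # assumed default namespace

  # 1. Cordon the node and evict its pods (uses the eviction API).
  #    The two flags are assumptions; adjust to your workloads.
  echo "kubectl drain $node --ignore-daemonsets --delete-emptydir-data"

  # 2. Delete the PVC bound to the node-local volume.
  #    --wait=false is an assumption so the command returns even while
  #    the PVC is still referenced by the pod.
  echo "kubectl delete pvc $pvc --namespace $ns --wait=false"

  # 3. Delete the pod so it is recreated with a fresh PVC.
  echo "kubectl delete pod $pod --namespace $ns"
}

# Placeholders are kept as in the notes; substitute real names.
print_local_volume_steps "<node name>" "<pvc name>" "<pod name>"
```

Printing rather than executing keeps the order explicit (drain, then PVC, then pod) and leaves the decision to run each step with the operator.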
Description
Review deadline: 25 Jan
We recently had a P1 related to ephemeral data loss due to how Azure handles automated node upgrades. We already implemented a clarification in docs to recommend disabling automated node upgrades: https://redpandadata.atlassian.net/browse/DOC-875
However, @chrisseto and I met and discussed improvements we can make to the node upgrade guide, including:
Also fixes https://redpandadata.atlassian.net/browse/DOC-170
Page previews
https://deploy-preview-960--redpanda-docs-preview.netlify.app/current/upgrade/k-upgrade-kubernetes/