OCPBUGS-60510: Rebase v1.32.8 to 4.19 #2411
Kubernetes official release v1.32.4
OCPBUGS-55265: Bump 1.32.4
…very feature status
…cated. We inadvertently began serving these in the default feature set beginning with 4.17.0. We intend to stop serving them in 4.20.0 and to treat this as a typical deprecated API removal.
…rry-pick-2282-to-release-4.19 [release-4.19] OCPBUGS-55895: Fix node expansion on older kubelets
The test that checks that a volume can be accessed from multiple nodes should create ReadWriteMany volume and not ReadWriteOnce.
…rry-pick-2279-to-release-4.19 [release-4.19] OCPBUGS-56193: UPSTREAM: 131236: RWX tests should create RWX volumes
Co-authored-by: Hemant Kumar <[email protected]> Signed-off-by: carlory <[email protected]>
…rry-pick-2288-to-release-4.19 [release-4.19] OCPBUGS-56256: UPSTREAM: 131495: Handle unsupported node expansion for RWX volumes
Kubernetes official release v1.32.5
OCPBUGS-56437: Bump 4.19 1.32.5
…rry-pick-2277-to-release-4.19 [release-4.19] OCPBUGS-56082: UPSTREAM: 130047: adjusting loopback certificate validity in kube-apiserver
On receiving a job deletion event, clean the backoff records for that job before enqueueing it, to avoid a race condition in which syncJob() may incorrectly use stale backoff records for a newly created job with the same key. Co-authored-by: Michal Wozniak <[email protected]>
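The ordering described in this commit message can be illustrated with a minimal sketch. This is not the job controller's actual code: `backoffStore`, `controller`, and `onJobDeleted` are hypothetical names, and the failure counter is a simplified stand-in for the real backoff records. It only shows why clearing the record before enqueueing keeps a re-created job with the same key from seeing stale state.

```go
package main

import (
	"fmt"
	"sync"
)

// backoffStore is a hypothetical, simplified stand-in for per-job backoff records.
type backoffStore struct {
	mu       sync.Mutex
	failures map[string]int
}

func newBackoffStore() *backoffStore {
	return &backoffStore{failures: make(map[string]int)}
}

// recordFailure increments the failure count kept for a job key.
func (s *backoffStore) recordFailure(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.failures[key]++
}

// failuresFor returns the failure count currently stored for a job key.
func (s *backoffStore) failuresFor(key string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.failures[key]
}

// remove drops the record for a job key.
func (s *backoffStore) remove(key string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.failures, key)
}

// controller is a hypothetical stand-in holding the store and a work queue.
type controller struct {
	backoff *backoffStore
	queue   []string
}

// onJobDeleted clears the backoff record first and only then enqueues the key,
// so a later sync of a new job with the same key starts from a clean record.
func (c *controller) onJobDeleted(key string) {
	c.backoff.remove(key)
	c.queue = append(c.queue, key)
}

func main() {
	c := &controller{backoff: newBackoffStore()}
	c.backoff.recordFailure("ns/job")
	c.onJobDeleted("ns/job")
	fmt.Println("failures after delete:", c.backoff.failuresFor("ns/job"))
}
```

If the enqueue happened before the cleanup, a worker could pick up the key while the old record still existed, which is the race the commit message describes.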
…rry-pick-2287-to-release-4.19 [release-4.19] OCPBUGS-55937: UPSTREAM: <carry>: Mark admissionregistration.k8s.io/v1beta1 as deprecated.
….patch instead of just major.minor
…icies in VAP integration tests
…rry-pick-2346-to-release-4.19 [release-4.19] OCPBUGS-58175: Fix flake caused by invalid detection of active policies in VAP integration tests
Kubernetes official release v1.32.6
…rry-pick-2319-to-release-4.19 [release-4.19] NO-JIRA: UPSTREAM: <carry>: Update rebase.sh to handle go versions major.minor.patch
[release-1.32][go] Bump images, dependencies and versions to go 1.23.11 and distroless iptables
Kubernetes official release v1.32.7
…tial, add TODOs see: kubernetes#130271
…62-release-1.32 Cherrypick 133262 remove broken test that depends on expired credential onto Release 1.32
The podresources API List implementation uses the internal data of the resource managers as the source of truth. Looking at the implementation here: https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/pkg/kubelet/apis/podresources/server_v1.go#L60 we take care of syncing the device allocation data before querying the device manager to return its pod->devices assignment. This is needed because otherwise the device manager (and all the other resource managers) would do the cleanup asynchronously, so the `List` call would return incorrect data.

But we do this syncing for neither CPUs nor memory, so when we report these we get stale data, as issue kubernetes#132020 demonstrates.

For the CPU manager, however, we have the reconcile loop, which cleans the stale data periodically. It turns out this timing interplay was the reason the existing issue kubernetes#119423 seemed fixed (see: kubernetes#119423 (comment)). But it's actually timing. If in the reproducer we set `cpuManagerReconcilePeriod` to a very high value (>= 5 minutes), then the issue still reproduces against the current master branch (https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/test/e2e_node/podresources_test.go#L983).

Taking a step back, we can see multiple problems:
1. we do not sync the resource managers' internal data before querying for pod assignment (no removeStaleState calls), but most importantly
2. the List call iterates over all the pods known to the kubelet. But the resource managers do NOT hold resources for non-running pods, so it is better, actually it's correct, to iterate only over the active pods. This will also avoid issue 1 above.

Furthermore, the resource managers all iterate over the active pods anyway.

`List` is using all the pods known about:
1. https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/pkg/kubelet/kubelet.go#L3135 goes in
2. https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/pkg/kubelet/pod/pod_manager.go#L215

But all the resource managers are using the list of active pods:
1. https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/pkg/kubelet/kubelet.go#L1666 goes in
2. https://github.com/kubernetes/kubernetes/blob/v1.34.0-alpha.0/pkg/kubelet/kubelet_pods.go#L198

So this change will also make the `List` view consistent with the resource managers' view, which is also a promise of the API that is currently broken.

We also need to acknowledge the warning in the docstring of GetActivePods. Arguably, having the endpoint use a different pod set than the resource managers, with the related desync, causes way more harm than good. And arguably, it's better to fix this issue in just one place instead of having `List` use a different pod set for unclear reasons. For these reasons, while important, I don't think the warning per se invalidates this change.

We need to further acknowledge that the `List` endpoint has used the full pod list since its inception. So, we will add a Feature Gate to disable this fix and restore the old behavior. We plan to keep this Feature Gate for quite a long time (at least 4 more releases) considering how stable this change was. Should a consumer of the API be broken by this change, we have the option to restore the old behavior and to craft a more elaborate fix.

The old `v1alpha1` endpoint is intentionally left unmodified.

***RELEASE-4.19 BACKPORT NOTE*** dropped the versioned feature gate entry as we don't have versioned feature gates in this version.

Signed-off-by: Francesco Romani <[email protected]>
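The difference between iterating over all known pods and only the active ones can be sketched as follows. This is a simplified illustration, not the kubelet's implementation: `Pod`, `assignments`, and `listPods` are hypothetical names, and the `Active` flag stands in for whatever the kubelet uses to decide that a pod is active. The stale entry for the terminated pod models state that the resource manager has not cleaned up yet.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a pod known to the kubelet.
type Pod struct {
	Name   string
	Active bool // hypothetical flag: false once the pod is no longer running
}

// assignments is a hypothetical stand-in for a resource manager's internal
// state, mapping pod name to the CPU IDs recorded for it.
type assignments map[string][]int

// listPods reports the recorded CPUs per pod. With activeOnly set, pods that
// are not active are skipped, so stale entries never reach the caller.
func listPods(pods []Pod, state assignments, activeOnly bool) map[string][]int {
	out := make(map[string][]int)
	for _, p := range pods {
		if activeOnly && !p.Active {
			continue
		}
		if cpus, ok := state[p.Name]; ok {
			out[p.Name] = cpus
		}
	}
	return out
}

func main() {
	pods := []Pod{{Name: "old", Active: false}, {Name: "new", Active: true}}
	// "old" has terminated, but its entry has not been cleaned up yet,
	// so the same CPUs appear to belong to two pods.
	state := assignments{"old": {2, 3}, "new": {2, 3}}

	fmt.Println("all pods:   ", listPods(pods, state, false))
	fmt.Println("active only:", listPods(pods, state, true))
}
```

In this sketch, filtering on the active set hides the stale entry without requiring a cleanup call first, which mirrors the argument that fixing problem 2 also avoids problem 1.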
To facilitate backports (see OCPBUGS-56785), we prefer to remove the feature gate that upstream added as a safety measure, which disables that escape hatch. This commit must be dropped once we rebase on top of 1.34. Signed-off-by: Francesco Romani <[email protected]>
OCPBUGS-59534: Rebase v1.32.7 to 4.19
…ease-1.32 Update NodeRestriction to prevent nodes from updating their OwnerReferences
…ive-pods-backport-4.19 OCPBUGS-60074: UPSTREAM: 132028: podresources: list: use active pods
Kubernetes official release v1.32.8
@dusk125: This pull request references Jira Issue OCPBUGS-60510, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
PR needs rebase.
@dusk125: This pull request references Jira Issue OCPBUGS-60510. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: dusk125
@dusk125: The following tests failed, say
No description provided.