Skip to content

feat: add a controller to label Nodes with PD compatibility labels#2257

Merged
k8s-ci-robot merged 1 commit intokubernetes-sigs:masterfrom
hdp617:feat/pd-node-label
Feb 4, 2026
Merged

feat: add a controller to label Nodes with PD compatibility labels#2257
k8s-ci-robot merged 1 commit intokubernetes-sigs:masterfrom
hdp617:feat/pd-node-label

Conversation

@hdp617
Copy link
Copy Markdown
Contributor

@hdp617 hdp617 commented Dec 18, 2025

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespaces from that line:

/kind api-change
/kind bug
/kind cleanup
/kind design
/kind documentation
/kind failing-test
/kind feature
/kind flake

/kind feature

What this PR does / why we need it:
This PR introduces a new component gce-pd-node-labeler. It's a controller implemented with controller-runtime, which automatically labels Nodes with PD labels. The source of truth is a ConfigMap that can be configured with --configmap-name and --configmap-namespace flags.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:
Re. testing, I've verified with a k8s-on-gce cluster with 2 gen2 nodes and 2 gen4 nodes:

kubectl get nodes -L node.kubernetes.io/instance-type
NAME                               STATUS   ROLES           AGE     VERSION   INSTANCE-TYPE
control-plane-us-central1-a-xp2s   Ready    control-plane   7h18m   v1.33.6   e2-medium
nodes-gen2-6vfp                    Ready    node            7h10m   v1.33.6   n2-standard-4
nodes-gen2-dshk                    Ready    node            7h6m    v1.33.6   n2-standard-4
nodes-gen4-g4z2                    Ready    node            7h1m    v1.33.6   n4-standard-4
nodes-gen4-l17x                    Ready    node            6h57m   v1.33.6   n4-standard-4

Add a configmap (not actually SoT):

apiVersion: v1
kind: ConfigMap
metadata:
  name: machine-pd-compatibility
  namespace: gce-pd-csi-driver
data:
  machine-pd-compatibility.json: |-
    {
      "n2": {
        "pd-standard": true
      },
      "n4": {
        "hyperdisk-balanced": true,
      },
    }

Build the container image and add this component as a sidecar in the csi-gce-pd-controller:

      - args:
        - --v=5
        - --configmap-name=machine-pd-compatibility
        - --configmap-namespace=gce-pd-csi-driver
        image: us-central1-docker.pkg.dev/<PROJECT_ID>/gce-pd-node-labeler/gce-pd-node-labeler:test_linux_amd64
        imagePullPolicy: Always
        name: gce-pd-node-labeler
        resources:
          requests:
            cpu: 10m
            memory: 20Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File

And verify that the labels are added.

Does this PR introduce a user-facing change?:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Dec 18, 2025
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Welcome @hdp617!

It looks like this is your first PR to kubernetes-sigs/gcp-compute-persistent-disk-csi-driver 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gcp-compute-persistent-disk-csi-driver has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Hi @hdp617. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Dec 18, 2025
@hdp617 hdp617 marked this pull request as ready for review December 18, 2025 17:19
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 18, 2025
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 22, 2025
@hdp617 hdp617 force-pushed the feat/pd-node-label branch from fed4ded to d823964 Compare December 22, 2025 21:15
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 22, 2025
@sunnylovestiramisu
Copy link
Copy Markdown
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 5, 2026
@hdp617
Copy link
Copy Markdown
Contributor Author

hdp617 commented Jan 12, 2026

/retest

}

// UpdateMachinePDCompatibility updates the controller's machine PD compatibility map from a ConfigMap.
func (r *Reconciler) UpdateMachinePDCompatibility(ctx context.Context) error {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does not seem thread-safe, will there be some race of updating the map?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using a mutex in updateMap() below when updating the map. Any race conditions that you're seeing?

@sunnylovestiramisu
Copy link
Copy Markdown
Contributor

Last but not least, how about documentation update in the repo(feel free to do it in another PR as well)?

@hdp617
Copy link
Copy Markdown
Contributor Author

hdp617 commented Jan 13, 2026

@sunnylovestiramisu Thanks for looking into this PR. I've replied to a few comments and will address the rest soon.

@hdp617 hdp617 force-pushed the feat/pd-node-label branch from d823964 to 360a4bb Compare January 13, 2026 17:55
@hdp617
Copy link
Copy Markdown
Contributor Author

hdp617 commented Jan 13, 2026

/retest-required

@hdp617
Copy link
Copy Markdown
Contributor Author

hdp617 commented Jan 15, 2026

@sunnylovestiramisu @mattcary @hajiler Would you mind taking another look? Thanks!

@sunnylovestiramisu
Copy link
Copy Markdown
Contributor

Please provide end to end testing results before the PR review meeting.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2026
@hdp617 hdp617 force-pushed the feat/pd-node-label branch from 65c4b9c to 3938196 Compare January 30, 2026 19:09
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 30, 2026
@hajiler
Copy link
Copy Markdown
Contributor

hajiler commented Feb 2, 2026

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 2, 2026
@hdp617 hdp617 force-pushed the feat/pd-node-label branch from 3938196 to 761f970 Compare February 3, 2026 17:56
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2026
@hdp617
Copy link
Copy Markdown
Contributor Author

hdp617 commented Feb 3, 2026

Quick summary of the offline discussion with @hajiler below. Feel free to add if I missed something. Would you mind re-LGTM?

  • Added the logic to remove labels to align with GKE's behavior. This is more for consistency and completeness, in practice this case is unlikely as removing disk support is a breaking change.
  • One consideration is with confidential nodes which can have different PD support. Customers can override the ConfigMap. We can also consider adding an overlay with a ConfigMap for confidential nodes.

I've squashed the commits. @sunnylovestiramisu Could you give the final approval? Thanks!

@hdp617
Copy link
Copy Markdown
Contributor Author

hdp617 commented Feb 3, 2026

/retest-required

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

@hdp617: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2019 761f970 link false /test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2019
pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2022 761f970 link false /test pull-gcp-compute-persistent-disk-csi-driver-e2e-windows-2022

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@sunnylovestiramisu
Copy link
Copy Markdown
Contributor

/approve

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hdp617, sunnylovestiramisu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2026
@hajiler
Copy link
Copy Markdown
Contributor

hajiler commented Feb 4, 2026

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 4, 2026
@k8s-ci-robot k8s-ci-robot merged commit 2866b0c into kubernetes-sigs:master Feb 4, 2026
8 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants