Retrieve node name from metadata service #2279

Merged
k8s-ci-robot merged 1 commit into kubernetes-sigs:master from cemakd:urgent on Mar 2, 2026

Conversation

@cemakd
Contributor

@cemakd cemakd commented Mar 2, 2026

What type of PR is this?
/kind bug

What this PR does / why we need it:
Use the MetadataServer to retrieve Node name similar to #2277

Which issue(s) this PR fixes:
Fixes #2276

Without this fix we see:

Warning  FailedAttachVolume  4m57s (x39 over 80m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-507bb954-97fa-40aa-98c1-c78e849cac0d" : rpc error: code = InvalidArgument desc = Failed to get instance: googleapi: Error 400: Invalid value for field 'instance': 'test-gcp10-wf7zf-worker-c-6kgqn.c.ocpstrat-1278.internal'. Must be a match of regex '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?|[1-9][0-9]{0,19}'

GCE operations such as attach still identify the instance by its short name, which is what we used to get from the nodeID. We need to switch back to using the short name instead of the long name.
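
To illustrate why the long name is rejected, here is a minimal sketch that checks both names against the pattern quoted in the error above. It is not code from this PR: `isValidInstanceName` is a hypothetical helper, and anchoring the pattern so the whole string must match is an assumption about how the API applies it.

```go
package main

import (
	"fmt"
	"regexp"
)

// gceInstanceName mirrors the pattern quoted in the API error message.
// The ^...$ anchors are an assumption: the whole value is expected to match.
var gceInstanceName = regexp.MustCompile(`^(?:[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?|[1-9][0-9]{0,19})$`)

// isValidInstanceName is a hypothetical helper, not part of the driver.
func isValidInstanceName(name string) bool {
	return gceInstanceName.MatchString(name)
}

func main() {
	names := []string{
		"test-gcp10-wf7zf-worker-c-6kgqn",                          // short GCE instance name
		"test-gcp10-wf7zf-worker-c-6kgqn.c.ocpstrat-1278.internal", // long cluster Node name
	}
	for _, name := range names {
		fmt.Printf("%s valid=%t\n", name, isValidInstanceName(name))
	}
}
```

The long name fails because the pattern does not allow the `.` character, so no dotted hostname can be passed as the `instance` field.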

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fix Attach Failure for VM with Long Name Format

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 2, 2026
@k8s-ci-robot k8s-ci-robot requested review from mattcary and tyuchn March 2, 2026 18:14
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Mar 2, 2026
Undo the changes except line 762

Fix unit tests
@sunnylovestiramisu
Contributor

It is not clear what error this will cause, because we can use the long name as the fully qualified name.

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 2, 2026
@sunnylovestiramisu
Contributor

sunnylovestiramisu commented Mar 2, 2026

A sample error is:

Warning  FailedAttachVolume  58m (x6 over 80m)     attachdetach-controller  AttachVolume.Attach failed for volume "pvc-507bb954-97fa-40aa-98c1-c78e849cac0d" : rpc error: code = InvalidArgument desc = ControllerPublish not permitted on node "projects/ocpstrat-1278/zones/us-central1-c/instances/test-gcp10-wf7zf-worker-c-6kgqn.c.ocpstrat-1278.internal" due to backoff condition

Warning  FailedAttachVolume  4m57s (x39 over 80m)  attachdetach-controller  AttachVolume.Attach failed for volume "pvc-507bb954-97fa-40aa-98c1-c78e849cac0d" : rpc error: code = InvalidArgument desc = Failed to get instance: googleapi: Error 400: Invalid value for field 'instance': 'test-gcp10-wf7zf-worker-c-6kgqn.c.ocpstrat-1278.internal'. Must be a match of regex '[a-z](?:[-a-z0-9]{0,61}[a-z0-9])?|[1-9][0-9]{0,19}'

Should we fix the regex instead?

@suzezhang

We don't want to change the regex for the GCE API here. The real issue is that projects/ocpstrat-1278/zones/us-central1-c/instances/test-gcp10-wf7zf-worker-c-6kgqn.c.ocpstrat-1278.internal does not exist in GCE. There is a mismatch between the GCE Instance name and the cluster Node name:

  • test-gcp10-wf7zf-worker-c-6kgqn is the GCE Instance name.
  • test-gcp10-wf7zf-worker-c-6kgqn.c.ocpstrat-1278.internal is the cluster Node name.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Mar 2, 2026
@sunnylovestiramisu
Contributor

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 2, 2026
@hajiler
Contributor

hajiler commented Mar 2, 2026

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 2, 2026
@sunnylovestiramisu
Contributor

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cemakd, sunnylovestiramisu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 2, 2026
@k8s-ci-robot k8s-ci-robot merged commit 0ac530d into kubernetes-sigs:master Mar 2, 2026
8 of 10 checks passed
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

node-restriction.kubernetes.io/gke-volume-attach-limit-override does not work if k8 cluster node name is different from VM instance name

5 participants