Skip to content

Fix GCP label issues#42466

Closed
atburke wants to merge 4 commits intomasterfrom
atburke/gcp-metadata-yandex
Closed

Fix GCP label issues#42466
atburke wants to merge 4 commits intomasterfrom
atburke/gcp-metadata-yandex

Conversation

@atburke
Copy link
Copy Markdown
Contributor

@atburke atburke commented Jun 5, 2024

This change:

  • Fixes Teleport starting up on isolated cloud nodes.
  • Changes imds errors to logs in lib/service, so an unexpected imds error doesn't prevent Teleport from starting.

Fixes #42312.

Changelog: Fixed crashes related to importing GCP labels

@atburke atburke added test-plan-problem Issues which have been surfaced by running the manual release test plan backport/branch/v15 labels Jun 5, 2024
@github-actions github-actions Bot requested review from greedy52 and jakule June 5, 2024 00:08
Comment thread lib/cloud/gcp/vm.go Outdated
Comment thread lib/service/service.go Outdated
return nil, trace.Wrap(err)
}
id, err := strconv.ParseUint(idStr, 10, 64)
id, err := client.getNumericID(ctx)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if this fails, are we going to continue to call GetTags on a periodic basis and continue to return the same error over and over again?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For getNumericID() in particular, it's called in IsAvailable(), so if it fails (e.g. because we're on Yandex) Teleport won't create a GCP metadata client at all. In general, yes errors here are returned periodically, but we only log them once.

func (l *CloudImporter) Sync(ctx context.Context) error {
tags, err := l.Client.GetTags(ctx)
if err != nil {
if trace.IsNotFound(err) || trace.IsAccessDenied(err) {
// Only show the error the first time around.
l.instanceTagsNotFoundOnce.Do(func() {
l.Log.Warning(l.instanceMetadataHint)
})
return nil
}
return trace.Wrap(err)
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this fix #42312?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't, that issue is fixed by the logging changes in lib/service (I originally also tried to parse the *url.Errors, but per feedback on this PR I removed that as it was redundant).

Copy link
Copy Markdown
Contributor

@greedy52 greedy52 Jun 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does the logging change fix that?

I see, now we are just logging the error instead of failing. Ignore this comment.

Comment thread lib/service/service.go Outdated
return nil, trace.Wrap(err)
}
id, err := strconv.ParseUint(idStr, 10, 64)
id, err := client.getNumericID(ctx)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this fix #42312?

@klizhentas klizhentas closed this Jun 8, 2024
@klizhentas klizhentas deleted the atburke/gcp-metadata-yandex branch June 8, 2024 01:40
@klizhentas
Copy link
Copy Markdown
Contributor

This PR is closed as Yandex Cloud is not supported. The fix should be rescoped to address

GCP and AWS only.

#42312

Please resubmit once you address the comments and changes.

@gravitational gravitational deleted a comment from greedy52 Jun 8, 2024
@gravitational gravitational deleted a comment from atburke Jun 8, 2024
@gravitational gravitational deleted a comment from greedy52 Jun 8, 2024
@gravitational gravitational deleted a comment from atburke Jun 8, 2024
@gravitational gravitational deleted a comment from greedy52 Jun 8, 2024
@gravitational gravitational deleted a comment from atburke Jun 8, 2024
@roman-timoshevskii
Copy link
Copy Markdown

Hi @klizhentas,

According to the documentation Teleport should work on any k8s deployment including bare-metal installations as far as I understand.
So why we cannot just configure it not to check some specific clouds related metadata?
It worked for years in our cloud and now we cannot just update it anymore because of this issue.

Could you please tell if there's any chance to implement some workaround so we can continue using it as before?

Thanks in advance.

@drybalka-s
Copy link
Copy Markdown

drybalka-s commented Jun 11, 2024

I confirm that Yandex Cloud has broken since version 15.3.6
error example

WARN falling back to IMDSv1: operation error ec2imds: getToken, http response error StatusCode: 405, request to EC2 IMDS failed
ERROR: initialization failed
  strconv.ParseUint: parsing "fv53ug7t483ddt97mnih": invalid syntax

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/sm test-plan-problem Issues which have been surfaced by running the manual release test plan

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CrashLoopBackOff in Teleport clusteron GCP with v15.3.7

6 participants