Log troubleshooting information when InvalidInstanceID errors are found during EC2 discovery #25257

Merged
lxea merged 5 commits into master from lxea/instance-state-error-message
May 4, 2023

@lxea
Contributor

@lxea lxea commented Apr 27, 2023

Fix for #24176

@lxea lxea force-pushed the lxea/instance-state-error-message branch 2 times, most recently from 61663c1 to 92d6e76 Compare April 27, 2023 13:10
Comment thread docs/pages/server-access/guides/ec2-discovery.mdx Outdated
Comment thread docs/pages/server-access/guides/ec2-discovery.mdx Outdated
Comment thread docs/pages/server-access/guides/ec2-discovery.mdx Outdated
@lxea lxea force-pushed the lxea/instance-state-error-message branch 2 times, most recently from 4f4be99 to 86586f1 Compare April 27, 2023 14:29
@ptgott
Contributor

ptgott commented Apr 27, 2023

@lxea Does this need to be backported?

@lxea lxea force-pushed the lxea/instance-state-error-message branch from 86586f1 to cf65ad2 Compare April 28, 2023 10:00
@lxea lxea enabled auto-merge April 28, 2023 10:00
Comment thread lib/srv/discovery/discovery.go Outdated
@lxea lxea force-pushed the lxea/instance-state-error-message branch from cf65ad2 to 159c9b5 Compare May 2, 2023 10:10
Comment thread docs/pages/server-access/guides/ec2-discovery.mdx
Comment thread lib/srv/discovery/discovery.go Outdated
if trace.IsNotFound(err) {
	var aErr awserr.Error
	if errors.As(err, &aErr) && aErr.Code() == ssm.ErrCodeInvalidInstanceId {
		s.Log.WithError(err).Error("Invalid instance ID found. This can happen if the instance does not have a running SSM agent registered with the SSM endpoint (may require reinstalling the SSM Agent, or giving the instance IAM permissions to receive SSM commands), or the discovery instance does not have permissions to access the node.")
Collaborator

Let's rephrase the error message to:

SSM RunCommand on instance i-xyz123 failed with ErrCodeInvalidInstanceId. Make sure that the instance has AmazonSSMManagedInstanceCore policy assigned. Also check that SSM agent is running and registered with the SSM endpoint on that instance and try restarting or reinstalling it in case of issues. See <link to AWS reference page> for more details.

Contributor Author

I've updated this with the link, but there doesn't seem to be a way to get the actual instances it failed on into the message. They are included in the error, though.

@lxea lxea force-pushed the lxea/instance-state-error-message branch from 159c9b5 to 2f487f9 Compare May 4, 2023 10:36
Comment thread lib/srv/discovery/discovery.go Outdated
@lxea lxea added this pull request to the merge queue May 4, 2023
@lxea lxea removed this pull request from the merge queue due to a manual request May 4, 2023
@lxea lxea enabled auto-merge May 4, 2023 15:19
@lxea lxea added this pull request to the merge queue May 4, 2023
Merged via the queue into master with commit 73bc17c May 4, 2023
@lxea lxea deleted the lxea/instance-state-error-message branch May 4, 2023 15:50
@public-teleport-github-review-bot

@lxea See the table below for backport results.

| Branch | Result |
| --- | --- |
| branch/v11 | Failed |
| branch/v12 | Create PR |
| branch/v13 | Create PR |

6 participants