This repository was archived by the owner on Jan 8, 2024. It is now read-only.

serverinstall/nomad: Spend more time looking at waypoint-runner allocation on install to ensure start up #2698

Merged (4 commits) on Nov 10, 2021

briancain
Member

Prior to this commit, the server install command only checked for the
first moment an allocation was considered running. This is generally
OK; however, if the Nomad job fails a moment later (for example, a
static runner failing to connect back to Waypoint Server), the runner
allocation exits but the server install command still reports the
installation as successful.

This is especially important when users are configuring Consul DNS
with Waypoint Server installed to Nomad. The static runner might
fail to connect through the Consul DNS hostname and exit, leaving the
installation without a static runner while the CLI claims the install
succeeded.

This commit fixes that by re-checking the scheduled allocation for a few
seconds once it is in a "running" state, to validate that it properly
started up beyond the first few moments of the job.

Fixes #2683

This commit introduces a simple function to wait for a Nomad evaluation
to be running and DRYs up a few places in the install func where we use
it.
@briancain briancain force-pushed the serverinstall/nomad/wait-longer-for-runner branch from 2295abf to 7dbbc98 Compare November 8, 2021 17:18
Contributor

@krantzinator krantzinator left a comment

Looks like this will be a bit friendlier UX when errors happen!
