Added release server publishing retry#34605
Conversation
|
The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with |
| - docker run -i -v /tmpfs/creds:/tmpfs/creds -e DRONE_REPO -e DRONE_TAG -e RELCLI_BASE_URL | ||
| -e RELCLI_CERT -e RELCLI_KEY $RELCLI_IMAGE auto_publish -f -v 6 || true | ||
| - docker run -i -v /tmpfs/creds:/tmpfs/creds -e DRONE_REPO -e DRONE_TAG -e RELCLI_BASE_URL | ||
| -e RELCLI_CERT -e RELCLI_KEY $RELCLI_IMAGE auto_publish -f -v 6 || true | ||
| - docker run -i -v /tmpfs/creds:/tmpfs/creds -e DRONE_REPO -e DRONE_TAG -e RELCLI_BASE_URL | ||
| -e RELCLI_CERT -e RELCLI_KEY $RELCLI_IMAGE auto_publish -f -v 6 |
There was a problem hiding this comment.
Wouldn't this just always call the command 3 times? Can we instead wrap this in a "for" loop with a few iterations and abort it on success? Also, why 3? I think @logand22 said for him it worked on like 25th attempt, can we just make it retry 50 times or something?
There was a problem hiding this comment.
It worked on second attempt for me. I mentioned 25 because I called the command many times after the first success to make sure it never occurred after the first success.
There was a problem hiding this comment.
If it works for Logan on the second attempt and for me (consistently) on the third, then three or more times should be sufficient.
I can put a shell for loop in here, but adding shell logic to dronegen to be outputted into a yaml shell script is... messy. I took this approach because I am reasonably confident that it will work, and is simple enough to debug issues that may arise with it.
There was a problem hiding this comment.
I think all it needs is || \ on the end instead of || true. Then it is one long list that stops when one of the commands succeeds and fails when all fail.
There was a problem hiding this comment.
But if we do want to do 50 times or whatever, then a loop is a lot better. It shouldn't be that messy - let me knock something up.
Change the drone generation to use a loop to run the `auto_publish`
relcli command instead of listing them one-by-one and loop 10 times
instead of 3. The loop will terminate the first time `relcli` succeeds.
The loop has an `|| false` at the end to ensure the loop command fails
if all invocations of `relcli` fail. With `set -e`, even though the exit
status of the loop is non-zero, the shell seems to continue. With the
`|| false` at the end, it makes it exit on failure. I'm not sure exactly
how drone runs the commands so this may not be necessary but it seems
safer.
e.g.
set -e
for i in $(seq 10); do false && break; done
echo hello
This will echo "hello" even though all invocations inside the loop
failed.
set -e
for i in $(seq 10); do false && break; done || false
echo hello
This will not echo "hello" - `set -e` causes an exit before that command
due to the `|| false`.
|
@fheinecke See the table below for backport results.
|
The release server publishing is failing due to a TLS issue. While it is being resolved I've added a retry to our pipelines to publish three times instead of once.
This should be removed once the issue is fixed.