Closed
@@ -151,12 +151,26 @@ objects:
break
done

+function isRouterReady() {
+  local deployment
+  deployment=${1:-"router/deploy"}
+  local ns
+  ns=${2:-"openshift-ingress"}
+  if [[ "${deployment}" == "ds/router-default" ]]; then
+    ready=$(oc get "${deployment}" -n "${ns}" -o go-template='{{ne "0" (print .status.numberReady)}}')
+    [[ "${ready}" == "true" ]]
+    return
+  fi
+
+  oc wait "${deployment}" -n "${ns}" --for condition=available --timeout=10m
Contributor

Does this --timeout cause each iteration to wait 10 minutes in addition to the 60 seconds in the loop?!

Contributor Author

That was the original wait code: it was (oc wait 10 mins + 1 min sleep) * 10 retries.
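The arithmetic behind that worst case can be checked directly. This is a rough sketch read off the loop in this diff; it assumes every `oc wait` blocks for its full timeout, and the variable names are illustrative only. The result is slightly under (10 + 1) * 10 minutes because the loop exits before the final sleep.

```shell
#!/usr/bin/env bash
# Worst-case duration of the original retry loop:
# 10 attempts of `oc wait --timeout=10m`, with `sleep 60` after every
# failed attempt except the last one (the loop exits before sleeping).
max_retries=10
wait_timeout_secs=$((10 * 60))
sleep_secs=60

worst_case_secs=$((max_retries * wait_timeout_secs + (max_retries - 1) * sleep_secs))
echo "worst case: $((worst_case_secs / 60)) minutes"   # prints "worst case: 109 minutes"
```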

Contributor

I wonder why? Seems like a good way to waste like 10 minutes during setup... it takes a matter of seconds for the router to deploy.

Contributor

Or does this have to account for a large portion of the general cluster setup?

Contributor Author

Well, it would depend on how fast the deployment occurs. This will return (fail) if there is no deployment object, but once the deployment object exists, it still needs to wait for it to become available. So rather than looping and retrying the oc wait operation, just waiting on it inline makes sense.
And it wouldn't necessarily be 10 minutes (that's the worst-case scenario): if the object becomes available, it returns before the timeout.
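The behaviour described here (`oc wait` fails when the object does not exist yet, and otherwise blocks until the condition is met or the timeout expires) suggests splitting the wait into two phases. A minimal sketch, assuming a hypothetical helper named `waitForRouter` with illustrative attempt and interval parameters; it is not the code in this PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this PR: poll until the object exists,
# then issue a single blocking `oc wait` on it.
function waitForRouter() {
  local deployment=$1
  local ns=$2
  local attempts=${3:-30}   # assumed: number of existence checks
  local interval=${4:-10}   # assumed: seconds between existence checks

  local i
  for ((i = 0; i < attempts; i++)); do
    if oc get "${deployment}" -n "${ns}" >/dev/null 2>&1; then
      # The object exists: block until it is available. `oc wait` returns
      # as soon as the condition is met, so 10m is only the upper bound.
      oc wait "${deployment}" -n "${ns}" --for condition=available --timeout=10m
      return
    fi
    sleep "${interval}"
  done

  echo "timeout waiting for ${ns}/${deployment} to be created" >&2
  return 1
}
```

With this shape the short polling loop only covers the period before the object is created, and the long timeout is paid at most once.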

Contributor

We were looping waiting because of a bug in wait.

Member

Here is a sketch of something that may be more robust for looping on blocking calls.

+}
+
 i=0
 MAX_RETRIES=10
-until oc wait "${ROUTER_DEPLOYMENT}" -n "${ROUTER_NAMESPACE}" --for condition=available --timeout=10m || [ $i -eq $MAX_RETRIES ]; do
+until isRouterReady "${ROUTER_DEPLOYMENT}" "${ROUTER_NAMESPACE}" || [ $i -eq $MAX_RETRIES ]; do
   i=$((i+1))
-  [ $i -eq $MAX_RETRIES ] && echo "timeout waiting for router to be available" && exit 1
-  echo "error deploy/router did not come up"
+  [ $i -eq $MAX_RETRIES ] && echo "timeout waiting for ${ROUTER_NAMESPACE}/${ROUTER_DEPLOYMENT} to be available" && exit 1
+  echo "error ${ROUTER_NAMESPACE}/${ROUTER_DEPLOYMENT} did not come up"
   sleep 60
 done
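
One way to bound the total time spent looping on a blocking call is to track an overall deadline rather than a retry count. A minimal sketch, assuming a hypothetical `retryUntilDeadline` helper; it is an independent illustration, not the reviewer's sketch and not part of this PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or an overall
# deadline (in seconds) has passed, sleeping between failed attempts.
function retryUntilDeadline() {
  local deadline_secs=$1
  local interval=$2
  shift 2

  # SECONDS is a bash builtin that counts seconds since the shell started.
  local start=${SECONDS}
  until "$@"; do
    if (( SECONDS - start >= deadline_secs )); then
      echo "timeout after ${deadline_secs}s running: $*" >&2
      return 1
    fi
    sleep "${interval}"
  done
}
```

It could be called as `retryUntilDeadline 600 60 isRouterReady "${ROUTER_DEPLOYMENT}" "${ROUTER_NAMESPACE}"`. The deadline is only checked between attempts, so a blocking call with its own long timeout can still overrun it by one attempt.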
