Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -373,8 +373,7 @@ objects:
exit 1
fi

/bin/openshift-install --dir=/tmp/artifacts/installer create cluster &
wait "$!"
/bin/openshift-install --dir=/tmp/artifacts/installer create cluster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was an artifact of some previous refactoring. Backgrounding the process and then immediately waiting on it should be equivalent to just running it in the foreground.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've filed #2680, rolling this back and adding an additional wait to our TERM handlers. As discussed in that PR, we need the backgrounded create cluster so the shell will fire the TERM handler which TERMs the child processes. As master stands (with this PR but without #2680), aborting an e2e job will TERM the setup container's shell process, but that process will not forward the TERM to openshift-install and will wait until it exits normally. I haven't actually tested with a wrapping container, but my guess is that this means that jobs being aborted during openshift-install are mostly hitting their gentle-shutdown timeout and getting KILLed. With #2680, the forwarded TERM gives the installer a chance at a gentle shutdown (although at the moment, I don't think the installer takes advantage of that opportunity).


# Performs cleanup of all created resources
- name: teardown
Expand Down Expand Up @@ -421,7 +420,34 @@ objects:
export PATH=$PATH:/tmp/shared

echo "Gathering artifacts ..."
mkdir -p /tmp/artifacts/pods /tmp/artifacts/nodes /tmp/artifacts/metrics
mkdir -p /tmp/artifacts/pods /tmp/artifacts/nodes /tmp/artifacts/metrics /tmp/artifacts/bootstrap

if [ -f /tmp/artifacts/installer/terraform.tfstate ]
then
# we don't have jq, so the python equivalent of
# jq '.modules[].resources."aws_instance.bootstrap".primary.attributes."public_ip" | select(.)'
bootstrap_ip=$(python -c \
'import sys, json; d=reduce(lambda x,y: dict(x.items() + y.items()), map(lambda x: x["resources"], json.load(sys.stdin)["modules"])); k="aws_instance.bootstrap"; print d[k]["primary"]["attributes"]["public_ip"] if k in d else ""' \
< /tmp/artifacts/installer/terraform.tfstate
)

if [ -n "${bootstrap_ip}" ]
then
for service in bootkube openshift kubelet crio
do
queue "/tmp/artifacts/bootstrap/${service}.service" curl \
--insecure \
--silent \
--connect-timeout 5 \
--retry 3 \
--cert /tmp/artifacts/installer/tls/journal-gatewayd.crt \
--key /tmp/artifacts/installer/tls/journal-gatewayd.key \
--url "https://${bootstrap_ip}:19531/entries?_SYSTEMD_UNIT=${service}.service"
done
fi
else
echo "No terraform statefile found. Skipping collection of bootstrap logs."
fi

oc --request-timeout=5s get nodes -o jsonpath --template '{range .items[*]}{.metadata.name}{"\n"}{end}' > /tmp/nodes
oc --request-timeout=5s get pods --all-namespaces --template '{{ range .items }}{{ $name := .metadata.name }}{{ $ns := .metadata.namespace }}{{ range .spec.containers }}-n {{ $ns }} {{ $name }} -c {{ .name }}{{ "\n" }}{{ end }}{{ range .spec.initContainers }}-n {{ $ns }} {{ $name }} -c {{ .name }}{{ "\n" }}{{ end }}{{ end }}' > /tmp/containers
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -338,8 +338,7 @@ objects:
exit 1
fi

/bin/openshift-install --dir=/tmp/artifacts/installer create cluster &
wait "$!"
/bin/openshift-install --dir=/tmp/artifacts/installer create cluster
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you touch this? Won't it break the /tmp/exit-setting trap we set at the beginning of this script? Previous discussion in d862f6f (#1957).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, previous discussion here. I still think this change breaks the trap.


# Performs cleanup of all created resources
- name: teardown
Expand Down Expand Up @@ -386,7 +385,34 @@ objects:
export PATH=$PATH:/tmp/shared

echo "Gathering artifacts ..."
mkdir -p /tmp/artifacts/pods /tmp/artifacts/nodes /tmp/artifacts/metrics
mkdir -p /tmp/artifacts/pods /tmp/artifacts/nodes /tmp/artifacts/metrics /tmp/artifacts/bootstrap

if [ -f /tmp/artifacts/installer/terraform.tfstate ]
then
# we don't have jq, so the python equivalent of
# jq '.modules[].resources."aws_instance.bootstrap".primary.attributes."public_ip" | select(.)'
bootstrap_ip=$(python -c \
'import sys, json; d=reduce(lambda x,y: dict(x.items() + y.items()), map(lambda x: x["resources"], json.load(sys.stdin)["modules"])); k="aws_instance.bootstrap"; print d[k]["primary"]["attributes"]["public_ip"] if k in d else ""' \
< /tmp/artifacts/installer/terraform.tfstate
)

if [ -n "${bootstrap_ip}" ]
then
for service in bootkube openshift kubelet crio
do
queue "/tmp/artifacts/bootstrap/${service}.service" curl \
--insecure \
--silent \
--connect-timeout 5 \
--retry 3 \
--cert /tmp/artifacts/installer/tls/journal-gatewayd.crt \
--key /tmp/artifacts/installer/tls/journal-gatewayd.key \
--url "https://${bootstrap_ip}:19531/entries?_SYSTEMD_UNIT=${service}.service"
done
fi
else
echo "No terraform statefile found. Skipping collection of bootstrap logs."
fi

oc --request-timeout=5s get nodes -o jsonpath --template '{range .items[*]}{.metadata.name}{"\n"}{end}' > /tmp/nodes
oc --request-timeout=5s get pods --all-namespaces --template '{{ range .items }}{{ $name := .metadata.name }}{{ $ns := .metadata.namespace }}{{ range .spec.containers }}-n {{ $ns }} {{ $name }} -c {{ .name }}{{ "\n" }}{{ end }}{{ range .spec.initContainers }}-n {{ $ns }} {{ $name }} -c {{ .name }}{{ "\n" }}{{ end }}{{ end }}' > /tmp/containers
Expand Down