Closed
@@ -368,6 +368,77 @@ objects:
/bin/openshift-install --dir=/tmp/artifacts/installer create cluster &
wait "$!"

# Stream "journalctl -u bootkube" and "kubectl get all -a --all-namespaces -w" into artifact
- name: stream-bootstrapping
image: ${IMAGE_INSTALLER}
volumeMounts:
- name: shared-tmp
mountPath: /tmp/shared
- name: cluster-profile
mountPath: /etc/openshift-installer
- name: artifacts
mountPath: /tmp/artifacts
command:
- /bin/bash
- -c
- |
#!/bin/bash
trap 'kill $(jobs -p); exit 0' TERM

mkdir -p .ssh
chmod 700 .ssh
cp /etc/openshift-installer/ssh-privatekey .ssh/id_rsa
cp /etc/openshift-installer/ssh-publickey .ssh/id_rsa.pub

function stream-bootkube () {
Contributor

This is super complicated. Why isn't the installer fetching the bootkube logs if anything fails?

ip=${1}
while true; do
ssh -o "StrictHostKeyChecking=no" core@${ip} sudo journalctl -u bootkube.service -f --no-tail 2>&1
echo "=================== journalctl terminated ==================="
date
Member

Can we embed this in the termination marker? It feels like it might be associated with the next iteration's logs if it comes after a big banner. Something like:

echo "========== journalctl terminated $(date --iso=s --utc) =========="

would do it.
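
A minimal sketch of how the suggested marker could sit inside the retry loop. `retry_stream` and its `max_runs` bound are hypothetical, added only so the example terminates; the loop in this diff runs forever with a `sleep 5` between attempts. The `date -u +%Y-%m-%dT%H:%M:%SZ` form is assumed here as a portable stand-in for `date --iso=s --utc`.

```shell
#!/bin/bash
# Hypothetical helper (not part of this PR): run a command repeatedly and
# close each run with a marker line that carries its own UTC timestamp.
retry_stream() {
  local label=$1 max_runs=$2 run=1
  shift 2
  while [ "${run}" -le "${max_runs}" ]; do
    # Run the streaming command; its stderr is folded into the same log.
    "$@" 2>&1
    # The timestamp lives inside the banner, so it cannot be mistaken for
    # the first line of the next run's output.
    echo "========== ${label} terminated $(date -u +%Y-%m-%dT%H:%M:%SZ) =========="
    run=$((run + 1))
  done
}

# Demo with a stand-in command instead of ssh + journalctl.
retry_stream "journalctl" 2 echo "log line"
```

In the real container the stand-in `echo` would be the `ssh ... journalctl` call, and the whole loop would still be redirected to `/tmp/artifacts/bootstrap/bootkube.log`.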

sleep 5
done > /tmp/artifacts/bootstrap/bootkube.log
}

function stream-kubectl-get-all () {
ip=${1}
while true; do
ssh -o "StrictHostKeyChecking=no" core@${ip} sudo kubectl get all --kubeconfig=/var/opt/tectonic/auth/kubeconfig -a --all-namespaces -w 2>&1
echo "=================== kubectl get all ... -w terminated ==================="
date
sleep 5
done > /tmp/artifacts/bootstrap/kubectl-get-all.log
}

function start-streams () {
# wait for terraform to start up
while [ ! -f /tmp/artifacts/installer/terraform.tfstate ]; do sleep 5; done

# wait for the bootstrap node to show up with an IP
ip=""
while [ -z "${ip}" ]; do
ip=$(terraform state show -state=terraform.tfstate module.bootstrap.aws_instance.bootstrap | sed -n 's/^public_ip *= *//p')
Member

@smarterclayton wanted this IP in the build logs (I think?), but I'm not clear whether there's a way to get information there from a pod container. Any ideas?

done

# try to login
while ! ssh -o "StrictHostKeyChecking=no" core@${ip} /bin/true; do
sleep 5
done

# start streaming
mkdir -p /tmp/artifacts/bootstrap
stream-bootkube ${ip} &
stream-kubectl-get-all ${ip} &
}

start-streams
for i in `seq 1 180`; do
if [[ -f /tmp/shared/exit ]]; then
exit 0
fi
sleep 60 & wait
Member

I don't understand the backgrounded sleep here.

$ (sleep 10 && echo 'done sleeping') & (wait && echo 'done waiting')
done waiting  # this shows up very quickly
done sleeping  # this is delayed by 10 seconds

Why have a non-blocking sleep? I'd expect a plain blocking sleep here with a trailing kill:

for i in $(seq 1 120); do
  if [[ -f /tmp/shared/exit ]]; then
    break
  fi
  sleep 60
done
kill-streams

where kill-streams had internal wait calls (possibly using explicit PIDs).

Contributor Author

The loop with the sleep is copied from the teardown script below.

Contributor Author

The kill-streams call is not necessary because of the TERM trap.
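
For context on this thread: per the Bash manual, a trap for a signal that arrives while a foreground command is running is not executed until that command completes, whereas a signal that arrives while the shell is in the `wait` builtin makes `wait` return immediately, with the trap running right after. That is the usual reason for writing `sleep 60 & wait` in a script that has a `TERM` trap. The sketch below is an illustration under those assumptions, not code from this PR; `measure_trap_latency` is a hypothetical helper, and the 3-second sleep and 1-second delay are arbitrary values chosen to keep the demo short.

```shell
#!/bin/bash
# Hypothetical experiment (not part of this PR): time how long a child shell
# with a TERM trap takes to exit when it is signalled one second into a sleep.
measure_trap_latency() {
  local mode=$1 script pid start end
  case ${mode} in
    # Foreground sleep: bash defers the trap until sleep returns.
    foreground) script='trap "exit 0" TERM; sleep 3; :' ;;
    # Backgrounded sleep plus wait: the signal interrupts wait right away,
    # and the trap also kills the leftover sleep.
    background) script='trap "kill \$(jobs -p); exit 0" TERM; sleep 3 & wait' ;;
  esac
  start=$(date +%s)
  bash -c "${script}" &
  pid=$!
  sleep 1
  kill -TERM "${pid}"
  wait "${pid}"
  end=$(date +%s)
  # Whole seconds between starting the child and seeing it exit.
  echo $(( end - start ))
}

echo "foreground sleep: $(measure_trap_latency foreground)s"
echo "sleep & wait:     $(measure_trap_latency background)s"
```

The first figure should come out near the full sleep duration and the second near the signal delay, which is the difference between the container reacting to `TERM` within moments and reacting up to a minute later.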

done

# Performs cleanup of all created resources
- name: teardown
image: ${IMAGE_INSTALLER}