ci: add self-hosted runners #278
Comments
A self-hosted runner is ready. I just need to boot it.
This issue is about having more than one. We run quite a few jobs in parallel, and a single runner would quickly become another bottleneck.
@Itxaka I'll take care of the single-node-only setup then 👍
I will have a look at some Heat templates to easily add/remove ECP self-hosted runners.
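Not the actual ECP setup, just a minimal sketch of how a Heat ResourceGroup could make adding/removing runner VMs a one-parameter change; the image, flavor, keypair and network names below are placeholders:

heat_template_version: 2018-08-31
description: Illustrative sketch of a scalable group of runner VMs (placeholder values).
parameters:
  runner_count:
    type: number
    default: 2
resources:
  runners:
    type: OS::Heat::ResourceGroup
    properties:
      count: { get_param: runner_count }   # scale runners up/down by changing this parameter
      resource_def:
        type: OS::Nova::Server
        properties:
          name: gh-runner-%index%          # %index% gives each VM a unique name
          image: cos-vanilla               # placeholder image name
          flavor: m1.large                 # placeholder flavor
          key_name: ci-key                 # placeholder keypair
          networks:
            - network: ci-net              # placeholder network
          user_data: |
            #cloud-config
            # runner registration/bootstrap would go here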
@Itxaka added a sample k8s deployment of a GH runner here: https://github.com/rancher-sandbox/cOS-toolkit/wiki/Github-runner-on-k8s
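For reference, a minimal sketch of what such a runner Deployment could look like. This is not the wiki's manifest: the replica count, image, secret name and labels are assumptions, and the env variable names follow the myoung34/github-runner image conventions, so check the image docs before using them:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: gh-runner
spec:
  replicas: 4                      # one pod per runner (assumed count)
  selector:
    matchLabels:
      app: gh-runner
  template:
    metadata:
      labels:
        app: gh-runner
    spec:
      containers:
        - name: runner
          image: myoung34/github-runner:latest   # placeholder runner image
          env:
            - name: REPO_URL
              value: "https://github.com/rancher-sandbox/cOS-toolkit"
            - name: ACCESS_TOKEN               # PAT used to obtain a registration token
              valueFrom:
                secretKeyRef:
                  name: gh-runner-token        # placeholder secret name
                  key: token
            - name: LABELS
              value: "self-hosted,build"       # placeholder runner labels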
This is a POC of a script to deploy GitHub runners in ECP: #304. Tested and working, see the readme for details. Works... ok-ish. Workflows would need some adaptations to fully work, and the user-data might also need adaptations if we want to use this, but it makes no sense to develop it further.
#319 should fix the problem for the time being. The templating mechanism supports switching to local runners - I've added ~8 of them without noticing notable performance gains other than the increased parallelism, although that wouldn't last long, as each run spawns many more than 8 parallel jobs. The pipeline has been reworked and build times have shrunk - the template supports using the local runner only as a build node and not as a test node (as testing requires virtualization and such). To bring up the workers, I've created a cOS VM with the following cloud-init config:

name: "Default user"
stages:
  boot:
    - name: "Hostname and setup"
      hostname: "cos-node-1"
      commands:
        - echo 1 > /proc/sys/net/ipv6/conf/all/disable_ipv6
      dns:
        nameservers:
          - X
      # commands:
      # - passwd -d root
  network:
    - name: "Setup SSH keys"
      authorized_keys:
        admincos:
          - github:mudler
        root:
          - github:mudler
    - if: '[ -z "$(blkid -L COS_SYSTEM || true)" ]'
      name: "Load persisted ssh fingerprint"
      commands:
        - |
          # load ssh fingerprint
          if [ ! -d /usr/local/etc/ssh ]; then
            systemctl start sshd
            mkdir /usr/local/etc/ssh || true
            for i in /etc/ssh/*.pub; do cp -rf $i /usr/local/etc/ssh; done
          fi
    - name: "Setup k3s"
      if: '[ -z "$(blkid -L COS_SYSTEM || true)" ]'
      directories:
        - path: "/usr/local/bin"
          permissions: 0755
          owner: 0
          group: 0
      commands:
        - |
          curl -sfL https://get.k3s.io | \
            INSTALL_K3S_VERSION="v1.20.4+k3s1" \
            INSTALL_K3S_EXEC="--tls-san additional-outside-ip" \
            INSTALL_K3S_SELINUX_WARN="true" \
            sh -
  initramfs:
    - if: '[ -z "$(blkid -L COS_SYSTEM || true)" ]'
      name: "Persist"
      commands:
        - |
          target=/usr/local/.cos-state
          # Always want the latest update of systemd conf from the image
          mkdir -p ${target}/etc/systemd/
          rsync -av /etc/systemd/ ${target}/etc/systemd/
          # Only populate ssh conf once
          if [ ! -e ${target}/etc/ssh ]; then
            mkdir -p ${target}/etc/ssh/
            rsync -av /etc/ssh/ ${target}/etc/ssh/
          fi
          # make /tmp tmpfs
          cp -f /usr/share/systemd/tmp.mount ${target}/etc/systemd/system/
          # undo /home /opt mount from cos immutable-rootfs module
          sed -i '/overlay \/home /d' /etc/fstab
          sed -i '/overlay \/opt /d' /etc/fstab
          umount /home
          umount /opt
          # setup directories as persistent
          for i in root opt home var/lib/rancher var/lib/kubelet etc/systemd etc/rancher etc/ssh usr/libexec; do
            mkdir -p ${target}/$i /$i
            mount ${target}/$i /$i -t none -o bind
          done
          # This is hidden so that if you run some selinux label checking or relabeling the bind
          # mount won't screw up things. If you have two files at different paths they will get
          # labeled with two different labels.
          mkdir -p ${target}/empty
          mount ${target}/empty ${target} -o bind,ro
          # persist machine-id
          if [ -s /usr/local/etc/machine-id ]; then
            cat /usr/local/etc/machine-id > /etc/machine-id
          else
            mkdir -p /usr/local/etc
            cp /etc/machine-id /usr/local/etc
          fi
          # ensure /var/log/journal exists so it's labeled correctly
          mkdir -p /var/log/journal
    - name: "Setup users"
      users:
        admincos:
          homedir: "/home/admincos"
    - name: "groups"
      ensure_entities:
        - entity: |
            kind: "group"
            group_name: "wheel"
            password: "x"
            gid: 1020
            users: "admincos"
      files:
        - path: "/etc/sudoers.d/wheel"
          owner: 0
          group: 0
          permission: 0600
          content: |
            %wheel ALL=(ALL) NOPASSWD: ALL
        - path: "/etc/modprobe.d/ipv6.conf"
          owner: 0
          group: 0
          permission: 0664
          content: |
            alias net-pf-10 off
            alias ipv6 off
            options ipv6 disable_ipv6=1

For the GH deployment, I've followed https://github.com/rancher-sandbox/cOS-toolkit/wiki/Github-runner-on-k8s
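Since the comment above mentions using the local runners only as build nodes, here is a rough sketch of how a workflow could be adapted; the `build-host` label and the job steps are assumptions, not taken from the actual templates:

jobs:
  build:
    # heavy compilation goes to the self-hosted pool (assumed label)
    runs-on: [self-hosted, build-host]
    steps:
      - uses: actions/checkout@v2
      - run: make build
  test:
    # tests need virtualization, so they stay on GitHub-hosted runners
    runs-on: ubuntu-latest
    needs: build
    steps:
      - uses: actions/checkout@v2
      - run: make test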
…her#278) This commit makes upgrade|reset|install create and update the `state.yaml` file, including system-wide data (deployed images, partition labels, etc.). It introduces the concept of installation state and stores that data in a `state.yaml` file in two different locations, the state partition root and the recovery partition root. The purpose of this duplication is to always be able to find the state.yaml file in a known location regardless of the image we are booting.
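Purely as an illustration of the kind of data the commit message describes, not the real state.yaml schema; every field name below is invented:

# hypothetical example only: invented field names, not the actual schema
state:
  label: COS_STATE                          # label of the state partition
  active:
    image: example.org/cos/system:latest    # image currently deployed as "active"
recovery:
  label: COS_RECOVERY                       # label of the recovery partition
  image: example.org/cos/recovery:latest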
We are at capacity with the GHA concurrent jobs limits, and this slows down development quite a lot.
Let's see if we can configure AWS spot instances or whatever else can provide a bunch of runners to run in this repository.
e.g. by following along the lines of https://github.com/philips-labs/terraform-aws-github-runner to run our tests on. We also have to figure out whether we can run vbox (or qemu) to run our test suite on top.