AGENT-863: node-joiner cluster script #8242
Merged: openshift-merge-bot merged 9 commits into openshift:master from andfasano:agent-day2-cluster-script, Apr 26, 2024
3a580b0 modify the Dockerfile.ci to build and ship node-joiner tool, along wi… (andfasano)
3615943 add node-joiner.sh script and related documentation (andfasano)
5a3e543 removed usage of pull-secret. (andfasano)
2fd74d7 Apply suggestions from code review (andfasano)
a60f6d1 move add node docs into its own folder (andfasano)
9367c3b rename add node iso artifact (andfasano)
364b2a6 lint fix (andfasano)
93890a4 move the node-joiner binary into the baremetal-installer image (andfasano)
89bcfdf update the script to use the cluster pull-secret to retrieve the bare… (andfasano)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| # Adding a node via the node-joiner tool | ||
|
|
||
| ## Pre-requisites | ||
| 1. The `oc` tool must be available in the execution environment (the "user host"). | ||
| 2. The user host has a valid network connection to the target OpenShift cluster to be expanded. | ||
|
|
||
| ## Setup | ||
| 1. Download the [node-joiner.sh](./node-joiner.sh) script into a working directory on | ||
| the user host (the "assets folder"). | ||
| 2. Create a `nodes-config.yaml` file in the assets folder. This configuration file must list | ||
| all the nodes to add to the target cluster. At a minimum, each node's hostname and primary interface MAC address must be specified. For example: | ||
| ``` | ||
| hosts: | ||
| - hostname: extra-worker-0 | ||
|   interfaces: | ||
|   - name: eth0 | ||
|     macAddress: 00:02:46:e3:9e:7c | ||
| - hostname: extra-worker-1 | ||
|   interfaces: | ||
|   - name: eth0 | ||
|     macAddress: 00:02:46:e3:9e:8c | ||
| - hostname: extra-worker-2 | ||
|   interfaces: | ||
|   - name: eth0 | ||
|     macAddress: 00:02:46:e3:9e:9c | ||
| ``` | ||
| 3. Optionally, an `NMState` configuration block can be specified for each node under the `networkConfig` key | ||
| (it is applied during the first boot). For example: | ||
| ``` | ||
| hosts: | ||
| - hostname: extra-worker-0 | ||
|   interfaces: | ||
|   - name: eth0 | ||
|     macAddress: 00:02:46:e3:9e:7c | ||
|   networkConfig: | ||
|     interfaces: | ||
|     - name: eth0 | ||
|       type: ethernet | ||
|       state: up | ||
|       mac-address: 00:02:46:e3:9e:7c | ||
|       ipv4: | ||
|         enabled: true | ||
|         address: | ||
|         - ip: 192.168.111.90 | ||
|           prefix-length: 24 | ||
|         dhcp: false | ||
|     dns-resolver: | ||
|       config: | ||
|         server: | ||
|         - 192.168.111.1 | ||
|     routes: | ||
|       config: | ||
|       - destination: 0.0.0.0/0 | ||
|         next-hop-address: 192.168.111.1 | ||
|         next-hop-interface: eth0 | ||
|         table-id: 254 | ||
| - hostname: extra-worker-1 | ||
|   interfaces: | ||
|   - name: eth0 | ||
|     macAddress: 00:02:46:e3:9e:8c | ||
| - hostname: extra-worker-2 | ||
|   interfaces: | ||
|   - name: eth0 | ||
|     macAddress: 00:02:46:e3:9e:9c | ||
| ``` | ||
|
|
||
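Since each host is identified by its primary interface MAC address, a malformed address is an easy mistake to make in this file. The following is a minimal sketch, not part of this PR, that checks every `macAddress` value before the script is run. The function names `is_valid_mac` and `check_nodes_config` are hypothetical, and the line-based `sed` parsing assumes the simple unquoted layout used in the examples above rather than arbitrary YAML. The `mac-address` keys inside `networkConfig` are not checked.

```bash
# Hypothetical helper: succeed only for a colon-separated 48-bit MAC address.
is_valid_mac() {
  printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Hypothetical helper: report every invalid "macAddress:" value found in the
# given file and return non-zero if there is at least one.
# Assumes one unquoted "macAddress: <value>" pair per line.
check_nodes_config() {
  bad=$(sed -n 's/^[[:space:]]*macAddress:[[:space:]]*//p' "$1" |
    while read -r mac; do
      is_valid_mac "$mac" || printf '%s\n' "$mac"
    done)
  if [ -n "$bad" ]; then
    echo "Invalid MAC address(es):" >&2
    printf '%s\n' "$bad" >&2
    return 1
  fi
}
```

With these helpers loaded, `check_nodes_config nodes-config.yaml && ./node-joiner.sh` would skip the ISO generation when an address is malformed.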
| ## ISO generation | ||
| Run the [node-joiner.sh](./node-joiner.sh) script: | ||
| ```bash | ||
| $ ./node-joiner.sh | ||
| ``` | ||
| The script creates a temporary namespace, prefixed with `openshift-node-joiner`, in the target cluster, | ||
| and launches a pod in it to run the node-joiner workload. | ||
| On success, the `node.x86_64.iso` ISO image is downloaded into the assets folder. | ||
|
|
||
| ### Configuration file name | ||
| By default, the script looks for a configuration file named `nodes-config.yaml`. A different | ||
| config file name can be passed as the first parameter of the script: | ||
|
|
||
| ```bash | ||
| $ ./node-joiner.sh config.yaml | ||
| ``` | ||
|
|
||
| ## Nodes joining | ||
| Use the ISO image to boot all the nodes listed in the configuration file, and wait for the related | ||
| certificate signing requests (CSRs) to appear. Each new node generates two pending CSRs, | ||
| which must be manually approved by the user. | ||
| Use the following command to monitor the pending CSRs: | ||
| ``` | ||
| $ oc get csr | ||
| ``` | ||
| Use the `oc adm certificate approve` command to approve them: | ||
| ``` | ||
| $ oc adm certificate approve <csr_name> | ||
| ``` | ||
| Once all the pending CSRs have been approved, the new node becomes available: | ||
| ``` | ||
| $ oc get nodes | ||
| NAME STATUS ROLES AGE VERSION | ||
| extra-worker-0 Ready worker 1h v1.29.3+8628c3c | ||
| master-0 Ready control-plane,master 31h v1.29.3+8628c3c | ||
| master-1 Ready control-plane,master 32h v1.29.3+8628c3c | ||
| master-2 Ready control-plane,master 32h v1.29.3+8628c3c | ||
| ``` |
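Because every new node produces two pending CSRs, approving them one by one gets repetitive when several nodes are added. The following is a minimal sketch, not part of this PR, that approves all CSRs that are still pending. The function names `list_pending_csrs` and `approve_pending_csrs` are hypothetical, and the sketch assumes that a CSR with an empty `status` is a pending one. It approves every pending request without inspecting the requestor, so the output of `oc get csr` should be reviewed first.

```bash
# Hypothetical helper: print the names of CSRs whose status is still empty,
# which is how a pending request shows up in the API.
list_pending_csrs() {
  oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}'
}

# Hypothetical helper: approve every pending CSR, one at a time.
approve_pending_csrs() {
  list_pending_csrs | while read -r csr; do
    if [ -n "$csr" ]; then
      # </dev/null keeps oc from consuming the list being read by the loop.
      oc adm certificate approve "$csr" </dev/null
    fi
  done
}
```

The second CSR of a node typically appears only after the first one has been approved, so `approve_pending_csrs` may need to be run more than once.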
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,144 @@ | ||
| #!/bin/bash | ||
|
|
||
| set -eu | ||
|
|
||
| # Config file | ||
| nodesConfigFile=${1:-"nodes-config.yaml"} | ||
| if [ ! -f "$nodesConfigFile" ]; then | ||
|   echo "Cannot find the config file $nodesConfigFile" | ||
|   exit 1 | ||
| fi | ||
|
|
||
| # Set up a cleanup function to remove the temporary | ||
| # file when the script exits. | ||
| cleanup() { | ||
|   # pullSecretFile may still be unset here if mktemp fails (set -u is active). | ||
|   if [ -n "${pullSecretFile:-}" ] && [ -f "$pullSecretFile" ]; then | ||
|     echo "Removing temporary file $pullSecretFile" | ||
|     rm "$pullSecretFile" | ||
|   fi | ||
| } | ||
| trap cleanup EXIT TERM | ||
|
|
||
| # Retrieve the pull secret and store it in a temporary file. | ||
| pullSecretFile=$(mktemp -p "/tmp" -t "nodejoiner-XXXXXXXXXX") | ||
| oc get secret -n openshift-config pull-secret -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d > "$pullSecretFile" | ||
|
|
||
| # Extract the baremetal-installer image pullspec from the current cluster. | ||
| nodeJoinerPullspec=$(oc adm release info --image-for=baremetal-installer --registry-config="$pullSecretFile") | ||
|
|
||
| # Use the same random temp file suffix for the namespace. | ||
| namespace=$(echo "openshift-node-joiner-${pullSecretFile#/tmp/nodejoiner-}" | tr '[:upper:]' '[:lower:]') | ||
|
|
||
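The namespace line above relies on two steps: the `${pullSecretFile#/tmp/nodejoiner-}` expansion strips the fixed prefix and keeps only the random suffix generated by `mktemp`, and `tr` lowercases it, since `mktemp` can emit uppercase letters while Kubernetes namespace names must be lowercase. The following is a minimal sketch of that derivation on a hard-coded sample path. The function name `namespace_for` and the sample suffix are hypothetical.

```bash
# Hypothetical helper mirroring the derivation used by the script.
namespace_for() {
  suffix=${1#/tmp/nodejoiner-}   # drop the fixed prefix, keep the random part
  printf 'openshift-node-joiner-%s\n' "$suffix" | tr '[:upper:]' '[:lower:]'
}

# Sample path for illustration only; a real one comes from mktemp.
namespace_for "/tmp/nodejoiner-aB3xY9kLmQ"
# prints: openshift-node-joiner-ab3xy9klmq
```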
| # Create the namespace to run the node-joiner, along with the required roles and bindings. | ||
| staticResources=$(cat <<EOF | ||
| apiVersion: v1 | ||
| kind: Namespace | ||
| metadata: | ||
|   name: ${namespace} | ||
| --- | ||
| apiVersion: v1 | ||
| kind: ServiceAccount | ||
| metadata: | ||
|   name: node-joiner | ||
|   namespace: ${namespace} | ||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: ClusterRole | ||
| metadata: | ||
|   name: node-joiner | ||
| rules: | ||
| - apiGroups: | ||
|   - config.openshift.io | ||
|   resources: | ||
|   - clusterversions | ||
|   - proxies | ||
|   verbs: | ||
|   - get | ||
| - apiGroups: | ||
|   - "" | ||
|   resources: | ||
|   - secrets | ||
|   - configmaps | ||
|   - nodes | ||
|   verbs: | ||
|   - get | ||
|   - list | ||
| --- | ||
| apiVersion: rbac.authorization.k8s.io/v1 | ||
| kind: ClusterRoleBinding | ||
| metadata: | ||
|   name: node-joiner | ||
| subjects: | ||
| - kind: ServiceAccount | ||
|   name: node-joiner | ||
|   namespace: ${namespace} | ||
| roleRef: | ||
|   kind: ClusterRole | ||
|   name: node-joiner | ||
|   apiGroup: rbac.authorization.k8s.io | ||
| EOF | ||
| ) | ||
| echo "$staticResources" | oc apply -f - | ||
|
|
||
| # Generate a configMap to store the user configuration | ||
| oc create configmap nodes-config --from-file=nodes-config.yaml="${nodesConfigFile}" -n "${namespace}" -o yaml --dry-run=client | oc apply -f - | ||
|
|
||
| # Run the node-joiner pod to generate the ISO | ||
| nodeJoinerPod=$(cat <<EOF | ||
| apiVersion: v1 | ||
| kind: Pod | ||
| metadata: | ||
|   name: node-joiner | ||
|   namespace: ${namespace} | ||
|   annotations: | ||
|     openshift.io/scc: anyuid | ||
|   labels: | ||
|     app: node-joiner | ||
| spec: | ||
|   restartPolicy: Never | ||
|   serviceAccountName: node-joiner | ||
|   securityContext: | ||
|     seccompProfile: | ||
|       type: RuntimeDefault | ||
|   containers: | ||
|   - name: node-joiner | ||
|     imagePullPolicy: IfNotPresent | ||
|     image: $nodeJoinerPullspec | ||
|     volumeMounts: | ||
|     - name: nodes-config | ||
|       mountPath: /config | ||
|     - name: assets | ||
|       mountPath: /assets | ||
|     command: ["/bin/sh", "-c", "cp /config/nodes-config.yaml /assets; HOME=/assets node-joiner add-nodes --dir=/assets --log-level=debug; sleep 600"] | ||
|   volumes: | ||
|   - name: nodes-config | ||
|     configMap: | ||
|       name: nodes-config | ||
|       namespace: ${namespace} | ||
|   - name: assets | ||
|     emptyDir: | ||
|       sizeLimit: "4Gi" | ||
| EOF | ||
| ) | ||
| echo "$nodeJoinerPod" | oc apply -f - | ||
|
|
||
| while true; do | ||
|   if oc exec node-joiner -n "${namespace}" -- test -e /assets/exit_code >/dev/null 2>&1; then | ||
|     break | ||
|   else | ||
|     echo "Waiting for node-joiner pod to complete..." | ||
|     sleep 10s | ||
|   fi | ||
| done | ||
|
|
||
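The loop above polls without an upper bound, so if the pod never writes `/assets/exit_code` (for example because it cannot be scheduled), the script keeps waiting. The following is a minimal sketch of a bounded variant, not part of this PR. The function name `wait_until` and the timeout values in the usage comment are assumptions chosen for illustration.

```bash
# Hypothetical helper: run a command every <interval> seconds until it
# succeeds, giving up after roughly <timeout> seconds.
# Usage: wait_until <timeout> <interval> <command> [args...]
wait_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Possible use in the script (values are illustrative):
#   wait_until 1800 10 oc exec node-joiner -n "${namespace}" -- \
#     test -e /assets/exit_code || echo "Timed out waiting for node-joiner"
```

Only the sleep time is counted, so the real wall-clock limit is somewhat longer than the timeout when the probed command itself is slow.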
| res=$(oc exec node-joiner -n "${namespace}" -- cat /assets/exit_code) | ||
| if [ "$res" = 0 ]; then | ||
|   echo "node-joiner successfully completed, extracting ISO image..." | ||
|   oc cp -n "${namespace}" node-joiner:/assets/node.x86_64.iso node.x86_64.iso | ||
| else | ||
|   oc logs node-joiner -n "${namespace}" | ||
|   echo "node-joiner failed" | ||
| fi | ||
|
|
||
| echo "Cleaning up" | ||
|
||
| oc delete namespace "${namespace}" --grace-period=0 >/dev/null 2>&1 & | ||
RBAC for Secrets in all namespaces is something to tidy up at some point.
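To see concretely what this comment refers to, one could ask the API server whether the `node-joiner` service account may read Secrets in every namespace. The following is a minimal sketch, not part of this PR, built on `oc auth can-i` with impersonation. The function name `sa_can` is hypothetical, and the sketch assumes that the ClusterRoleBinding created by the script is in place and that the calling user is allowed to impersonate service accounts.

```bash
# Hypothetical helper: ask whether the node-joiner service account in the
# given namespace may perform <verb> on <resource> in every namespace.
# Usage: sa_can <namespace> <verb> <resource>
sa_can() {
  oc auth can-i "$2" "$3" --all-namespaces \
    --as="system:serviceaccount:$1:node-joiner"
}

# Possible use while the temporary namespace still exists:
#   sa_can "${namespace}" list secrets
```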