Snippets and notes about how to fix problems where a task was too complex to set up.
See the notes below on how to spin up a new cluster from backups:
Manual backups:
- Dump: pg_dump -h 192.168.20.204 -U nextcloud -W -d nextcloud > ./nextcloud_backup.sql
- Restore: psql -h 192.168.20.204 -U nextcloud -W -d nextcloud < ./nextcloud_backup.sql
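If the plain SQL dump gets large, a custom-format dump is worth considering since it is compressed and can be restored with pg_restore; same host and database as above, just a sketch:
# Compressed custom-format dump
pg_dump -h 192.168.20.204 -U nextcloud -W -d nextcloud -Fc -f ./nextcloud_backup.dump
# Restore it, dropping existing objects first
pg_restore -h 192.168.20.204 -U nextcloud -W -d nextcloud --clean --if-exists ./nextcloud_backup.dump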
There is no easy way of doing this; CloudNativePG does not support in-place major version upgrades, so the data has to be imported into a new cluster.
Checklist:
- Create new manifests for a new cluster in kubernetes/main/apps/databases/cloudnative-pg/clusters. Don't forget to add the version to the names.
- DO NOT add a new loadbalancer just yet.
- See https://cloudnative-pg.io/documentation/1.20/database_import/ for more information (a rough example of an import manifest follows after this checklist).
- Scale down the services that use postgres.
- Create a new database backup:
  kubectl create job --from=cronjob/postgres-backup -n databases major-upgrade-pg-backup
- Deploy the new cluster.
- Update the ext-postgres-operator config to start using the new cluster.
- Add a new cronjob for simple-pg-backup with a matching version.
- Migrate each service to the new cluster and don't forget to move backups from the old version to the new version.
- Delete the old postgres cluster by removing the manifests in kubernetes/main/apps/databases/cloudnative-pg/clusters.
- Deploy the new loadbalancer.
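For reference, a rough sketch of what the new cluster manifest can look like when importing data from the old cluster. The names, image version, storage size, and secret name below are made-up placeholders, not the actual manifests in this repo:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: postgres17
  namespace: databases
spec:
  instances: 3
  imageName: ghcr.io/cloudnative-pg/postgresql:17.2
  storage:
    size: 20Gi
  bootstrap:
    initdb:
      # Logical import as described in the database_import docs linked above.
      # "microservice" imports one database per cluster; "monolith" can pull everything at once.
      import:
        type: microservice
        databases:
          - nextcloud
        source:
          externalCluster: old-cluster
  externalClusters:
    - name: old-cluster
      connectionParameters:
        # The -rw service of the old cluster in the databases namespace
        host: postgres16-rw.databases.svc
        user: postgres
        dbname: postgres
      password:
        name: postgres16-superuser
        key: password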
I ran into an issue where I had to reset the Rook-Ceph cluster due to restructuring the repo. I should have been more careful, but it was also a learning experience. To fully reset the cluster I had to go through the following steps:
- Suspend the Flux reconciliations: flux suspend kustomization rook-ceph-cluster and flux suspend kustomization rook-ceph-operator
- Delete the file system:
kubectl delete cephfilesystem -n rook-ceph myfs
- Might need to handle finalizers in some cases:
kubectl patch cephfilesystem -n rook-ceph myfs -p '{"metadata":{"finalizers":[]}}' --type=merge
- Delete the cluster:
kubectl delete cephclusters.ceph.rook.io -n rook-ceph rook-ceph
- Handle cluster finalizers:
kubectl patch cephclusters.ceph.rook.io -n rook-ceph rook-ceph -p '{"metadata":{"finalizers":[]}}' --type=merge
- Delete all resources:
kubectl delete all -n rook-ceph --force --grace-period=0
- Delete all CRDs that start with ceph* (see the one-liner after this list)
- Wipe disks:
kubectl apply -f kubernetes/tools/rook/wipe-job.yaml
- Reset nodes and reboot:
talosctl reset --system-labels-to-wipe=STATE,EPHEMERAL --reboot --graceful=true -n <IP>
- Apply config again:
talosctl apply-config -n <IP> -f infrastructure/talos/clusterconfig/<CONFIG FILE>.yaml --insecure
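For the "delete all CRDs that start with ceph*" step, a rough one-liner; it assumes all of them live in the ceph.rook.io API group, which is the case in a default Rook install:
# Delete every Rook/Ceph CRD without waiting for each one to finish
kubectl get crd -o name | grep 'ceph.rook.io' | xargs -r kubectl delete --wait=false
# If any of them hang in Terminating, strip their finalizers the same way as above
for crd in $(kubectl get crd -o name | grep 'ceph.rook.io'); do
  kubectl patch "$crd" -p '{"metadata":{"finalizers":[]}}' --type=merge
done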
I had an issue where the /var directory on some of my nodes was filling up. It seems to have been containerd cache that didn't get cleared correctly. I fixed this by resetting the node and applying the Talos config again:
- Reset nodes and reboot:
talosctl reset --system-labels-to-wipe=STATE,EPHEMERAL --reboot --graceful=true -n <IP>
- Apply config again:
talosctl apply-config -n <IP> -f infrastructure/talos/clusterconfig/<CONFIG FILE>.yaml --insecure
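To see which node is affected before resetting, and to check that it comes back healthy afterwards, something like this works (flags from memory, double-check against talosctl --help):
# How full is the EPHEMERAL partition mounted at /var?
talosctl -n <IP> mounts
# What is actually eating the space (containerd lives under /var)?
talosctl -n <IP> usage -d 2 /var
# After reset + apply-config, make sure the node rejoins cleanly
talosctl -n <IP> health
kubectl get nodes -o wide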
Note to self: do not update over Wi-Fi, and remember to scale down the zigbee2mqtt pod in the cluster first (scale commands are sketched after the flashing steps below).
First upgrade the firmware:
- Go to the device's ESPHome page: http://192.168.70.56/
- Toggle "Prep the cc2652p2 for firmware update"
- Run:
git clone https://github.com/JelmerT/cc2538-bsl.git
curl -L \
-o CC1352P2_CC2652P_launchpad_coordinator_20210708.zip \
https://github.com/Koenkk/Z-Stack-firmware/blob/master/coordinator/Z-Stack_3.x.0/bin/CC1352P2_CC2652P_launchpad_coordinator_20210708.zip?raw=true
unzip CC1352P2_CC2652P_launchpad_coordinator_20210708.zip
cd cc2538-bsl
python3 ./cc2538-bsl.py -p socket://192.168.70.56:6638 -evw ../CC1352P2_CC2652P_launchpad_coordinator_20210708.hex
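The scale-down mentioned in the note above, assuming zigbee2mqtt runs as a deployment named zigbee2mqtt in a home-automation namespace (adjust to match the cluster):
# Stop zigbee2mqtt so it does not grab the coordinator's serial socket while flashing
kubectl scale deployment zigbee2mqtt -n home-automation --replicas=0
# When the firmware update is done, bring it back
kubectl scale deployment zigbee2mqtt -n home-automation --replicas=1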
If we get the error "pg_basebackup: error: backup failed: ERROR: file name too long for tar format", then we need to drop the vector indexes first:
DROP INDEX clip_index;
DROP INDEX face_index;
Get all replicas up and running and then recreate the indexes:
SET vectors.pgvector_compatibility=on;
CREATE INDEX IF NOT EXISTS clip_index ON smart_search
USING hnsw (embedding vector_cosine_ops)
WITH (ef_construction = 300, m = 16);
CREATE INDEX IF NOT EXISTS face_index ON face_search
USING hnsw (embedding vector_cosine_ops)
WITH (ef_construction = 300, m = 16);
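A sketch of running the statements above through the CloudNativePG primary instead of a local psql, assuming the primary pod is postgres16-1 in the databases namespace and the database is named immich (all of those names are guesses, adjust them):
# Drop the vector indexes before the backup job runs
kubectl exec -n databases postgres16-1 -c postgres -- psql -U postgres -d immich \
  -c 'DROP INDEX clip_index;' -c 'DROP INDEX face_index;'
# Recreate them once every replica is back up
kubectl exec -n databases postgres16-1 -c postgres -- psql -U postgres -d immich -c "
  SET vectors.pgvector_compatibility=on;
  CREATE INDEX IF NOT EXISTS clip_index ON smart_search
    USING hnsw (embedding vector_cosine_ops)
    WITH (ef_construction = 300, m = 16);
  CREATE INDEX IF NOT EXISTS face_index ON face_search
    USING hnsw (embedding vector_cosine_ops)
    WITH (ef_construction = 300, m = 16);"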
Fixed the MDS being behind on trimming by changing: k rook-ceph ceph config set mds mds_log_max_segments 256
Use k rook-ceph ceph health detail to get how far behind it is.
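A few related commands for chasing this down (same k alias for kubectl, with the rook-ceph krew plugin):
# Show the health warning and how far behind on trimming the MDS is
k rook-ceph ceph health detail
# Raise the segment limit so the MDS can catch up (the default is 128)
k rook-ceph ceph config set mds mds_log_max_segments 256
# Watch the MDS and overall cluster state until the warning clears
k rook-ceph ceph fs status
k rook-ceph ceph status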