This repository has been archived by the owner on Jan 5, 2022. It is now read-only.
forked from poseidon/typhoon
* Set Kubelet search path for flexvolume plugins to /var/lib/kubelet/volumeplugins
* Add support for flexvolume plugins on AWS, GCE, and DO
* See 9548572 which added flexvolume support for bare-metal
* Author no longer works for CoreOS / Red Hat
* Typhoon development continues as usual
* Fix digital-ocean module to pass ssh_fingerprints as a list since the module accepts a list
* Upcoming releases may begin to use features that require the `terraform-provider-ct` plugin v0.2.1
* New users should use `terraform-provider-ct` v0.2.1. Existing users can safely drop-in replace their v0.2.0 plugin with v0.2.1 as well (location referenced in ~/.terraformrc).
* See poseidon#145
* Template terraform-render-bootkube's multi-line kubeconfig output using the right indentation
* Add `kubeconfig` variable to google-cloud controllers and workers Terraform submodules
* Remove `kubeconfig_*` variables from google-cloud controllers and workers Terraform submodules
* etcd_service_ip dates back to deprecated self-hosted etcd
* Don't need to define a specific dated image. Managed instance groups do not delete instances when new images are released to a channel
* Set defaults for internal worker module's count, machine_type, and os_image
* Allow "pools" of homogeneous workers to be created using the google-cloud/kubernetes/workers module
* Allow groups of workers to be defined and joined to a cluster (i.e. worker pools)
* Move worker resources into a Terraform submodule
* Output variables needed for passing to worker pools
* Add usage docs for AWS worker pools (advanced)
* This reverts commit cce4537.
* Provider passing to child modules is complex and the behavior changed between Terraform v0.10 and v0.11. We're continuing to allow both versions, so this change should be reverted. For the time being, those using our internal Terraform modules will have to be aware of the minimum version for AWS and GCP providers; there is no good way to enforce it.
* Fix issue where worker firewall rules didn't apply to additional workers attached to a GCP cluster using the new "worker pools" feature (unreleased, poseidon#148). Solves host connection timeouts and pods not being scheduled to attached worker pools.
* Add `name` field to GCP internal worker module to represent the unique name of the worker pool
* Use `cluster_name` field of GCP internal worker module for passing the name of the cluster to which workers should be attached
* Ensure consistency between AWS and GCP platforms
* Annotate Prometheus service to scrape metrics from Prometheus itself (enables Prometheus* alerts)
* Update kube-state-metrics addon-resizer to 1.7
* Use port 8080 for kube-state-metrics
* Add PrometheusNotIngestingSamples alert rule
* Change K8SKubeletDown alert rule to fire when 10% of kubelets are down, not 1%
* prometheus-operator/prometheus-operator#1032
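The effect of the K8SKubeletDown threshold change can be illustrated with a short sketch. This is plain Python, not the PromQL expression used by the actual alert rule; the function name `kubelet_down_alert`, its parameters, and the strict greater-than comparison are assumptions made for illustration only.

```python
def kubelet_down_alert(total: int, down: int, threshold: float = 0.10) -> bool:
    """Return True when the fraction of down kubelets exceeds `threshold`.

    Hypothetical helper: the real rule is a Prometheus alert expression.
    The strict ">" comparison is an assumption, not taken from the source.
    """
    if total <= 0:
        # No kubelets are known, so there is nothing to alert on.
        return False
    return down / total > threshold


# With 100 kubelets and 2 down, the old 1% threshold fires
# while the new 10% threshold stays quiet.
old_rule = kubelet_down_alert(100, 2, threshold=0.01)
new_rule = kubelet_down_alert(100, 2, threshold=0.10)
```

On a 100-node cluster, two unreachable kubelets would have triggered the alert under the 1% rule but not under the 10% rule, which is the reduction in noise the commit aims for.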
* Calico isn't viable on Digital Ocean because their firewalls do not support IP-IP protocol. It's not viable to run a cluster without firewalls just to use Calico.
* Remove the caveat note. Don't allow users to shoot themselves in the foot
* Remove optional machine_type variable on Google Cloud
* Use controller_type and worker_type instead
* Previously, etcd secrets were erroneously distributed to worker nodes (permissions 500, ownership etcd:etcd).
* AWS and Google Cloud make use of auto-scaling groups and managed instance groups, respectively. As such, the kubeconfig is already held in cloud user-data
* Controller instances are provisioned with a kubeconfig from user-data. It's redundant to use a Terraform remote file copy step for the kubeconfig.
This reverts commit c59a9c6.
* Change EBS volume type from `standard` ("prior generation") to `gp2`. Prometheus alerts are tuned for SSDs
* Other platforms have fast enough disks by default
* Use etcd v3.3 --listen-metrics-urls to expose only metrics data via http://0.0.0.0:2381 on controllers
* Add Prometheus discovery for etcd peers on controller nodes
* Temporarily drop two noisy Prometheus alerts
* Expose etcd metrics to workers so Prometheus can run on a worker, rather than a controller
* Drop temporary firewall rules allowing Prometheus to run on a controller and scrape targets
* Related to poseidon#175
* Kubernetes recommends using the alias to fetch images from the nearest GCR regional mirror, to abstract the use of GCR, and to drop names containing 'google'
* https://groups.google.com/forum/#!msg/kubernetes-dev/ytjk_rNrTa0/3EFUHvovCAAJ
* Terraform v0.11.4 introduced changes to remote-exec that mean Typhoon bare-metal clusters require multiple runs of terraform apply to ssh and bootstrap.
* Bare-metal installs PXE boot a live instance to install to disk and then reboot from disk as controllers/workers. Terraform remote-exec has no way to "know" to wait until the reboot has occurred to kick off Kubernetes bootstrap. Previously Typhoon created a "debug" user during this install phase to allow an admin to SSH, but remote-exec would hang, trying to connect as user "core". Terraform v0.11.4 changes this behavior so remote-exec fails and a user must re-run terraform apply until succeeding.
* A new way to "trick" remote-exec into waiting for the reboot into the disk install is to run SSH on a non-standard port during the disk install. This retains the ability for an admin to SSH during install (most distros don't have this) and fixes the issue so only a single run of terraform apply is needed.
* hashicorp/terraform#17359 (comment)
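The waiting behavior this commit relies on can be sketched outside Terraform. While the installer serves SSH on a non-standard port, connections to the standard port are refused, so a client that keeps retrying the standard port only gets through once the machine has rebooted from disk. The Python below is a minimal illustration of that idea, not how Terraform's remote-exec is implemented; `wait_for_port`, its parameters, and the host name in the usage note are hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 600.0,
                  interval: float = 5.0) -> bool:
    """Retry a TCP connection until it succeeds or `timeout` seconds pass.

    Returns True once a connection is accepted, False if the deadline is
    reached first. Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A refused or timed-out connection raises OSError.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False
            # Sleep before the next attempt, without overshooting the deadline.
            time.sleep(min(interval, remaining))
```

For example, `wait_for_port("node1.example.com", 22)` would keep failing during the disk install, when SSH listens elsewhere, and return True only after the node reboots and serves SSH on port 22.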
* poseidon#145
* Additional users can be easily added upstream
Merges our changes with Typhoon.
New features
Removals
`var.ssh_authorized_keys` (list) is reverted to `var.ssh_authorized_key` (single string). I will generate a CL config with users/keys in the kubernetes project.
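As a rough idea of what generating such a config could look like, the sketch below renders the `passwd.users` section of a Container Linux Config from a mapping of users to public keys. The actual config used in the kubernetes project is not shown in this pull request; `render_clc_users`, the user name, and the key are hypothetical placeholders.

```python
import json


def render_clc_users(users: dict) -> str:
    """Render a Container Linux Config `passwd.users` section as YAML text.

    `users` maps a user name to a list of SSH public keys.
    Hypothetical helper for illustration only.
    """
    lines = ["passwd:", "  users:"]
    for name, keys in users.items():
        lines.append(f"    - name: {name}")
        lines.append("      ssh_authorized_keys:")
        for key in keys:
            # json.dumps yields a double-quoted string, which is valid YAML.
            lines.append(f"        - {json.dumps(key)}")
    return "\n".join(lines) + "\n"


# Placeholder user and key, not taken from the pull request.
snippet = render_clc_users({"core": ["ssh-rsa AAAA user@host"]})
```

Keeping users and keys in a generated config means several keys per user remain possible even though the module variable itself accepts only a single string.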