Introduce Container Linux config additive customizations #140
The issue with terraform-provider-ct patch #22 causing Terraform diffs was determined to be related to the provider updating the Ignition version. See the full description. In short, users can safely update between …
Btw, the current documentation explains how to install this in …
Thanks @alban, I may incorporate that suggestion soon. For the moment, since we instructed users to use the …
Container Linux Config snippets should be shipping for AWS, Digital Ocean, and Google Cloud in the next release (v1.9.5). I need to put some more work and thought into the bare-metal variant; it's harder.
Any progress for bare-metal? It would be great to be able to natively set things like HTTP/HTTPS proxies or custom CAs via this functionality. It's the main reason I have to maintain a fork.
There are two paths forward I've been pondering for some time.

Background

The story for cloud providers is simpler. Workers are homogeneous; they are identical. Controllers are homogeneous (at least, I feel OK accepting CLC snippets that apply equally to all controllers). Any need for different groups of workers (e.g. stable vs. alpha channel, instance type, GPU) is effectively handled by worker pools: groups of homogeneous nodes that accept the same clc_snippets. Feature requests to treat an individual node as special, with its own clc_snippet, can be shot down.

On bare-metal there are plenty of legitimate use-cases where the CLC snippet customizations must vary by node. For example, different nodes have different numbers of disks, so you'd create disk arrays differently per node. Or nodes have different NIC arrangements (some of mine have 4 bonded NICs, some have just 2, some just 1). In the same way we ask users to supply ordered lists of names, domains, and MAC addresses, to support customization on bare-metal properly we can't take just a flat list of snippets.
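For the cloud platforms, the shipped interface is a flat list applied uniformly across a pool. A minimal usage sketch, assuming variable names like controller_clc_snippets and worker_clc_snippets (which may differ by platform and version):

```hcl
module "aws-tempest" {
  source = "..." # Typhoon AWS module (real source path elided)

  # ...required cluster inputs elided...

  # Applied identically to every controller / every worker.
  controller_clc_snippets = ["${file("./snippets/ntp.yaml")}"]
  worker_clc_snippets = [
    "${file("./snippets/http-proxy.yaml")}",
    "${file("./snippets/custom-ca.yaml")}",
  ]
}
```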
Flavor 1

To support real-world bare-metal use-cases where clusters have heterogeneous hardware, we'd need the CLC snippet mechanism to be an optional list of lists, with the outer list ordered so the first element holds the snippets for the 0th controller, and so on (see the sketch after this comment). That becomes tricky to read, explain to people, and reason about, so I'm still somewhat unhappy with what it would look like. It's not a question of whether we could; it's whether there is a better way. I'd like to revisit schemes to organize these fields as a list of maps if possible. That was unreasonable to represent in Terraform last I checked, so this remains in "wait until it is right" mode.

Flavor 2

The second path is to treat bare-metal like the cloud providers and not permit per-node customization: controllers must all have the same customizations, and workers must all have the same customizations. This would be the fastest route to adding the feature today, but one I'm quite unlikely to take. Bare-metal is different from cloud; this approach misses many real-world use-cases and feels like the wrong approach.

I do care about this issue too. For one of my specialized bare-metal clusters (heterogeneous hardware), I too have a small fork (https://github.com/dghubble/typhoon/tree/disk-retention) and want to see the right solution in place.
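A hypothetical sketch of the Flavor 1 interface described above (never implemented); the positional alignment with the ordered controller list is the part that gets hard to read:

```hcl
# Hypothetical Flavor 1 interface (not implemented): one inner list of
# snippets per machine, positionally aligned with controller_names.
controller_clc_snippets = [
  ["${file("./snippets/raid10.yaml")}"],                                   # 0th controller
  ["${file("./snippets/raid1.yaml")}", "${file("./snippets/bond.yaml")}"], # 1st controller
  [],                                                                      # 2nd controller: no customization
]
```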
Thanks for the detailed update. I agree about the complexity of Flavor 1. I currently leverage controller_networkds/worker_networkds in this fashion to build bonds. My hardware is heterogeneous, but luckily, with systemd-networkd wildcarding, I can define units in a way that works on all hosts, so having the per-node list is somewhat redundant since each value is the same. The same applies to the rest of my additional config. While being able to apply special configs per node can be handy, I often find there are ways to make a single unified definition work across the fleet. Granted, I am not doing anything disk- or storage-related as you mentioned. At the moment, Flavor 2 would be enough for me to throw away my branch.
Very much looking forward to having this feature available on bare-metal as well! Maybe there is a third option, which leverages the fact that config templates can contain Go template elements. If it were made easy, via variables, to define node-specific metadata (which goes into the matchbox group file for that node), then I think it could be acceptable for all nodes to share the same snippets. If required, the snippet itself contains the logic to behave differently based on the node's metadata; it's the user's responsibility to write it that way. Example: I have nodes with 3 NICs, and on those I want to statically assign IPs in the 3 different subnets they are connected to. I run terraform apply to generate all the items, including the matchbox items. Using controller_networkds today, a custom static snippet is pushed to ALL controllers (a sketch of what such a snippet could look like follows).
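This is an illustrative reconstruction, not the commenter's original fragment (which was lost); the metadata keys are hypothetical, and it assumes matchbox still renders the config as a Go template with the node's group metadata:

```ini
# Illustrative networkd unit using Go template placeholders; dmz_ip,
# dmz_gateway, and dns_server are hypothetical group-metadata keys.
[Match]
Name=eth1

[Network]
Address={{.dmz_ip}}/24
Gateway={{.dmz_gateway}}
DNS={{.dns_server}}
```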
The only thing I have left to do, before booting a node, is to edit its generated matchbox group file and add a metadata section, along these lines:
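An illustrative group file follows (not the commenter's actual file; the selector and metadata values are placeholders):

```json
{
  "id": "node1",
  "profile": "controller",
  "selector": {
    "mac": "52:54:00:a1:9c:ae"
  },
  "metadata": {
    "dns_server": "10.0.0.2",
    "dmz_ip": "10.0.1.21",
    "dmz_gateway": "10.0.1.1"
  }
}
```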
This way I get host-specific config, generated from the metadata associated with each node, even though the snippet is shared by all nodes. An easy way to pass this metadata to Terraform is all that is missing, at least for that kind of use case.
I made a little skeleton of another potential implementation, which I think would be fairly simple to use from an end-user's perspective. I applied it to custom networkd fragments only, using the existing controller_networkds and worker_networkds; of course, something similar could be applied to other parts of the generated Ignition templates, and it could be used to generate custom metadata as well (put into the matchbox group files).

It goes something like this: a new variable is introduced to say whether you want to define custom networkd config fragments, pointing at a folder. In this folder, one fragment per node must be present (if a node does not require any custom config, an empty fragment named after the node must still exist). And that's basically it: if you activate the feature, you drop fragments into the folder (one per node) and they're put into the generated Ignition templates for the respective nodes.

Implementation

New variable:
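A rough reconstruction of what that variable could look like (the original block was lost in formatting; the name networkd_fragments_dir is an assumption):

```hcl
# Hypothetical variable (name is an assumption). A non-empty value enables
# the feature and points at a directory holding one networkd fragment per
# node, each file named after its node.
variable "networkd_fragments_dir" {
  type        = "string"
  description = "Directory with one custom networkd fragment per node (empty string disables the feature)"
  default     = ""
}
```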
Generate the list of fragments:
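A sketch, in Terraform 0.11-era syntax, of how the per-node fragments might be read (data source and variable names are assumptions; controller_names mirrors the existing ordered name list):

```hcl
# Read one fragment per controller, named after the controller. template_file
# is used here just as a way to call file() with a per-node path; fragments
# are assumed not to contain ${} interpolation sequences.
data "template_file" "controller_networkd_fragments" {
  count    = "${var.networkd_fragments_dir == "" ? 0 : length(var.controller_names)}"
  template = "${file(format("%s/%s", var.networkd_fragments_dir, element(var.controller_names, count.index)))}"
}
```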
And finally, pass them along to Typhoon:
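A sketch of the wiring into the existing input (the worker side would mirror this with its own fragments data source; module source elided):

```hcl
# Hypothetical wiring into the existing controller_networkds variable.
module "bare-metal-cluster" {
  source = "..." # Typhoon bare-metal module (real source path elided)

  # ...other required inputs elided...

  controller_networkds = ["${data.template_file.controller_networkd_fragments.*.rendered}"]
}
```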
What do you think?
I wanted a solution that supported merging multiple Container Linux Config snippets, with potentially different sets for each bare-metal machine. Much more real-world testing is needed, but I believe #269 will achieve this goal.

Rejected Approaches

A list of lists turns out to be more difficult than imagined. Even working around the usual HCL type-system problems (conditionals evaluate both sides of expressions, which surprises most people, so empty lists need a concat), Terraform's element may only be used with flat lists. The idea appears to work at first glance (defaults to no custom snippets, allows some machines to be customized and some not, etc.), but if you try to provide multiple snippets for a single machine, any after the first are silently ignored because …
A map of lists keyed by machine name seems promising, but there is a lot of subtlety here too. First, we merge with a dummy map to work around the issue where lookups on an empty map fail and conditionals evaluate both sides. Next, …
However, using bracket notation …

Candidate

#269 uses a map of lists with clever defaulting and overriding. I used a zipmap approach to ensure there is a key for every controller/worker with a default (that must be …
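From a user's perspective, the resulting interface might look roughly like the following (cluster, file, and variable names are illustrative and not necessarily the exact interface merged in #269):

```hcl
# Map of machine name -> list of Container Linux Config snippets. Machines
# without an entry simply receive no extra snippets.
module "mercury" {
  source = "..." # Typhoon bare-metal module (real source path elided)

  controller_names = ["node1"]
  worker_names     = ["node2", "node3"]

  # ...other required inputs elided...

  clc_snippets = {
    "node2" = ["${file("./snippets/raid.yaml")}"]
    "node3" = ["${file("./snippets/bond.yaml")}", "${file("./snippets/disks.yaml")}"]
  }
}
```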
@dghubble Hey. This is an awesome feature for bare-metal clusters. However, sometimes one needs the same snippets applied to coreos-installer as well. For example, if the OS drive name should be the same on all machines, but some hardware has NVMe and some SATA, you need to write custom udev rules, and it's currently not possible to implement this cleanly. Basically, it would be awesome to have snippet support for installer configs as well.
On bare-metal, for common snippets, you can put them in a list and reference that list using existing Terraform mechanisms (see #279 (comment) and the sketch below). Snippets aren't supportable in the install phase today: snippets use the Matchbox raw Ignition API (snippet merges/validation are now done where you run terraform) rather than the Container Linux Config API, raw Ignition serving doesn't evaluate Matchbox-specific variables, and the …
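A minimal sketch of the shared-list idea mentioned above (names are illustrative):

```hcl
# Define the shared snippets once...
locals {
  common_snippets = [
    "${file("./snippets/http-proxy.yaml")}",
    "${file("./snippets/custom-ca.yaml")}",
  ]
}

# ...then reference the same local for each machine inside the cluster module:
#
#   clc_snippets = {
#     "node2" = "${local.common_snippets}"
#     "node3" = "${local.common_snippets}"
#   }
```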
Feature
Introduce the ability for users to supply Container Linux Config "snippets" for controllers or workers. Snippets are valid Container Linux Configs which are additively merged into the base Container Linux Configs that Typhoon uses to provision controllers and workers. Configs are validated and merged into a single config used for instance user-data. Validation errors are shown, with line numbers, during `terraform plan`. This is a major feature and has been long awaited.
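For illustration, a snippet is just a small, valid Container Linux Config; a hypothetical example that adds a systemd drop-in:

```yaml
# Hypothetical snippet: merged additively into Typhoon's base config.
systemd:
  units:
    - name: docker.service
      dropins:
        - name: 20-http-proxy.conf
          contents: |
            [Service]
            Environment="HTTP_PROXY=http://proxy.example.com:3128"
```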
I've made changes to the `terraform-provider-ct` provider plugin in v0.2.1 to enable this feature.

Caveats:

- Snippets are additive: customization is limited to what the config merge (`append`) will allow. You cannot remove bits of the base config added by Typhoon. This is part of the reason Typhoon focuses on minimalism.

Components
Problems
`terraform-provider-ct` is not currently an official Terraform provider. This actually presents substantial migration difficulties. Admins who already manage a bunch of clusters and then swap the provider binary on their system will notice diffs to those existing clusters. Applying those diffs can destroy clusters, as Terraform believes its task is to replace every node. I'm still investigating this.