Improve Network Configuration #2469
Comments
Thanks for the issue @cprivitere ! This is an area of Bottlerocket we're looking to continuously improve on, and these types of suggestions are really helpful. A few quick thoughts as I read over these:
This "match" use case is compelling; I'm wondering how we determine which NIC becomes primary if we're bringing up anything matching
Currently the "config generation" is a separate step from "bring up the NIC and wait for a lease", the latter of which is taken care of by the network manager (currently
We've got an open issue for that! #2293
I like this idea, especially for quick development and a sane default. It does, however, raise the same question around "primary interface". 🤔 |
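To make the "primary" question concrete, here is a minimal Python sketch of name matching with a shell-style pattern. It is not netdog's behavior: `match_interfaces` and `pick_primary` are hypothetical helpers, and the "first name in sorted order wins" rule is only an assumption used to show that some deterministic tie-break has to be chosen.

```python
from fnmatch import fnmatchcase


def match_interfaces(pattern, available):
    """Return the interface names matching a shell-style pattern, sorted."""
    return sorted(name for name in available if fnmatchcase(name, pattern))


def pick_primary(matched):
    """Hypothetical tie-break: the first name in sorted order is primary."""
    if not matched:
        raise LookupError("no interface matched; cannot choose a primary")
    return matched[0]


# Example with made-up interface names.
nics = ["lo", "ens3f1", "ens3f0", "enp5s0f0np0"]
matched = match_interfaces("ens*", nics)
primary = pick_primary(matched)
```

With these inputs, `matched` holds both `ens3f0` and `ens3f1`, and the tie-break picks `ens3f0`. Plain string sorting puts `ens10` before `ens2`, so a real rule would probably need something sturdier, such as link state or an explicit setting.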
As I wrote in #2293, we are running into the same issue (EKS-A Bare Metal + no one knows the interface name ahead of time). Apart from this, one cannot do anything to fix the problem itself even with access to the console. Please bear with me in case this is a stupid question, as I do not have much experience with Bottlerocket yet, but is there a way to let the user fix this problem with console access? |
@Cajga Thanks for the extra info. I totally understand the pain. As I mentioned above, the "interface config generation" is separate from bringing up the actual interfaces. It brings up the philosophical question "What is the wrong interface name?". Perhaps the user specifies an interface that should be in the box, but the hardware malfunctions and doesn't come up properly. Another thing we have to consider when thinking about a solution is All that being said, we do want to make this more straightforward and easier to debug. I do like the idea of using MAC address or some type of name match to make configuration easier. A thought: if we used MAC address we could ensure a NIC exists (regardless of the name) with that MAC and if not, fail the config generation which would print a message to the console for easier debugging. 🤔 To answer your last question - Bottlerocket doesn't include a shell so there really isn't a way to drop into some sort of recovery shell. |
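The MAC address idea above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not how netdog generates config: `normalize_mac`, `read_system_macs`, and `resolve_by_mac` are hypothetical names. The only system detail relied on is that Linux exposes each interface's hardware address at `/sys/class/net/<name>/address`.

```python
from pathlib import Path


def normalize_mac(mac):
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def read_system_macs(sysfs_root="/sys/class/net"):
    """Map interface name -> MAC address by reading Linux sysfs."""
    macs = {}
    for entry in Path(sysfs_root).iterdir():
        address_file = entry / "address"
        if address_file.is_file():
            macs[entry.name] = normalize_mac(address_file.read_text())
    return macs


def resolve_by_mac(wanted_mac, macs):
    """Return the interface name that owns wanted_mac, or fail loudly."""
    wanted = normalize_mac(wanted_mac)
    matches = sorted(name for name, mac in macs.items() if mac == wanted)
    if not matches:
        # This message is what would be surfaced on the console.
        raise LookupError(
            f"no NIC with MAC address {wanted} found; "
            f"known interfaces: {sorted(macs)}"
        )
    # Several interfaces can share a MAC (for example a bond and its
    # members); this sketch simply returns the first name in sorted order.
    return matches[0]


# Example with an in-memory table instead of the real sysfs tree.
example_macs = {"enp5s0f0np0": "aa:bb:cc:dd:ee:01", "lo": "00:00:00:00:00:00"}
name = resolve_by_mac("AA-BB-CC-DD-EE-01", example_macs)
```

On a real host the table would come from `read_system_macs()`. The configured MAC is checked before any interface is brought up, so a missing card fails config generation with a message listing the interfaces that were found.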
So I would challenge the following presumptions:
|
The netplan solution (https://netplan.io/design) seems to be to leave it to the user. The network devices are subgrouped into "primary" is not a meaningful distinction since kernel networking doesn't share that concept. If the significance of a "primary" interface is that bootup will wait for this device to be configured, then perhaps that behavior should be the parameter:
One of netplan's features is that it can |
As you say, this is being challenged pretty actively by @zmrow at the moment. It's clear that a larger change is needed to keep up with the rate of feature requests for network configuration. However, netdog + wicked is what's in place at the moment, and for a lot of reasons (support, maintenance) I really want a consistent approach across all platforms. That means some sort of helper to handle network configuration chores on platforms where it isn't a preoccupying concern (vmware, aws), which is the role that
The admin container is only accessible if the network interface needed to access the container registry is configured, so there's an unfortunate bootstrapping problem. This is a longstanding pain point (e.g. #385) where the pragmatic approach ("add a shell") conflicts squarely with the larger architectural requirement ("no shell"). I've mulled over the idea of some sort of "safe mode" partition with a shell that could be rebooted into if the first boot fails, and would be inaccessible thereafter. That wouldn't be a silver bullet but at least the journal would be available, assuming setup for local storage worked. |
Primary is currently used in two ways:
These are both higher level concerns leaking down into the exposed settings. Hostname in particular gets this treatment because the "correct" hostname is needed for EC2 nodes to authenticate to EKS. @zmrow it'd be good to take a hard look at whether a "primary" field is actually needed in the environment where the "match" style interface would be used. If hostname + DNS servers are coming in via provisioning then it might be possible to just not have a primary.
Adding I think that would end up in the same place from a functionality perspective, if I understand the goal correctly: supporting |
This is fair. At network config generation time, the API server isn't up yet and we haven't read user data so we don't know what has come in via provisioning. We'll mull this over a bit more... 🤔
I recently found that crate! It's another option we can consider supporting for sure. |
Just for reference - here's the issue where we are evaluating |
I think I am running into similar problems as the OP. In our case we are migrating from AL2 and using In the case of the other interfaces minimally it seems to require manually adding IPs via the admin container. Updating the wicked config to add eth1, etc and restarting the daemons doesn't seem to help so I assume it's something deeper and we'll need to build an AMI and |
@joewilliams You mentioned the official Bottlerocket AMI so I'm assuming you're using an AWS variant. This issue isn't really applicable to AWS variants since the majority of network config is handled via CNI plugins, not via |
@zmrow Thanks! Right, the CNI network stuff is working great. We also have host-based networking config in addition to what the CNI is doing. |
@zmrow do you know if there are any specific actions that can be tracked with this issue? Or are there other "work in progress" items being tracked elsewhere? Just wondering how to move this issue forward. |
We are using the Bottlerocket ECS variant, and it fails to detect other ENIs that have been attached. Is there any way to solve this? |
What I'd like:
We're supporting the use of Bottlerocket on bare metal as part of EKS Anywhere running on Equinix Metal.
One of the most limiting and difficult to configure aspects is the NIC configuration. This is due to bare metal servers (even those of a particular plan type like m3.large.x86 or c3.medium.x86) not always having identical configurations. Sometimes network cards aren't the same, sometimes they don't all get plugged into the same slot. The current method of specifying the exact name of the networking card makes supporting this sort of environment near impossible.
Ideas we think would be good:
enp5s0f0np0, just specify ens*
So in this scenario we would have a net.toml that looks like this (primary=auto just a made up idea to express the desired behavior):