Metal: support network bonding #2369
Comments
Thanks for the issue @Cajga ! This is a great idea and something we'd like to support in the future. Could you provide some additional detail about your setup? Any configuration files (netplan or otherwise), etc. you're willing to share would be great! |
@zmrow We have an Equinix Metal EKS-Anywhere on Baremetal project which is tracking related concerns: equinix-labs/terraform-equinix-metal-eks-anywhere#30 Any insights into how to configure the networking would be helpful (an example cloud-config is provided in the issue comments). |
@zmrow, This is the documentation about it for RHEL9. The linked ticket from @displague contains links for examples for netplan as well. |
Thanks for the additional details @displague and @Cajga ! Currently bonding is not supported by Bottlerocket, but we're going to begin work on it. There are lots of options for bonding in the kernel documentation. @Cajga supplied the subset they are using. @displague I looked through the issue you mentioned and saw that you're using a different set of options, as well as using the bond in a VLAN. Is that a separate requirement? If so, I'd like to open a separate issue for that support. What I'd like to do is narrow in on a reasonable set of options as a first pass. How does the below list look? Once I dig into these options and understand their related functionality as well as what's supported in
|
It would be great to get network bonding support in Bottlerocket! I think this would be a great topic to bring to one of the community meetings to spread awareness and see if anyone else has interest. The next community meeting is scheduled for September 7: https://www.meetup.com/bottlerocket-community/events/288085016/ If that doesn't work, these should be taking place every two weeks going forward. Details will be posted under the Meetup group for Bottlerocket: https://www.meetup.com/bottlerocket-community Looking forward to seeing this work! |
Yes, @zmrow. I think it makes sense to track VLAN support separately. I'd be careful in implementing support for bonding without considering the VLAN support if only because the two will be frequently used together in mature network configurations. Here's the link to how we would intend to use VLANs in Bottlerocket for EKS-A: https://github.com/equinix-labs/terraform-metal-hybrid-gateway/blob/main/modules/backend/cloud-config.cfg#L31 https://wiki.archlinux.org/title/VLAN could be helpful to compare notes against (vs Netplan and VMWare configuration docs, amongst others) Bonding does have lots of options 🙂: https://www.kernel.org/doc/html/v5.8/networking/bonding.html |
I noticed that https://github.com/equinix-labs/terraform-metal-hybrid-gateway/blob/main/modules/backend/cloud-config.cfg calls out mode 4 (802.3ad) but @Cajga provided options for mode 1 (active-backup). I'm hoping to confirm if you plan to use both modes or one over the other. We likely will support more than just the minimum but I'm hoping to get a good idea of the most important configuration to target first. |
I have a working configuration and the code is soon to be in code review for this work. The config has ended up looking something like this:

# Mii Monitoring
[bond0]
kind = "bond"
mode = "active-backup"
interfaces = ["eno1" , "eno2"]
dhcp4 = true
[bond0.monitoring]
miimon-frequency = 100
miimon-updelay = 200
miimon-downdelay = 200
# Or for ARP Monitoring
[bond1]
kind = "bond"
mode = "active-backup"
interfaces = ["eno3" , "eno4"]
dhcp4 = true
[bond1.monitoring]
arpmon-interval = 2
arpmon-validate = "all"
arpmon-targets = ["192.168.1.1", "10.0.0.2"]

For example, bond0 would look something like this on the actual box:

bash-5.1# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
3: eno2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc fq_codel master bond0 state UP group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff permaddr 52:54:00:12:34:57
4: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:12:34:56 brd ff:ff:ff:ff:ff:ff
    inet 10.0.2.15/24 brd 10.0.2.255 scope global bond0
       valid_lft forever preferred_lft forever
    inet6 fec0::5054:ff:fe12:3456/64 scope site dynamic mngtmpaddr
       valid_lft 86089sec preferred_lft 14089sec
    inet6 fe80::5054:ff:fe12:3456/64 scope link
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:41:93:90:41 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever

We will also be adding in support for VLAN tagging in the PR so they should all work together:

[myvlan]
kind = "vlan"
device = "bond0"
id = 42
dhcp4 = true

I'll update the ticket with the PR when it's out. |
@yeazelm This looks awesome. We will give it a try in our env once it is ready. |
@enkelprifti98 Thanks - that's a great idea. Do you mind opening a new issue for this? Additional details on the setup and options used will be super helpful. Just FYI - we've been working on switching to networkd, which should provide a more familiar interface for configuring these types of things. |
Opened a new issue here. |
What I'd like:
We would like to use Bottlerocket for EKS-A Bare Metal. In our DC, we use network bonding to avoid a single point of failure (SPOF) in the network stack (rack switches that have a single power cable, network cables that can come loose when touched, etc.).
In our case, we use mode 1 (active/backup) with miimon but of course bonding is a broad topic and different configurations can be valid for different use cases.
Please consider that bonding is widely used for hardware in DCs for good reason, and Linux has excellent support for it.
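As an illustration of the active/backup setup described above, the following Python sketch reads the state the Linux bonding driver exposes under `/sys/class/net/<bond>/bonding/` (the `mode`, `slaves`, `active_slave`, and `miimon` attributes). The helper names `read_bond_state` and `has_redundancy` are hypothetical, and the redundancy check is only one simple interpretation of "no SPOF": at least two member interfaces, one of which is currently active.

```python
from pathlib import Path


def read_bond_state(bond: str, sysfs_root: Path = Path("/sys/class/net")) -> dict:
    """Read basic bonding state for one bond device from sysfs (hypothetical helper)."""
    bonding_dir = sysfs_root / bond / "bonding"

    def read(attr: str) -> str:
        return (bonding_dir / attr).read_text().strip()

    return {
        # The mode attribute holds the name and number, e.g. "active-backup 1".
        "mode": read("mode").split()[0],
        # Member interfaces are space separated.
        "members": read("slaves").split(),
        # Empty when no member is currently active.
        "active": read("active_slave") or None,
        # MII link monitoring interval in milliseconds (0 means disabled).
        "miimon_ms": int(read("miimon")),
    }


def has_redundancy(state: dict) -> bool:
    """True if the bond has at least two members and one of them is active."""
    return len(state["members"]) >= 2 and state["active"] in state["members"]
```

On a host with a bond named `bond0`, calling `has_redundancy(read_bond_state("bond0"))` would report whether a backup link is present. The `sysfs_root` parameter exists only so the function can be pointed at a test directory.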
Any alternatives you've considered:
Well, without bonding, we have to accept the risk of having a SPOF in networking for each node in the cluster.
Also, without bonding (with a single network interface connected), we have to make sure that the Kubernetes nodes are not all connected to the same "network leg" (for example, half of the nodes connected to rack switches on power line A and the other half connected to rack switches on power line B).