Context
I work in the K8s platform team at Nationwide Building Society in the UK. We have been using EKS for ~4 years and currently have ~200 clusters in the estate across production and non-production. We use both self-built AL2 EKS worker nodes and the public Bottlerocket AMIs. Before every release of our internal platform we run a variety of NFR benchmarking and sanity-checking tools, including the k8s project netperf test suite: https://github.com/kubernetes/perf-tests/tree/master/network/benchmarks/netperf
Problem
Recently we found the tests suddenly reporting a significant drop in metrics for Bottlerocket. Take the results below from a single run, which are representative of everything we've seen.
Focusing on just two tests: pod-to-pod traffic on the same node and across nodes (Mbps).
| Test | v1.19.3 avg (Mbps) | v1.19.4 avg (Mbps) | Delta |
| --- | --- | --- | --- |
| 1 iperf TCP. Same VM using Pod IP | 22,455 | 14,847 | ~33% |
| 3 iperf TCP. Remote VM using Pod IP | 5,394 | 3,687 | ~31% |
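For clarity, the Delta column is the relative drop from the v1.19.3 average to the v1.19.4 average. A minimal sketch of that arithmetic, using the raw figures from the table above:

```python
# Relative throughput drop between v1.19.3 and v1.19.4, from the table above.
def drop(before: float, after: float) -> float:
    return (before - after) / before

print(f"same node:  {drop(22_455, 14_847):.1%}")   # ~33.9%
print(f"cross node: {drop(5_394, 3_687):.1%}")     # ~31.6%
```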
I've managed to narrow the problem down by version: all versions of Bottlerocket at v1.19.4 and above exhibit the issue, while all versions at v1.19.3 and below do not.
I've done ~20 test runs on the later Bottlerocket versions, and we also have hundreds of test reports stored from various Bottlerocket versions over the past ~2 years, which are completely consistent with no meaningful deviation up until v1.19.4. We also have our self-built AL2 nodes as a control, and they have not seen a drop.
Our testing is always done on a cluster of 5 ON_DEMAND m5a.xlarge nodes.
Checking
Looking into the changes, unless something is hidden or an accidental bug was introduced at some layer of the stack, the only likely offender I can see is this: All kernels: remove network scheduling "Class Based Queuing" and "Differentiated Services Marker," formerly loadable kmods - #3865
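One way to confirm whether that change is even in play on an affected node would be to check whether the CBQ / DSMARK schedulers can still be loaded at all. A rough sketch of that check is below; the sch_cbq / sch_dsmark module names and running modprobe from the admin container are my assumptions, not something taken from the changelog:

```python
# Rough availability check for the schedulers removed in #3865, intended to be
# run from a shell with access to the node's kernel modules (e.g. the
# Bottlerocket admin container). Module names are assumptions.
import subprocess

for module in ("sch_cbq", "sch_dsmark"):
    result = subprocess.run(
        ["modprobe", "--dry-run", module],
        capture_output=True,
        text=True,
    )
    state = "loadable" if result.returncode == 0 else "not available"
    print(f"{module}: {state} {result.stderr.strip()}".rstrip())
```

On v1.19.3 and below I would expect both to be loadable; on v1.19.4 and above both should be gone, which would at least confirm the suspect change is present on the affected AMIs.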
Impact
I don't have any data to suggest this will have a negative impact on our workloads at the moment. We are not usually anywhere near network throughput capacity on any given node, and I have no data to suggest this slows down traffic more generally. But the results are significant, and I think it would be important to look into this for users overall.
Possibilities
1. This is a genuine max network throughput performance regression, which might have other network performance implications.
2. Some change in the Bottlerocket system means it interacts differently with the iperf3 tool, producing different results, even though performance has not actually regressed.
3. Something else.
Ask
Please can the Wizards 🧙 on the Amazon / Bottlerocket project side have a look into this? I think you will have better tools, efficiency, and domain knowledge to confirm, falsify, or explain what is going on and progress this.
Other
From what I've looked at so far, this should be replicable with any version of EKS. To replicate the higher-performance results you will need a version where the v1.19.3 and earlier AMIs are still available, though you may have more efficient ways of replicating things. I think raw iperf3 tests between pods should be sufficient to show results of the same order, or otherwise help falsify this; a sketch of what I have in mind follows below.
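For example, something along these lines, driven via kubectl, should give numbers of the same order. The networkstatic/iperf3 image, pod names and timings are illustrative assumptions on my part, not what the netperf suite itself uses:

```python
# Minimal pod-to-pod iperf3 sketch driven via kubectl. To compare same-node vs
# cross-node throughput, pin the two pods to specific nodes (e.g. with
# --overrides setting spec.nodeName) and run the test twice.
import json
import subprocess
import time

def kubectl(*args: str) -> str:
    return subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    ).stdout

# Server pod running iperf3 in listen mode.
kubectl("run", "iperf3-server", "--image=networkstatic/iperf3",
        "--restart=Never", "--", "-s")
kubectl("wait", "--for=condition=Ready", "pod/iperf3-server", "--timeout=120s")
server_ip = json.loads(
    kubectl("get", "pod", "iperf3-server", "-o", "json")
)["status"]["podIP"]

# Client pod targeting the server's pod IP for 30 seconds.
kubectl("run", "iperf3-client", "--image=networkstatic/iperf3",
        "--restart=Never", "--", "-c", server_ip, "-t", "30")
time.sleep(60)  # crude wait for the 30-second test to finish
print(kubectl("logs", "iperf3-client"))
```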
and below AMIs are available but you may also have other ways of replicating things more efficiently. I think raw iperf3 tests between pods should be sufficient to show results of the same order or otherwise help falsify.If you try to run the same netperf tooling then be aware that you will need to mess with the config to remove some incorrect arguments provided to the container which don't align with the version of the program and prevent it from running by default. https://github.com/kubernetes/perf-tests/blob/master/network/benchmarks/netperf/launch.go#L268