Added Anti Affinity when HA is configured #2893
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
|
Thanks @Pothulapati! |
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
|
@grampelberg By default, the changes in this PR add the pod anti-affinity hostname and zone constraints when in HA mode. The k8s docs recommend against adding these constraints in clusters with several hundred nodes. Shall we make these new pod anti-affinity rules optional? |
|
@ihcsim my vote is to leave them with the assumption that most clusters are under a few hundred nodes. The With the helm chart coming out, WDYT about making all these types of configuration individual settings in the chart, having |
|
@grampelberg Agreed with keeping a set of sane defaults, and let users use helm (or whatever tools) to handle more advanced use cases (like defining Following #1895 (comment), how do you feel about omitting the |
|
@Pothulapati This is working great on my GKE regional cluster! Can you update the |
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
|
@ihcsim I’d actually go the other way, omit
Obviously, this will all get better when advanced users can tweak to their heart’s content. I’m thinking in the meantime we should have the most restrictive HA possible to surface any possible issues up front and fail fast. |
|
@grampelberg that approach feels right. Making the anti-affinity requirements fixed by default is fine, but that would not allow users to have other Now, Should we make both |
👍 Agreed. @grampelberg @Pothulapati So to summarize,
If the |
|
@Pothulapati let’s not do zone, I think you’re right, that’ll be an advanced option for folks who know they want it. |
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
|
Updated the PR to have |
|
This is awesome! If the scheduler can't satisfy the anti-affinity hostname rule, |
|
@Pothulapati In anticipation of the upcoming Linkerd Helm chart work, can I trouble you to extract the common anti-affinity rules into a partial template? I tested the following on my GKE cluster, and everything works:
diff --git a/chart/templates/_affinity.yaml b/chart/templates/_affinity.yaml
new file mode 100644
index 00000000..dd11873b
--- /dev/null
+++ b/chart/templates/_affinity.yaml
@@ -0,0 +1,22 @@
+{{- define "pod-affinity" }}
+affinity:
+  podAntiAffinity:
+    preferredDuringSchedulingIgnoredDuringExecution:
+    - weight: 100
+      podAffinityTerm:
+        labelSelector:
+          matchExpressions:
+          - key: {{ .Label }}
+            operator: In
+            values:
+            - {{ .Component }}
+        topologyKey: failure-domain.beta.kubernetes.io/zone
+    requiredDuringSchedulingIgnoredDuringExecution:
+    - labelSelector:
+        matchExpressions:
+        - key: {{ .Label }}
+          operator: In
+          values:
+          - {{ .Component }}
+      topologyKey: kubernetes.io/hostname
+{{- end }}
diff --git a/chart/templates/controller.yaml b/chart/templates/controller.yaml
index 960d3c3d..980ac140 100644
--- a/chart/templates/controller.yaml
+++ b/chart/templates/controller.yaml
@@ -131,26 +131,8 @@ spec:
- name: config
configMap:
name: linkerd-config
-{{- if .HighAvailability}}
- affinity:
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - controller
- topologyKey: failure-domain.beta.kubernetes.io/zone
- requiredDuringSchedulingIgnoredDuringExecution:
- - labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - controller
- topologyKey: kubernetes.io/hostname
-{{ end }}
+ {{- if .HighAvailability }}
+ {{- $local := dict "Label" .ControllerComponentLabel "Component" "controller" }}
+ {{- include "pod-affinity" $local | nindent 6 }}
+ {{- end }}
{{end -}}
diff --git a/chart/templates/identity.yaml b/chart/templates/identity.yaml
index dbf274a2..1c035f1c 100644
--- a/chart/templates/identity.yaml
+++ b/chart/templates/identity.yaml
@@ -102,27 +102,9 @@ spec:
- name: identity-issuer
secret:
secretName: linkerd-identity-issuer
-{{- if .HighAvailability}}
- affinity:
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - identity
- topologyKey: failure-domain.beta.kubernetes.io/zone
- requiredDuringSchedulingIgnoredDuringExecution:
- - labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - identity
- topologyKey: kubernetes.io/hostname
-{{ end }}
+ {{- if .HighAvailability }}
+ {{- $local := dict "Label" .ControllerComponentLabel "Component" "identity" }}
+ {{- include "pod-affinity" $local | nindent 6 }}
+ {{- end }}
{{end -}}
{{end -}}
diff --git a/chart/templates/proxy_injector.yaml b/chart/templates/proxy_injector.yaml
index 465a9157..9da1481f 100644
--- a/chart/templates/proxy_injector.yaml
+++ b/chart/templates/proxy_injector.yaml
@@ -67,29 +67,10 @@ spec:
- name: tls
secret:
secretName: linkerd-proxy-injector-tls
-{{- if .HighAvailability}}
- affinity:
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - proxy-injector
- topologyKey: failure-domain.beta.kubernetes.io/zone
- requiredDuringSchedulingIgnoredDuringExecution:
- - labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - proxy-injector
- topologyKey: kubernetes.io/hostname
-{{ end }}
-
+ {{- if .HighAvailability}}
+ {{- $local := dict "Label" .ControllerComponentLabel "Component" "proxy-injector" }}
+ {{- include "pod-affinity" $local | nindent 6 }}
+ {{- end }}
---
kind: Service
apiVersion: v1
diff --git a/chart/templates/sp_validator.yaml b/chart/templates/sp_validator.yaml
index 18f2545a..dc36daa0 100644
--- a/chart/templates/sp_validator.yaml
+++ b/chart/templates/sp_validator.yaml
@@ -81,26 +81,8 @@ spec:
- name: tls
secret:
secretName: linkerd-sp-validator-tls
-{{- if .HighAvailability}}
- affinity:
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - sp-validator
- topologyKey: failure-domain.beta.kubernetes.io/zone
- requiredDuringSchedulingIgnoredDuringExecution:
- - labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - sp-validator
- topologyKey: kubernetes.io/hostname
-{{ end }}
+ {{- if .HighAvailability}}
+ {{- $local := dict "Label" .ControllerComponentLabel "Component" "sp-validator" }}
+ {{- include "pod-affinity" $local | nindent 6 }}
+ {{- end }}
{{end -}}
diff --git a/chart/templates/tap.yaml b/chart/templates/tap.yaml
index 1b231f25..ab608b6f 100644
--- a/chart/templates/tap.yaml
+++ b/chart/templates/tap.yaml
@@ -71,26 +71,8 @@ spec:
{{ end -}}
securityContext:
runAsUser: {{.ControllerUID}}
-{{- if .HighAvailability}}
- affinity:
- podAntiAffinity:
- preferredDuringSchedulingIgnoredDuringExecution:
- - weight: 100
- podAffinityTerm:
- labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - tap
- topologyKey: failure-domain.beta.kubernetes.io/zone
- requiredDuringSchedulingIgnoredDuringExecution:
- - labelSelector:
- matchExpressions:
- - key: {{.ControllerComponentLabel}}
- operator: In
- values:
- - tap
- topologyKey: kubernetes.io/hostname
-{{ end }}
+ {{- if .HighAvailability}}
+ {{- $local := dict "Label" .ControllerComponentLabel "Component" "tap" }}
+ {{- include "pod-affinity" $local | nindent 6 }}
+ {{- end }}
{{end -}}
diff --git a/cli/cmd/install.go b/cli/cmd/install.go
index 9d6adfba..c5ac4d3d 100644
--- a/cli/cmd/install.go
+++ b/cli/cmd/install.go
@@ -703,6 +703,7 @@ func (values *installValues) render(w io.Writer, configs *pb.All) error {
if values.stage == "" || values.stage == controlPlaneStage {
files = append(files, []*chartutil.BufferedFile{
{Name: "templates/_resources.yaml"},
+ {Name: "templates/_affinity.yaml"},
{Name: "templates/config.yaml"},
{Name: "templates/identity.yaml"},
{Name: "templates/controller.yaml"}, |
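For illustration, this is roughly what the "pod-affinity" partial above could render to for the controller deployment when HA is enabled. It is a sketch, not output captured from the PR: the label key linkerd.io/control-plane-component is an assumed value for .ControllerComponentLabel, and the fragment is shown at top level rather than at the 6-space indent that nindent 6 would apply inside the pod spec.

```yaml
# Hypothetical rendered output of: include "pod-affinity" (dict "Label" ... "Component" "controller")
# Assumption: .ControllerComponentLabel renders to linkerd.io/control-plane-component.
# In the real manifest this block sits under spec.template.spec of the Deployment.
affinity:
  podAntiAffinity:
    # Soft rule: the scheduler tries to spread replicas across zones,
    # but still schedules the pod if it cannot.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: linkerd.io/control-plane-component
            operator: In
            values:
            - controller
        topologyKey: failure-domain.beta.kubernetes.io/zone
    # Hard rule: two replicas of the same component may not share a node;
    # a replica that cannot satisfy this stays Pending.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: linkerd.io/control-plane-component
          operator: In
          values:
          - controller
      topologyKey: kubernetes.io/hostname
```

The same shape would apply to identity, proxy-injector, sp-validator, and tap, with only the value under values changing.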
Signed-off-by: Tarun Pothulapati <tarunpothulapati@outlook.com>
|
@ihcsim Updated PR accordingly |
The current HA documentation is outdated with respect to anti-affinity rules for critical pods as of the 2.5.0 stable release. Document the impact of linkerd/linkerd2#2893 on anti-affinity rules for control plane pods.
Related to #400
Signed-off-by: Christian Nicolai <cn@mycrobase.de>
Fixes #1895
This PR adds anti-affinity rules to the proxy-injector, sp-validator, linkerd-controller, and tap deployments.
The idea was to make the anti-affinity rules based on kubernetes.io/hostname and failure-domain.beta.kubernetes.io/zone both preferred when only the --ha flag is configured.
If --required-host-anti-affinity is also configured along with --ha, then kubernetes.io/hostname is required while failure-domain.beta.kubernetes.io/zone is still preferred.
@ihcsim @alpeb @grampelberg @olix0r