Modify all health checks to be specified via enums #2078

siggy · 2019-01-14T18:57:19Z

Modify all health checks to be specified via enums

The set of health checks to be executed was dependent on a combination
of check enums and boolean options.

This change modifies the health checks to be governed strictly by a set
of enums. This change does not add or remove any checks, but rather
moves checks into more granular categories, such that any set of checks
that are toggle-able are defined together under a single category.
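
A minimal sketch of what "governed strictly by a set of enums" could look like. The names here (`CategoryID`, the three category constants, `allCheckers`, `NewHealthChecker`, `Descriptions`) are illustrative assumptions, not the actual identifiers in `pkg/healthcheck`; only the check descriptions for ClusterRoles and ClusterRoleBindings come from the diff in this PR.

```go
package main

import "fmt"

// CategoryID is a hypothetical enum; each value names one toggle-able group of checks.
type CategoryID int

const (
	KubernetesAPIChecks CategoryID = iota
	PreInstallClusterChecks
	PreInstallSingleNamespaceChecks
)

// checker pairs a single check with the category that enables it.
type checker struct {
	category    CategoryID
	description string
	check       func() error
}

// HealthChecker holds only the checks whose categories were requested.
type HealthChecker struct {
	checkers []checker
}

// ok is a placeholder check body that always succeeds.
func ok() error { return nil }

// allCheckers is an illustrative catalogue; real checks would call the Kubernetes API.
func allCheckers() []checker {
	return []checker{
		{KubernetesAPIChecks, "can query the Kubernetes API", ok},
		{PreInstallClusterChecks, "can create ClusterRoles", ok},
		{PreInstallClusterChecks, "can create ClusterRoleBindings", ok},
		{PreInstallSingleNamespaceChecks, "can create Roles", ok},
		{PreInstallSingleNamespaceChecks, "can create RoleBindings", ok},
	}
}

// NewHealthChecker selects checks purely from the enums passed in,
// with no additional boolean options.
func NewHealthChecker(categories []CategoryID) *HealthChecker {
	enabled := make(map[CategoryID]bool, len(categories))
	for _, c := range categories {
		enabled[c] = true
	}
	hc := &HealthChecker{}
	for _, c := range allCheckers() {
		if enabled[c.category] {
			hc.checkers = append(hc.checkers, c)
		}
	}
	return hc
}

// Descriptions lists the selected checks in catalogue order.
func (hc *HealthChecker) Descriptions() []string {
	var out []string
	for _, c := range hc.checkers {
		out = append(out, c.description)
	}
	return out
}

func main() {
	hc := NewHealthChecker([]CategoryID{KubernetesAPIChecks, PreInstallClusterChecks})
	for _, d := range hc.Descriptions() {
		fmt.Println(d)
	}
}
```

Because the cluster-wide and single-namespace RBAC checks sit in separate categories in this sketch, a caller picks one or the other by passing a different enum instead of flipping a boolean.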

This is a first step in cleaning up the linkerd check code, and moving towards #1471.

Next steps:

- tightly couple category IDs to names
- tightly couple checks to their parent categories
- programmatic control over check ordering

Signed-off-by: Andrew Seigner [email protected]


The `linkerd check` command organized the various checks via loosely coupled category IDs, category names, and checkers themselves, all with ordering defined by consumers of this code. This change removes category IDs in favor of category names, groups all checkers by category, and enforces ordering at the `HealthChecker` level. Part of #1471, depends on #2078. Signed-off-by: Andrew Seigner <[email protected]>
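
The grouping described above might be sketched as follows. This is an assumption-laden illustration, not the code from #2080: `CategoryName`, `category`, `RunChecks`, `exampleChecker`, and the category and check names are all hypothetical. The point it shows is that when checkers live inside an ordered slice of categories, the `HealthChecker` itself fixes the execution order and consumers only observe results.

```go
package main

import (
	"errors"
	"fmt"
)

// CategoryName identifies a category directly by its display name (no separate ID).
type CategoryName string

// checker is a single check; its category is implied by where it is stored.
type checker struct {
	description string
	check       func() error
}

// category groups all checkers that belong together.
type category struct {
	name     CategoryName
	checkers []checker
}

// HealthChecker owns the ordering: slice order is execution order.
type HealthChecker struct {
	categories []category
}

// RunChecks runs every checker, category by category, reporting each result
// to observe. It returns false if any check failed.
func (hc *HealthChecker) RunChecks(observe func(CategoryName, string, error)) bool {
	success := true
	for _, cat := range hc.categories {
		for _, c := range cat.checkers {
			err := c.check()
			observe(cat.name, c.description, err)
			if err != nil {
				success = false
			}
		}
	}
	return success
}

// exampleChecker builds a small hypothetical checker with one failing check.
func exampleChecker() *HealthChecker {
	pass := func() error { return nil }
	fail := func() error { return errors.New("namespace not found") }
	return &HealthChecker{categories: []category{
		{name: "kubernetes-api", checkers: []checker{
			{"can initialize the client", pass},
			{"can query the Kubernetes API", pass},
		}},
		{name: "linkerd-existence", checkers: []checker{
			{"control plane namespace exists", fail},
		}},
	}}
}

func main() {
	ok := exampleChecker().RunChecks(func(name CategoryName, desc string, err error) {
		status := "ok"
		if err != nil {
			status = "FAIL: " + err.Error()
		}
		fmt.Printf("%s: %s [%s]\n", name, desc, status)
	})
	fmt.Println("all checks passed:", ok)
}
```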

klingerf

⭐️ This is great! Much easier to reason about now that those boolean variables are gone.

klingerf · 2019-01-15T01:00:40Z

pkg/healthcheck/healthcheck.go

+		},
+	})
+
+	// TODO: refactor with LinkerdPreInstallSingleNamespaceChecks
 	roleType := "ClusterRole"
 	roleBindingType := "ClusterRoleBinding"


Now that you've split the RBAC checks into multiple separate methods, I think it's clearer to hardcode everything, rather than worrying about code reuse. I'm inclined to just remove these local vars. Something like:

diff --git a/pkg/healthcheck/healthcheck.go b/pkg/healthcheck/healthcheck.go
index 24b9722e..31c99ce2 100644
--- a/pkg/healthcheck/healthcheck.go
+++ b/pkg/healthcheck/healthcheck.go
@@ -316,23 +316,19 @@ func (hc *HealthChecker) addLinkerdPreInstallClusterChecks() {
 		},
 	})
 
-	// TODO: refactor with LinkerdPreInstallSingleNamespaceChecks
-	roleType := "ClusterRole"
-	roleBindingType := "ClusterRoleBinding"
-
 	hc.checkers = append(hc.checkers, &checker{
 		category:    LinkerdPreInstallClusterCategory,
-		description: fmt.Sprintf("can create %ss", roleType),
+		description: "can create ClusterRoles",
 		check: func() error {
-			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", roleType)
+			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", "ClusterRole")
 		},
 	})
 
 	hc.checkers = append(hc.checkers, &checker{
 		category:    LinkerdPreInstallClusterCategory,
-		description: fmt.Sprintf("can create %ss", roleBindingType),
+		description: "can create ClusterRoleBindings",
 		check: func() error {
-			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", roleBindingType)
+			return hc.checkCanCreate("", "rbac.authorization.k8s.io", "v1beta1", "ClusterRoleBinding")
 		},
 	})

Same goes for the checks in the addLinkerdPreInstallSingleNamespaceChecks func.

heh, i did exactly that in the next PR: https://github.com/linkerd/linkerd2/pull/2080/files#diff-d4056ff163bcf2aeacefb2a34164563cR270

Awesome, yep, carry on!


In 2.13, the default inbound and outbound HTTP request queue capacity decreased from 10,000 requests to 100 requests (in PR #2078). This change results in proxies shedding load much more aggressively while under high load to a single destination service, resulting in increased error rates in comparison to 2.12 (see #11055 for details).

This commit changes the default HTTP request queue capacities for the inbound and outbound proxies back to 10,000 requests, the way they were in 2.12 and earlier. In manual load testing I've verified that increasing the queue capacity results in a substantial decrease in 503 Service Unavailable errors emitted by the proxy: with a queue capacity of 100 requests, the load test described [here] observed a failure rate of 51.51% of requests, while with a queue capacity of 10,000 requests, the same load test observes no failures.

Note that I did not modify the TCP connection queue capacities, or the control plane request queue capacity. These were previously configured by the same variable before #2078, but were split out into separate vars in that change. I don't think the queue capacity limits for TCP connection establishment or for control plane requests are currently resulting in instability the way the decreased request queue capacity is, so I decided to make a more focused change to just the HTTP request queues for the proxies.

[here]: #11055 (comment)

---

* Increase HTTP request queue capacity (linkerd/linkerd2-proxy#2449)

Signed-off-by: Eliza Weisman <[email protected]>
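
To illustrate why a smaller queue capacity sheds more load, here is a toy sketch of a bounded request queue that rejects work when full. It is not the proxy's implementation (the proxy is not written in Go, and every name here is hypothetical), and it does not attempt to reproduce the failure rates quoted above; it only shows the general mechanism under the assumption that nothing drains the queue during a burst.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrQueueFull signals that a request was shed rather than queued.
var ErrQueueFull = errors.New("queue full: request shed")

// requestQueue is a hypothetical bounded queue backed by a buffered channel.
type requestQueue struct {
	slots chan int
}

func newRequestQueue(capacity int) *requestQueue {
	return &requestQueue{slots: make(chan int, capacity)}
}

// tryEnqueue admits the request if there is room and sheds it otherwise,
// without blocking the caller.
func (q *requestQueue) tryEnqueue(id int) error {
	select {
	case q.slots <- id:
		return nil
	default:
		return ErrQueueFull
	}
}

// shedCount sends a burst of requests while nothing drains the queue
// and returns how many were rejected.
func shedCount(capacity, burst int) int {
	q := newRequestQueue(capacity)
	shed := 0
	for i := 0; i < burst; i++ {
		if q.tryEnqueue(i) != nil {
			shed++
		}
	}
	return shed
}

func main() {
	fmt.Println(shedCount(100, 1000))   // prints 900
	fmt.Println(shedCount(10000, 1000)) // prints 0
}
```

In this simplified model, a burst of 1,000 requests overflows a 100-slot queue but fits comfortably in a 10,000-slot one.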


siggy added 2 commits January 14, 2019 10:51

fix integration tests

f319bc4

Signed-off-by: Andrew Seigner <[email protected]>

siggy self-assigned this Jan 14, 2019

siggy requested a review from klingerf January 14, 2019 18:57

siggy added the area/cli label Jan 14, 2019

siggy mentioned this pull request Jan 14, 2019

Group checkers by category #2080

Closed

klingerf approved these changes Jan 15, 2019


siggy merged commit 0437341 into master Jan 15, 2019

siggy mentioned this pull request Jan 15, 2019

Group checkers by category #2083

Merged

siggy deleted the siggy/check-enums branch January 15, 2019 01:26

hawkw mentioned this pull request Aug 3, 2023

proxy: v2.207.0 #11198

Merged
