NE-408: Allow configuring ELB connection idle timeout #451
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Miciah The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
04c7846 to ff7701e
ff7701e to 5c5286b
|
@Miciah: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
@Miciah: The following test failed, say
|
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle stale |
|
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting If this issue is safe to close now please do so with /lifecycle rotten |
|
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /close |
|
@openshift-bot: Closed this PR. |
|
/reopen |
|
@Miciah: Reopened this PR. |
5c5286b to d024778
|
In the last e2e-aws-operator run, the new That test has been failing on a few PRs over the past 26 hours. /retest |
|
In the last e2e-aws-operator run, I could add some retry logic around the /retest |
d024778 to 70f8c6b
|
Latest push adds a unit test that I had forgotten to commit. |
Rework the GCP load-balancer provider parameters defaulting logic in preparation for an upcoming commit. Besides simplifying the logic, this commit also changes the defaulting logic to ignore unknown provider parameters and to ignore provider parameters for platforms other than the actual platform. These changes should avoid surprises when more provider parameters are added to the API later on as well as prevent weird behavior when the user sets GCP provider parameters on non-GCP clusters.
* pkg/operator/controller/ingress/controller.go (setDefaultPublishingStrategy): Rework defaulting logic for GCP load-balancer provider parameters.
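The defaulting rule described in this commit message (ignore unknown provider parameters, and ignore parameters for platforms other than the actual platform) might look roughly like the sketch below. The type `providerParams`, the function `effectiveGCPClientAccess`, and the platform strings are hypothetical simplifications for illustration, not the operator's actual API.

```go
package main

import "fmt"

// providerParams is a hypothetical, simplified stand-in for the
// load-balancer provider parameters on an ingresscontroller.
type providerParams struct {
	Type            string // e.g. "GCP" or "AWS"
	GCPClientAccess string // only meaningful when Type is "GCP"
}

// effectiveGCPClientAccess returns the GCP client-access setting that should
// take effect and whether any setting applies at all. Parameters are ignored
// when they are absent, of an unknown type, or meant for a platform other
// than the cluster's actual platform.
func effectiveGCPClientAccess(platform string, params *providerParams) (string, bool) {
	if params == nil || platform != "GCP" || params.Type != "GCP" {
		return "", false
	}
	if params.GCPClientAccess == "" {
		return "", false
	}
	return params.GCPClientAccess, true
}

func main() {
	gcp := &providerParams{Type: "GCP", GCPClientAccess: "Local"}
	fmt.Println(effectiveGCPClientAccess("GCP", gcp)) // parameters apply
	fmt.Println(effectiveGCPClientAccess("AWS", gcp)) // wrong platform: ignored
}
```

Returning a separate boolean keeps "nothing applies" distinct from an empty setting, which is one way to avoid surprises when more parameter types are added later.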
90d4ea4 to 1a310ef
|
Rebased for #735. |
|
/assign frobware |
LGTM, just some places in the e2e test that really should call t.Fatal() and not t.Error(). Plus a question regarding commit - does this belong to this PR; the commit message mentions it is for an upcoming change.
test/e2e/operator_test.go
            return true, nil
        }); err != nil {
            t.Errorf("failed to observe expected condition: %v", err)
Should this be Fatal()? Can we carry on without lookup succeeding?
I suppose it wouldn't hurt to make this Fatal.
test/e2e/operator_test.go
            return false, nil
        }); err != nil {
            t.Errorf("failed to observe expected condition: %v", err)
This should be fatal too.
Isn't it potentially useful to know whether a large value causes the expected behavior when diagnosing why a low value does not?
test/e2e/operator_test.go
            return true, nil
        }); err != nil {
            t.Errorf("failed to observe expected condition: %v", err)
Should be fatal.
test/e2e/operator_test.go
            return true, nil
        }); err != nil {
            t.Errorf("failed to observe expected condition: %v", err)
Should be fatal. Although it's at the end of the test we may as well be consistent.
You're suggesting making every Error into a Fatal. What's an example of where you would advise using Error?
We should use Fatal if any subsequent testing can only produce garbage. However, I think in some of these instances where I used Error in this test, there could be value in continuing the test. For example, even if setting a 3-second timeout didn't behave as expected, we can still try a 120-second timeout to see whether it behaves as expected, and the result could be helpful in diagnosing why the 3-second timeout didn't behave as expected.
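As a rough illustration of the argument for continuing after a failure, the sketch below checks every timeout value and reports each failure through `Errorf` instead of stopping at the first one. The `reporter` interface, the `recorder` type, `checkTimeouts`, and `errNotClosed` are hypothetical names introduced here; in a real test, `*testing.T` would satisfy `reporter` directly.

```go
package main

import (
	"errors"
	"fmt"
)

// reporter is the subset of *testing.T that this sketch needs.
type reporter interface {
	Errorf(format string, args ...interface{})
}

// recorder is a hypothetical stand-in for *testing.T that collects failures.
type recorder struct{ failures []string }

func (r *recorder) Errorf(format string, args ...interface{}) {
	r.failures = append(r.failures, fmt.Sprintf(format, args...))
}

// errNotClosed is a made-up error used only for this demonstration.
var errNotClosed = errors.New("connection was not closed in time")

// checkTimeouts observes every timeout value and reports each failure with
// Errorf, so a failure for one value does not hide the result for the next.
func checkTimeouts(r reporter, timeoutSeconds []int, observe func(seconds int) error) {
	for _, s := range timeoutSeconds {
		if err := observe(s); err != nil {
			r.Errorf("timeout %ds: failed to observe expected condition: %v", s, err)
		}
	}
}

func main() {
	// Pretend that only the 3-second timeout misbehaves.
	observe := func(seconds int) error {
		if seconds == 3 {
			return errNotClosed
		}
		return nil
	}
	r := &recorder{}
	checkTimeouts(r, []int{3, 120}, observe)
	fmt.Printf("%d failure(s): %v\n", len(r.failures), r.failures)
}
```

With `Fatalf` in place of `Errorf`, the 120-second check would never run after the 3-second check failed, which is the diagnostic information at stake in this thread.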
> You're suggesting making every Error into a Fatal. What's an example of where you would advise using Error?
Table-driven unit tests.
> We should use Fatal if any subsequent testing can only produce garbage
Isn't that the case for all these e2e tests? We try to set up the cluster/objects/resources/state in a very particular way, and if that doesn't happen, is it worth generating cascading failure messages by letting the test continue? If you were to debug a failing test, you're likely to start with the first error message. Would further error messages help?
    if err := wait.PollImmediate(1*time.Second, 5*time.Minute, func() (bool, error) {
        _, err := net.LookupIP(route.Spec.Host)
        if err != nil {
            t.Log(err)
This can be really chatty, particularly at 1s interval.
It's test output, it should be under 300 lines, and it's hidden unless a test fails. Is that too chatty? I could add some logic to suppress the log message if it's identical to the previous one.
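One possible shape for the suppression logic mentioned here is sketched below: a small wrapper that forwards a message only when it differs from the previous one. The `dedupLogger` type is a hypothetical helper, not code from this PR; in the e2e test its `logf` field could wrap `t.Log`.

```go
package main

import "fmt"

// dedupLogger forwards a message to logf only when it differs from the
// previous message, so a polling loop that keeps hitting the same error
// logs it once per run of identical messages.
type dedupLogger struct {
	logf func(msg string) // in a test, this could wrap t.Log
	last string
	seen bool
}

// Log emits msg unless it is identical to the message logged just before.
func (d *dedupLogger) Log(msg string) {
	if d.seen && msg == d.last {
		return
	}
	d.last, d.seen = msg, true
	d.logf(msg)
}

func main() {
	l := &dedupLogger{logf: func(msg string) { fmt.Println(msg) }}
	for _, msg := range []string{"no such host", "no such host", "i/o timeout", "no such host"} {
		l.Log(msg)
	}
}
```

Only consecutive duplicates are dropped, so a message that reappears after a different one is logged again, which keeps the order of state changes visible.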
Maybe just bump the interval to 3s?
On my sample size of 1 run, the lookup resolved in ~60s.
Perhaps I'm just a little wary if this becomes a parallel test candidate [1]. One downside of running some or a lot of the tests in parallel is the interleaved test output.
Not a blocker for me on the PR though. Was just an observation.
[1] PR #756.
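For a sense of scale, the back-of-the-envelope arithmetic behind the interval discussion can be written out as below. `approxAttempts` is a hypothetical helper that only estimates the number of poll iterations (and hence log lines) as elapsed time divided by interval; it does not model the exact semantics of `wait.PollImmediate`.

```go
package main

import (
	"fmt"
	"time"
)

// approxAttempts gives a rough estimate of how many times a poll loop runs
// when it checks once per interval for the given elapsed time.
func approxAttempts(interval, elapsed time.Duration) int {
	if interval <= 0 {
		return 0
	}
	return int(elapsed / interval)
}

func main() {
	// Worst case: the lookup never succeeds within the 5-minute timeout.
	fmt.Println(approxAttempts(1*time.Second, 5*time.Minute))
	fmt.Println(approxAttempts(3*time.Second, 5*time.Minute))
	// Observed case from the thread: resolution after roughly 60 seconds.
	fmt.Println(approxAttempts(1*time.Second, 60*time.Second))
	fmt.Println(approxAttempts(3*time.Second, 60*time.Second))
}
```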
    @@ -412,19 +412,63 @@ func setDefaultPublishingStrategy(ic *operatorv1.IngressController, infraConfig
            changed = true
        }
Does commit "rework GCP logic" belong to this PR? Can it go in the upcoming commit?
1a310ef to fc7c323
|
I've changed |
|
/lgtm |
|
/retest |
fc7c323 to eee2928
* pkg/operator/controller/ingress/controller.go (setDefaultPublishingStrategy): Handle changes to the connection idle timeout for an AWS ELB.
* pkg/operator/controller/ingress/controller_test.go (TestSetDefaultPublishingStrategyHandlesUpdates): Add test cases for changing the ELB connection idle timeout.
* pkg/operator/controller/ingress/load_balancer_service.go (awsELBConnectionIdleTimeoutAnnotation): New constant.
(managedLoadBalancerServiceAnnotations): Add awsELBConnectionIdleTimeoutAnnotation.
(desiredLoadBalancerService): Set the connection idle timeout annotation if the ingresscontroller specifies a non-nil connectionIdleTimeout value.
* pkg/operator/controller/ingress/load_balancer_service_test.go (TestDesiredLoadBalancerServiceAWSIdleTimeout): New test.
(TestLoadBalancerServiceChanged): Add a test case for the connection-idle-timeout annotation.
* test/e2e/operator_test.go (TestAWSELBConnectionIdleTimeout): New test.
* test/e2e/util.go (buildSlowHTTPDPod): New helper for TestAWSELBConnectionIdleTimeout.
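The behavior described for desiredLoadBalancerService (set the annotation only when a timeout is specified) could be sketched as follows. The helper `idleTimeoutAnnotations` and `durationPtr` are hypothetical, and the value given to the constant is an assumption: it is the annotation that the Kubernetes AWS cloud provider documents for the Classic ELB idle timeout, expressed in seconds, not a value copied from this PR.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// Assumed value: the annotation read by the Kubernetes AWS cloud provider
// to configure a Classic ELB's connection idle timeout, in seconds.
const awsELBConnectionIdleTimeoutAnnotation = "service.beta.kubernetes.io/aws-load-balancer-connection-idle-timeout"

// durationPtr is a small convenience helper for building optional values.
func durationPtr(d time.Duration) *time.Duration { return &d }

// idleTimeoutAnnotations is a hypothetical helper: it returns the annotation
// to set on the load-balancer service, or an empty map when the timeout is
// nil so that the cloud provider's default stays in effect.
func idleTimeoutAnnotations(timeout *time.Duration) map[string]string {
	annotations := map[string]string{}
	if timeout != nil {
		// Round to whole seconds because the annotation takes an integer.
		seconds := int64(timeout.Round(time.Second) / time.Second)
		annotations[awsELBConnectionIdleTimeoutAnnotation] = strconv.FormatInt(seconds, 10)
	}
	return annotations
}

func main() {
	fmt.Println(idleTimeoutAnnotations(nil))
	fmt.Println(idleTimeoutAnnotations(durationPtr(3 * time.Second)))
}
```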
eee2928 to a635566
|
/retest |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: frobware, Miciah The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
|
@Miciah: all tests passed! |
Bump openshift/api for ELB connection idle timeout
Bump to github.com/openshift/api@b25f69a603a76ccc809f986c9f5811f0825febbb to get the AWS ELB connection idle timeout API.
* go.mod: Update.
* go.sum:
* manifests/00-custom-resource-definition.yaml:
* pkg/manifests/bindata.go:
* vendor/github.com/openshift/api/*:
* vendor/modules.txt: Regenerate.

test/e2e: import API errors package as apierrors
Import the apimachinery errors package as "apierrors".
* test/e2e/operator_test.go: Import the "k8s.io/apimachinery/pkg/api/errors" package as "apierrors" so as not to conflict with the standard "errors" package.

desiredLoadBalancerService: Simplify with "lb" var
* pkg/operator/controller/ingress/load_balancer_service.go (desiredLoadBalancerService): Introduce an "lb" variable to shorten some long lines.

desiredLoadBalancerService: Check for nil LB status
* pkg/operator/controller/ingress/load_balancer_service.go (desiredLoadBalancerService): Add a nil check just in case status.endpointPublishingStrategy.LoadBalancer is nil somehow.

setDefaultPublishingStrategy: Rework GCP logic
Rework the GCP load-balancer provider parameters defaulting logic in preparation for the next change. Besides simplifying the logic, this change also changes the defaulting logic to ignore unknown provider parameters and to ignore provider parameters for platforms other than the actual platform. These changes should avoid surprises when more provider parameters are added to the API later on as well as prevent weird behavior when the user sets GCP provider parameters on non-GCP clusters.
* pkg/operator/controller/ingress/controller.go (setDefaultPublishingStrategy): Rework defaulting logic for GCP load-balancer provider parameters.

Allow configuring ELB connection idle timeout
* pkg/operator/controller/ingress/controller.go (setDefaultPublishingStrategy): Handle changes to the connection idle timeout for an AWS ELB.
* pkg/operator/controller/ingress/controller_test.go (TestSetDefaultPublishingStrategyHandlesUpdates): Add test cases for changing the ELB connection idle timeout.
* pkg/operator/controller/ingress/load_balancer_service.go (awsELBConnectionIdleTimeoutAnnotation): New constant.
(managedLoadBalancerServiceAnnotations): Add awsELBConnectionIdleTimeoutAnnotation.
(desiredLoadBalancerService): Set the connection idle timeout annotation if the ingresscontroller specifies a non-nil connectionIdleTimeout value.
* pkg/operator/controller/ingress/load_balancer_service_test.go (TestDesiredLoadBalancerServiceAWSIdleTimeout): New test.
* test/e2e/operator_test.go (TestAWSELBConnectionIdleTimeout): New test.
* test/e2e/util.go (buildSlowHTTPDPod): New helper for TestAWSELBConnectionIdleTimeout.

Related to openshift/enhancements#461.