@ironcladlou
Contributor

The router metrics tests assume they're talking to a single router pod. During
the v4 refactor, this assumption broke: the test's connection to the router was
load-balanced across the router pods behind the router service, which made the
test flaky.

Refactor the test to discover and operate against a single ingress endpoint.
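
To illustrate the idea of pinning the test to one endpoint, here is a minimal sketch, not the code from this PR. The `endpoint` type, the `pickSingleEndpoint` helper, and the pod names, IPs, and port are all hypothetical; a real test would read the endpoints from the cluster API rather than a literal slice.

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"strconv"
)

// endpoint is a hypothetical stand-in for one router pod behind the router service.
type endpoint struct {
	PodName string
	IP      string
	Port    int
}

// pickSingleEndpoint returns the host:port of exactly one endpoint.
// It sorts a copy by pod name so the choice is deterministic, which means
// every request in a test run reaches the same router pod.
func pickSingleEndpoint(eps []endpoint) (string, error) {
	if len(eps) == 0 {
		return "", fmt.Errorf("no router endpoints found")
	}
	sorted := append([]endpoint(nil), eps...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].PodName < sorted[j].PodName
	})
	first := sorted[0]
	return net.JoinHostPort(first.IP, strconv.Itoa(first.Port)), nil
}

func main() {
	// Illustrative values only (assumptions, not taken from the PR).
	eps := []endpoint{
		{PodName: "router-b", IP: "10.128.0.7", Port: 1936},
		{PodName: "router-a", IP: "10.128.0.5", Port: 1936},
	}
	addr, err := pickSingleEndpoint(eps)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(addr) // prints 10.128.0.5:1936
}
```

Because the test then sends requests and scrapes metrics from the same address, the counters it compares come from one pod instead of whichever pod the service happened to select.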

@openshift-ci-robot

@ironcladlou: This pull request references Bugzilla bug 1683057, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

Bug 1683057: Fix router metrics test flakes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot added the bugzilla/valid-bug, size/L, and approved labels Nov 1, 2019
Contributor

@frobware frobware left a comment

LGTM, though I think calling foo() without the _ = assignment was clearer. Either way, the return value is ignored.

@ironcladlou
Contributor Author

LGTM, though I think calling foo() without the _ = assignment was clearer. Either way, the return value is ignored.

Oops... my editor's linter auto-refactored these! Wonder if the std linter likes one or the other better.

And yeah, I guess I should actually do something with the error anyway. Too bad the editor couldn't do that for me too...
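
For readers following the style question above, here is a small sketch of the three options being discussed. The function `foo` is a hypothetical placeholder for any call that returns only an error; it is not a function from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// foo is a hypothetical stand-in for any call that returns only an error.
func foo() error {
	return errors.New("connection reset")
}

// run shows both ways of ignoring the error, then the handled form.
func run() error {
	foo()     // error silently dropped; compiles, but linters such as errcheck typically flag it
	_ = foo() // error explicitly discarded; same behavior, with the intent made visible

	// Handling the error instead of ignoring it.
	if err := foo(); err != nil {
		return fmt.Errorf("foo failed: %w", err)
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Println(err)
	}
}
```

The first two forms behave identically at run time; only the third surfaces the failure to the caller.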

@ironcladlou force-pushed the fix-router-metrics-test branch from 0c9db13 to b9c5de4 on November 1, 2019 17:56
@ironcladlou force-pushed the fix-router-metrics-test branch from b9c5de4 to ea01775 on November 1, 2019 17:57
@ironcladlou
Contributor Author

@frobware rolled back the unnecessary refactor, PTAL

@frobware
Contributor

frobware commented Nov 1, 2019

@frobware rolled back the unnecessary refactor, PTAL

/lgtm

@openshift-ci-robot added the lgtm label Nov 1, 2019
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: frobware, ironcladlou

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot merged commit 78c71bf into openshift:master Nov 2, 2019
@openshift-ci-robot

@ironcladlou: All pull requests linked via external trackers have merged. Bugzilla bug 1683057 has been moved to the MODIFIED state.

@smarterclayton
Contributor

I think this substantially increased flakes in AWS

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_oc/143/pull-ci-openshift-oc-master-e2e-aws/478

Going to revert, then we can retest in e2e-aws premerge

ironcladlou added a commit to ironcladlou/origin that referenced this pull request Nov 4, 2019
On AWS, the default router speaks PROXY protocol. The fix in
openshift#24075 switched some router tests to
(correctly) speak to the router directly. However, the fix did not update client
code to speak PROXY to the router on AWS. The tests still sometimes pass on AWS
by coincidence, because other non-test clients send traffic to the router
through the LB, which causes router stats to sometimes match test expectations.

Fix the tests so that test clients talking to routers use PROXY protocol on AWS.
@ironcladlou
Contributor Author

#24085 should fix the AWS flake
