Skip to content

Conversation

@ramr
Copy link
Contributor

@ramr ramr commented Dec 11, 2018

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 11, 2018
# Name is set at runtime.
namespace: openshift-ingress
annotations:
service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This service is created for any cloud provider environments (AWS, GCP, etc.) but the new annotation is specific to AWS (or the annotation name is misleading?)
May be we could inject the annotation only when the platform is AWS? Similar to what we do here:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the service and the deployment both need configured as a unit when we're using a cloud HA strategy and we're on a supported cloud.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2018
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 11, 2018
@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

/retest

@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

@pravisankar addressed your comments. PTAL thx

Copy link

@pravisankar pravisankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest of the changes LGTM

s.Spec.Selector["router"] = name

if f.installConfig.Platform.AWS != nil {
if s.Annotations == nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not needed, s.Annotations is already initialized here:

if s.Annotations == nil {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh yeah!! Am still zonked with all the meds. Fixing now ... Thanks.

@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

hmm, the /test e2e-aws is failing because of proxy support enabled.

Copy link

@pravisankar pravisankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Dec 12, 2018
@pravisankar
Copy link

/hold
e2e-aws tests need to be fixed

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 12, 2018
@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

It is like a catch-22-thingy. The feature is to automatically enable the proxy protocol on haproxy (and the aws LB) but the tests that run can't with proxy protocol enabled on haproxy I think. Going to retest a few times just to confirm.

@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

/retest

1 similar comment
@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

/retest

}
s.Spec.Selector["router"] = name

if f.installConfig.Platform.AWS != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be in RouterServiceCloud which is the LB service?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it was -- that's the reason why was also initializing the annotations which I removed last night as per Ravi's comments but didn't look closely. It somehow looks like rebasing it moved it to the previous block/function. I fixed this but there's another issue will put in a separate comment.

@ironcladlou
Copy link
Contributor

Could we do an e2e test for this?

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Dec 12, 2018
@ramr
Copy link
Contributor Author

ramr commented Dec 12, 2018

@ironcladlou there's a problem with automatically enabling Proxy Protocol support on both haproxy and the AWS cloud load balancer.

The reason is when we enable this - the haproxy router only speaks the proxy protocol and doesn't understand http[s] - it expects the cloud load balancer to send a proxy protocol request wrapping the http request.

Now a consequence of this is that all the tests that use curl or equivalent http requests in golang code to talk directly to the haproxy router will fail. As the router's not going to understand that request. It is like a catch-22 on the environments where we run the unit tests and enable this. It makes sense for production env like settings (but for dev/ci it will cause problems).

My 2 bits would be to make this not be the default in the case tests are run.
Either let this setting be controlled by a ClusterIngressSpec field ala EnableLoadBalancerProxyProtocol == true. Or alternatively use the reverse like a env variable or something to always disable it for the ci/tests environments. Thoughts? Thx

@ironcladlou
Copy link
Contributor

Assuming we want PROXY to be the default when supported (we can discuss that assumption separately) — what if during tests we talked to the router through as many supported layers as possible to maximize our test coverage? For example, if on the default AWS install we assert support for AWS ELB in front of the routers, then tests executing in that configuration could speak to the router through its ELB. Thoughts?

P.S., do you already have a list of which tests are trying to talk to the router directly?

@ironcladlou
Copy link
Contributor

/cc @smarterclayton re: the test strategy

@ramr
Copy link
Contributor Author

ramr commented Dec 13, 2018

@ironcladlou answers below:

  1. So my only concern re: PROXY protocol being default or not is that in the case it is the default,
    there is really no way to turn that off in case someone wants to do. Well other than either remove
    aws creds or change ClusterIngressSpec.HighAvailability. We have seen environments that have
    one user-[micro]-service is talking another using the router others where they are actually
    sending health checks for apps using HTTP internally in the cluster. With the PROXY protocol that's
    not going to work - change needed to use the the external ELB.

  2. We could talk thru' the AWS LB for the tests but since they are written at various times/people -- there's a lot of different tools those do end up using and so different fixes neeed.
    Example: https://github.com/openshift/origin/blob/master/test/extended/router/router.go#L69 use the url helper but https://github.com/openshift/origin/blob/master/test/extended/router/scoped.go#L275 uses curl directly etc ... not clean!

  3. So as re: list of tests - most of the routers extended tests are in
    https://github.com/openshift/origin/tree/master/test/extended/router
    but I don't think that would be the entire set. I just ran a simple grep curl origin/test as an example
    and showed up a few others ala: https://github.com/openshift/origin/blob/master/test/extended/builds/service.go#L19

Thx

@ramr
Copy link
Contributor Author

ramr commented Dec 17, 2018

Depends on openshift/origin#21680 to fix a few of the broken tests ...

@openshift-ci-robot openshift-ci-robot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 18, 2018
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jan 15, 2019
@ramr
Copy link
Contributor Author

ramr commented Jan 15, 2019

Rebased this but I think I still need some help with testing this ... I can't boot up an installer cluster with the HEAD - I can only boot up a cluster with v0.90 but that doesn't have the Infrastructure changes. Plus I think origin router tests may need a change.

@ramr
Copy link
Contributor Author

ramr commented Jan 15, 2019

/retest

Flake: error: unable to upload the new image manifest: received unexpected HTTP status: 504 Gateway Time-out

@knobunc
Copy link

knobunc commented Jan 16, 2019

/retest

@ramr
Copy link
Contributor Author

ramr commented Jan 16, 2019

FYI, the router tests will fail until they send requests to the ELB rather than the router service (since the router service will now only "speak" the proxy protocol). This needs openshift/origin#21799 to merge.

@ramr
Copy link
Contributor Author

ramr commented Jan 17, 2019

/retest

1 similar comment
@ramr
Copy link
Contributor Author

ramr commented Jan 18, 2019

/retest

@ramr
Copy link
Contributor Author

ramr commented Jan 23, 2019

/test all

@ironcladlou
Copy link
Contributor

Looks like this is once again good to go.

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 23, 2019
@ironcladlou
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jan 23, 2019
@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ironcladlou, Miciah, pravisankar, ramr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [Miciah,ironcladlou,pravisankar,ramr]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ironcladlou
Copy link
Contributor

/retest

@ramr
Copy link
Contributor Author

ramr commented Jan 23, 2019

/retest

last error: "waiting for openshift-console URL: context deadline exceeded"

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@ironcladlou
Copy link
Contributor

/retest

1 similar comment
@ironcladlou
Copy link
Contributor

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 9fb155f into openshift:master Jan 25, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants