Merged
Changes from 8 commits
88 changes: 56 additions & 32 deletions apis/v1beta1/gateway_types.go
@@ -85,38 +85,62 @@ type GatewaySpec struct {
//
// Port and protocol combinations not listed above are considered Extended.
//
// An implementation MAY group Listeners by Port and then collapse each
// group of Listeners into a single Listener if the implementation
// determines that the Listeners in the group are "compatible". An
// implementation MAY also group together and collapse compatible
// Listeners belonging to different Gateways.
//
// For example, an implementation might consider Listeners to be
// compatible with each other if all of the following conditions are
// met:
//
// 1. Either each Listener within the group specifies the "HTTP"
// Protocol or each Listener within the group specifies either
// the "HTTPS" or "TLS" Protocol.
//
// 2. Each Listener within the group specifies a Hostname that is unique
// within the group.
//
// 3. As a special case, one Listener within a group may omit Hostname,
// in which case this Listener matches when no other Listener
// matches.
//
// If the implementation does collapse compatible Listeners, the
// hostname provided in the incoming client request MUST be
// matched to a Listener to find the correct set of Routes.
// The incoming hostname MUST be matched using the Hostname
// field for each Listener in order of most to least specific.
// That is, exact matches must be processed before wildcard
// matches.
//
// If this field specifies multiple Listeners that have the same
// Port value but are not compatible, the implementation must raise
// a "Conflicted" condition in the Listener status.
// A Gateway's Listeners are considered "compatible" if:
//
// 1. The implementation can serve them in compliance with the Addresses
// requirement that all Listeners are available on all assigned
// addresses.
// 2. The implementation can match inbound requests to a single distinct
// Listener. When multiple Listeners share values for fields (for
// example, two Listeners with the same Port value), the implementation
// can match requests to only one of the Listeners using other
// Listener fields.
//
// Compatible combinations in Extended support are expected to vary across
// implementations. A combination that is compatible for one implementation
// may not be compatible for another.
//
// If this field specifies multiple Listeners that are not compatible, the
// implementation MUST set the "Conflicted" condition in the Listener
// Status to "True".
//
// Implementations MAY choose to still accept a Gateway with conflicted
// Listeners if they accept a partial Listener set that contains no
// incompatible Listeners. They MUST set a "ListenersNotValid" condition
Member

Nit: ListenersNotValid is a reason that can be used with "Accepted", generally to set the condition to "False". I'm not sure what we'd recommend in the case where Listeners were not valid and the Gateway was accepted.

Contributor Author
@rainest, Aug 15, 2023

We say it can be used when Accepted is True. From the first point in my self-review, this is indeed a confusing case, but probably one that's intentionally ambiguous because we expect it to vary so much between implementations.

If we want to leave that as-is, the change here is just to say (here) that implementations must set this, whereas previously it wasn't a strict requirement, and wasn't obvious unless you looked through the reason comments. Even if it changes, I think we should mention something here.

Changing the current vague guidelines to something more formal would be a significant breaking change. Between that and the expected variance across vendors, my vote would be to defer it to post GA, when we have more practical experience regarding which approaches are in use and their pros and cons.

// the Gateway Status when the Gateway contains incompatible Listeners
// whether or not they accept the Gateway.
Comment on lines +107 to +111
Member

I think the only reason we'd want to allow this is if a Gateway was already programmed with compatible listeners and then invalid/incompatible one(s) were added. It gets really tricky to represent this state. I may be remembering wrong here, but I think on GKE we will leave "Programmed" set to "True" with the last generation it was Programmed, but I don't think we have any clear guidance for these kinds of partially valid states in the spec.

//
// For example, the following Listener scenarios may be compatible
// depending on implementation capabilities:
//
// 1. Multiple Listeners with the same Port that all use the "HTTP"
Contributor

this example is a core feature

I wonder if "may be compatible" is applicable to it

Contributor Author

Compatibility is weird because it ends up being about implementation capabilities rather than rules in the spec. If your implementation isn't compatible with something the core requires, it just can't implement the spec.

We could bring this further out of core by moving it outside the language above and mentioning specific ports, e.g. "Multiple Listeners with Port "9999" that all use the "HTTP" protocol" (and similar for the other examples, just to make them consistent).

We could also separate examples that are within core or outside it, but it wouldn't be great for brevity.

// Protocol that all have unique Hostname values.
Member

I know this is just an example, so it may not be critical here, but it might be helpful to define whether *.example.com and foo.example.com would be considered unique here.

Contributor Author

8a9acfb adds this to the following Hostname precedence section. Do you think that works? It's not explicitly saying they are considered distinct, but it is using them as distinct values in the context where it matters.
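
To make the uniqueness question concrete, here is a minimal Go sketch. It assumes Hostname values are compared as literal strings, so `*.example.com` and `foo.example.com` count as distinct on the same Port. The `Listener` struct and the `conflicted` helper are hypothetical simplifications for illustration and are not taken from the Gateway API codebase.

```go
package main

import "fmt"

// Listener is a hypothetical, simplified stand-in for the Gateway API
// Listener type; only the fields this sketch needs are included.
type Listener struct {
	Name     string
	Port     int
	Hostname string // "" means no Hostname (the fallback Listener)
}

// conflicted returns the names of Listeners that share both Port and
// Hostname with at least one other Listener. Hostnames are compared as
// literal strings (an assumption of this sketch).
func conflicted(listeners []Listener) []string {
	type key struct {
		port     int
		hostname string
	}
	count := map[key]int{}
	for _, l := range listeners {
		count[key{l.Port, l.Hostname}]++
	}
	var names []string
	for _, l := range listeners {
		if count[key{l.Port, l.Hostname}] > 1 {
			names = append(names, l.Name)
		}
	}
	return names
}

func main() {
	ls := []Listener{
		{"wildcard", 80, "*.example.com"},
		{"foo", 80, "foo.example.com"},
		{"foo-dup", 80, "foo.example.com"},
	}
	fmt.Println(conflicted(ls)) // prints [foo foo-dup]
}
```

Under this reading, an implementation would set "Conflicted" only on `foo` and `foo-dup`; whether the wildcard Listener is truly servable alongside them still depends on implementation capabilities.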

// 2. Multiple Listeners with the same Port that use either the "HTTPS" or
// "TLS" Protocol that all have unique Hostname values.
// 3. A mixture of "TCP" and "UDP" Protocol Listeners, where no Listener
// with the same Protocol has the same Port value.
//
// An implementation that cannot serve both TCP and UDP listeners on the same
// address, or cannot mix HTTPS and generic TLS listeners on the same port
// would not consider those cases compatible.
//
// Implementations using the Hostname value to select between same-Port
// Listeners MUST match inbound request hostnames from the most specific
// to least specific Hostname values to find the correct set of Routes.
// Exact matches must be processed before wildcard matches, and wildcard
// matches must be processed before fallback (empty Hostname value)
// matches. For example, `"foo.example.com"` takes precedence over
// `"*.example.com"`, and `"*.example.com"` takes precedence over `""`.
//
// Implementations SHOULD NOT match requests to less specific Listeners if
Contributor

I think cascading listener matches make sense for some cases.

Let's consider an example below:

Gateway:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: gateway
spec:
  gatewayClassName: nginx
  listeners:
  - name: example
    port: 80
    protocol: HTTP
    hostname: "*.example.com"
  - name: cafe-http
    port: 80
    protocol: HTTP
    hostname: "cafe.example.com"

HTTPRoutes:

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: coffee
spec:
  parentRefs:
  - name: gateway
    sectionName: example
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /coffee
    backendRefs:
    - name: coffee
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: tea
spec:
  parentRefs:
  - name: gateway
    sectionName: cafe-http
  hostnames:
  - "*.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /tea
    backendRefs:
    - name: tea
      port: 80

The way we interpret this in NKG is two rules:

  • Hostname: cafe.example.com Path: /tea
  • Hostname: cafe.example.com Path: /coffee

As you can see, although those rules come from HTTPRoutes belonging to different listeners, they end up sharing the same hostname cafe.example.com

However, this goes against "For example, a request for subdomain.example.com/path SHOULD NOT match an HTTPRoute attached to a Listener with Hostname *.example.com if there is a Listener with Hostname subdomain.example.com, even if no routes on the second Listener match". In my example, a request for cafe.example.com/coffee will fail to match the tea HTTPRoute on the more specific listener cafe.example.com but will match the coffee HTTPRoute on the less specific listener *.example.com.

Perhaps the problem here is that in Gateway API we allow hostname of one listener (like *.example.com) to include the hostname of another listener (like cafe.example.com).
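
The difference between the proposed SHOULD NOT wording and the cascading behavior described above can be sketched in Go. This is a simplified model under stated assumptions: the `Listener`, `Route`, `specificity`, and `route` names are hypothetical, Route hostnames are assumed to have already been intersected with the Listener hostname, and only path-prefix matching is modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// Route and Listener are hypothetical, simplified types for this sketch.
type Route struct {
	Name       string
	PathPrefix string
}

type Listener struct {
	Name     string
	Hostname string
	Routes   []Route
}

// specificity ranks how a Listener hostname matches a request host:
// 3 = exact, 2 = wildcard, 1 = empty fallback, 0 = no match.
func specificity(listenerHost, reqHost string) int {
	switch {
	case listenerHost == "":
		return 1
	case listenerHost == reqHost:
		return 3
	case strings.HasPrefix(listenerHost, "*.") &&
		len(reqHost) > len(listenerHost[1:]) &&
		strings.HasSuffix(reqHost, listenerHost[1:]):
		return 2
	}
	return 0
}

// route returns the name of the matched Route, or "" (think 404).
// With cascade=false, the search stops at the most specific matching
// Listener even if none of its Routes match, as the proposed text says.
func route(listeners []Listener, host, path string, cascade bool) string {
	for level := 3; level >= 1; level-- {
		for _, l := range listeners {
			if specificity(l.Hostname, host) != level {
				continue
			}
			for _, r := range l.Routes {
				if strings.HasPrefix(path, r.PathPrefix) {
					return r.Name
				}
			}
			if !cascade {
				return ""
			}
		}
	}
	return ""
}

func main() {
	listeners := []Listener{
		{Name: "example", Hostname: "*.example.com", Routes: []Route{{"coffee", "/coffee"}}},
		{Name: "cafe-http", Hostname: "cafe.example.com", Routes: []Route{{"tea", "/tea"}}},
	}
	for _, path := range []string{"/coffee", "/tea"} {
		fmt.Printf("%s strict=%q cascade=%q\n", path,
			route(listeners, "cafe.example.com", path, false),
			route(listeners, "cafe.example.com", path, true))
	}
}
```

In this model, `/coffee` on `cafe.example.com` matches nothing in strict mode and matches `coffee` only when cascading is allowed, which is the disagreement this thread is about.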

Member

@pleshakov how does this work when the listeners are HTTPS? I'm assuming you would terminate based on the cert(s) specified on one listener and would not want to cascade to other listeners?

Member

Actually have the same question for @rainest since I think both NKG and Kong may have similar behavior here.

Contributor
@pleshakov, Sep 19, 2023

I further tried the example #2288 (comment) on the following implementations (I put the files into a gist -- https://gist.github.com/pleshakov/9e890e359ee47c4be908f13590ee9377 ):

| Implementation | curl http://cafe.example.com/coffee | curl http://cafe.example.com/tea |
|---|---|---|
| Contour ghcr.io/projectcontour/contour:v1.26.0 | 200 from coffee | 200 from tea |
| Envoy Gateway docker.io/envoyproxy/gateway:latest | 200 from coffee | 404 |
| GKE gke-l7-global-external-managed, v1.27.3-gke.100 | 200 from coffee | 200 from tea |

latest = docker.io/envoyproxy/gateway-dev@sha256:db1e923970985d8d30183e9647c6dbb0fcf93674d1e12d5269c7cc2540154d12

As you can see, in addition to NGINX Kubernetes Gateway, listener cascading also happens on Contour and GKE, so it does not appear only in NGINX-based implementations.

In the Gateway meeting on Sep 18, it was mentioned that Envoy has one-level Host header (virtual host) matching, which sounds similar to NGINX. That makes me believe the cascading behavior is a consequence of the configuration method used rather than data-plane-specific behavior (of Envoy or NGINX).

This makes me wonder if we need to define those expectations as MUSTs, not SHOULDs, and define conformance tests too, provided other data planes could also support that.

@pleshakov how does this work when the listeners are HTTPS? I'm assuming you would terminate based on the cert(s) specified on one listener and would not want to cascade to other listeners?

@robscott I will also do some HTTPS / SNI testing shortly

Contributor Author

Kong selects certificates independent of any HTTP routing. We bind certificates to a set of hostnames and choose whichever cert is associated with the most specific match for the handshake SNI value. HTTP routing happens after based on the HTTP host header. We don't really have an equivalent of the NGINX server_name directive that effectively selects both a certificate and a set of possible HTTP routes at once.

Our HTTP routes can optionally specify a SNI value to also consider the SNI value for HTTP route selection, but this isn't commonly used--usually only if you have configuration that must evaluate before HTTP request data is available, namely the client cert enforcement mechanism.

Most non-certificate TLS configuration is global. Choosing, for example, a separate set of cipher suites generally requires a separate instance.

I'm unsure whether we can support both passthrough and terminate on the same IP+port, that is, whether we could, for example, cascade from a Terminate Listener for foo.example.com:9999 to a Passthrough Listener for *.example.com:9999. I think we can request it via configuration, but offhand I don't know that the routing engine would handle it properly.

Contributor
@arkodg, Sep 19, 2023

hey @pleshakov are you sure you have typed out the sectionName correctly in the HTTPRoute ?
sidenote - i tested this on Envoy Gateway with the latest code and both cases (curl http://cafe.example.com/coffee & http://cafe.example.com/tea) 404

however, here is what you get if you swap the sectionName

apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: coffee
spec:
  parentRefs:
  - name: gateway
    sectionName: cafe-http
  hostnames:
  - "cafe.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /coffee
    backendRefs:
    - name: backend
      port: 3000
---
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
  name: tea
spec:
  parentRefs:
  - name: gateway
    sectionName: example 
  hostnames:
  - "*.example.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /tea
    backendRefs:
    - name: backend 
      port: 3000
curl -I --verbose --header "Host: cafe.example.com" http://localhost:80/tea
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> HEAD /tea HTTP/1.1
> Host: cafe.example.com
> User-Agent: curl/7.86.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
< date: Tue, 19 Sep 2023 23:53:32 GMT
date: Tue, 19 Sep 2023 23:53:32 GMT
< server: envoy
server: envoy
< transfer-encoding: chunked
transfer-encoding: chunked

< 
* Connection #0 to host localhost left intact
$ curl -I --verbose --header "Host: cafe.example.com" http://localhost:80/coffee
*   Trying 127.0.0.1:80...
* Connected to localhost (127.0.0.1) port 80 (#0)
> HEAD /coffee HTTP/1.1
> Host: cafe.example.com
> User-Agent: curl/7.86.0
> Accept: */*
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< content-type: application/json
content-type: application/json
< x-content-type-options: nosniff
x-content-type-options: nosniff
< date: Tue, 19 Sep 2023 23:53:44 GMT
date: Tue, 19 Sep 2023 23:53:44 GMT
< content-length: 516
content-length: 516
< x-envoy-upstream-service-time: 0
x-envoy-upstream-service-time: 0
< server: envoy
server: envoy

< 
* Connection #0 to host localhost left intact

Contributor

@arkodg

hey @pleshakov are you sure you have typed out the sectionName correctly in the HTTPRoute ?
sidenote -

yep. the idea of this example is for the /coffee HTTPRoute with cafe.example.com hostname to attach to the *.example.com listener and for the /tea HTTPRoute with *.example.com hostname to attach to the cafe.example.com listener.

Most implementations I tested (GKE, Contour, NGINX Gateway) end up creating two routing rules:

  • Hostname: cafe.example.com Path: /tea
  • Hostname: cafe.example.com Path: /coffee

So although those rules come from HTTPRoutes attached to different listeners, they end up sharing the same hostname cafe.example.com of the listener "cafe.example.com"

Envoy Gateway creates only one routing rule:

  • Hostname: cafe.example.com Path: /tea
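
Why both rules end up under `cafe.example.com` can be illustrated with a small Go sketch of hostname intersection between a Listener and an attached HTTPRoute. This assumes the intersection keeps the more specific of the two hostnames when a wildcard on one side covers the other; `covers` and `effectiveHostname` are hypothetical helper names, not functions from any implementation mentioned here.

```go
package main

import (
	"fmt"
	"strings"
)

// covers reports whether a wildcard pattern such as "*.example.com"
// matches host. Non-wildcard patterns never "cover" anything here.
func covers(pattern, host string) bool {
	if !strings.HasPrefix(pattern, "*.") {
		return false
	}
	suffix := pattern[1:] // e.g. ".example.com"
	return len(host) > len(suffix) && strings.HasSuffix(host, suffix)
}

// effectiveHostname returns the hostname a routing rule would be
// programmed under when a Route with routeHost attaches to a Listener
// with listenerHost, and false if the two do not intersect.
// An empty string means "no hostname specified" on that side.
func effectiveHostname(listenerHost, routeHost string) (string, bool) {
	switch {
	case listenerHost == "":
		return routeHost, true
	case routeHost == "" || routeHost == listenerHost:
		return listenerHost, true
	case covers(listenerHost, routeHost):
		return routeHost, true
	case covers(routeHost, listenerHost):
		return listenerHost, true
	}
	return "", false
}

func main() {
	// coffee: Route "cafe.example.com" on Listener "*.example.com"
	fmt.Println(effectiveHostname("*.example.com", "cafe.example.com")) // cafe.example.com true
	// tea: Route "*.example.com" on Listener "cafe.example.com"
	fmt.Println(effectiveHostname("cafe.example.com", "*.example.com")) // cafe.example.com true
}
```

Both attachments resolve to the same effective hostname, so an implementation that keys its data-plane configuration on that hostname alone would naturally merge the two rules.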

Contributor
@pleshakov, Sep 20, 2023

@robscott

@pleshakov how does this work when the listeners are HTTPS? I'm assuming you would terminate based on the cert(s) specified on one listener and would not want to cascade to other listeners?

I extended the example to include TLS termination using two certs: CN *.example.com for the example listener and CN cafe.example.com for the cafe listener -- https://gist.github.com/pleshakov/607bec3a9e617435fce3d9574806a7c4

Below are the results:

| Implementation | Coffee curl | Tea curl | Coffee mismatch host header and SNI curl | Tea mismatch host header and SNI curl |
|---|---|---|---|---|
| Contour ghcr.io/projectcontour/contour:v1.26.0 | 200 from coffee, CN=cafe.example.com | 200 from tea, CN=cafe.example.com | error: Connection reset by peer | error: Connection reset by peer |
| Envoy Gateway docker.io/envoyproxy/gateway:latest | 404, CN=cafe.example.com | 200 from tea, CN=cafe.example.com | 200 from coffee, CN=*.example.com | 404, CN=*.example.com |
| GKE gke-l7-global-external-managed, v1.27.3-gke.100 | 200 from coffee, CN=cafe.example.com | 200 from tea, CN=cafe.example.com | 200 from coffee, CN=*.example.com | 200 from tea, CN=*.example.com |
| NGINX Gateway Fabric, edge* | 200 from coffee, CN=cafe.example.com | 200 from tea, CN=cafe.example.com | error: routines:ST_CONNECT:tlsv1 unrecognized name | error: routines:ST_CONNECT:tlsv1 unrecognized name |

edge = sha256:c8040a0d968911496123fd75087a42606f2bdc00b7ec4bf27be280e6bf1a3347
latest = docker.io/envoyproxy/gateway-dev@sha256:db1e923970985d8d30183e9647c6dbb0fcf93674d1e12d5269c7cc2540154d12

Note: CN=... means the CN of the TLS cert used by the data plane, as reported by curl

Coffee curl

curl --resolve cafe.example.com:$GW_HTTPS_PORT:$GW_IP https://cafe.example.com:$GW_HTTPS_PORT/coffee --insecure -v

Tea curl

curl --resolve cafe.example.com:$GW_HTTPS_PORT:$GW_IP https://cafe.example.com:$GW_HTTPS_PORT/tea --insecure -v

Coffee mismatch host header and SNI curl

curl --resolve some.example.com:$GW_HTTPS_PORT:$GW_IP https://some.example.com:$GW_HTTPS_PORT/coffee --insecure -v -H "host: cafe.example.com"

Tea mismatch host header and SNI curl

curl --resolve some.example.com:$GW_HTTPS_PORT:$GW_IP https://some.example.com:$GW_HTTPS_PORT/tea --insecure -v -H "host: cafe.example.com"

Looking at the results, the examined implementations at least have the same behavior in one column - Tea curl :)

// they exhaust attached routes on a more specific Listener. For example, a
// request for `subdomain.example.com/path` SHOULD NOT match an HTTPRoute
// attached to a Listener with Hostname `*.example.com` if there is a
// Listener with Hostname `subdomain.example.com`, even if no routes
// on the second Listener match.
//
// Implementations MAY merge separate Gateways onto a single set of
// Addresses if all Listeners across all Gateways are compatible.
//
// Support: Core
//
128 changes: 84 additions & 44 deletions config/crd/experimental/gateway.networking.k8s.io_gateways.yaml