Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to multiple match types #641

Merged
merged 15 commits into from
Aug 21, 2020
Merged

Add support to multiple match types #641

merged 15 commits into from
Aug 21, 2020

Conversation

jcmoraisjr
Copy link
Owner

@jcmoraisjr jcmoraisjr commented Aug 11, 2020

Refactor in the HAProxy's frontend map builder to allow new match types.

The supported types are:

  • Begin: case insensitive check of the start of the path, where /api matches /api/v1 and also /apiv1. This is the current behavior.
  • Exact: case sensitive check of the full path.
  • Prefix: case sensitive check of the start of the path, where /api matches /api/v1 but does not match /apiv1.
  • Regex: case sensitive check of the path using an extended regular expression.

Begin, Exact and Prefix are configurable via k8s 1.18 ingress spec, which defaults to Begin. All of them are configurable via annotation. The ingress spec configuration has precedence over annotation config in the case of a conflict.

Tasks:

  • Cleanup map config
  • Refactor frontend map builder
  • Implement new match types - Exact and Prefix
  • Implement pathType (from k8s 1.18 spec) and path-type annotation
  • Implement crt-list builder - remove Empty match type
  • Ignore maps without content - remove from the cfg and do not create the file
  • Exploratory tests
  • Docs

There were two redirect configs due to the possibility of frontend split when hosts configurations conflict. All the conflicts were removed, a frontend cannot split anymore and the redirect maps can be merged into a single one.
use_server was used to reference which frontend owns a request on a multi frontend configuration. Since the removal of the frontend split option, use_server list can also be removed.
frontend name was used to name them on a multi-frontend configuration, which doesn't exist anymore
Frontend now is unique and global, so we don't need scoped and numbered map prefix. This commit changes such prefix and also uses a better naming.
MatchType allows to configure how a hostname and/or path received will match against a map/list of hostnames and/or paths used to route requests or apply constraints. This is a pre requisite to path as regex and PathType from the k8s 1.18 ingress spec.

HostsMap was rewritten from scratch and its rewrite changed the naming pattern (again) of the HAProxy maps. Most of the changes in the template and instance_test refers to this naming change.
Up to v0.10 map files was just simple and regex, where simple was always created. This is being changed now that we can have up to four files (and counting) per filter type. So lets render, write and load map files only when needed, adjusting the configuration file to behave just like it was empty - without the need to create and load it.

At this moment only the current implementation was migrated: Begin and Regex match.
@sergiimk
Copy link

Hi @jcmoraisjr, I have a question related to this PR if you don't mind:

We started using haproxy-ingress around February when we ran into missing support for regex matching and had to implement solution I shared in #551.

That solution allowed us to create multiple deployments of the same service and route traffic based on URLs to one or another such deployment.

For example if we have my-service application we would create deployments like:

  • my-service-foobar
  • my-service-baz
  • my-service-default
    ... where foobar and baz groups would handle traffic only for some URLs configured via three different Ingress objects with rules like:
# "foo-bar" ingress
# handles /v1/foo/ and /v1/bar/
---
  rules:
  - host: my-service.kube.example.net
    http:
      paths:
      - backend:
          serviceName: my-service-foobar
          servicePort: http
        path: /v[0-9]+/(foo|bar)/

# "baz" ingress
# handles /v1/baz/
---
  rules:
  - host: my-service.kube.example.net
    http:
      paths:
      - backend:
          serviceName: my-service-baz
          servicePort: http
        path: /v[0-9]+/baz/

# "default" ingress
# handles all other paths
---
  rules:
  - host: my-service.kube.example.net
    http:
      paths:
      - backend:
          serviceName: my-service-default
          servicePort: http
        path: /

Note that all three ingresses are using the same host, so we relied on haproxy-ingress to merge the rules together for us.

Yesterday I have rebased #551 onto latest master and I noticed that this merging behaviour seems to be gone now. Instead of having multiple host-paths in the /etc/haproxy/maps/_global_http_front_regex.map controller only uses the first ingress it discovered for this host.

My question is:

  • Is this behavior expected
  • Is support for such merging coming back in this PR, or you think merging multiple Ingress objects with the same host was invalid behavior and we shouldn't have relied on it?

@jcmoraisjr
Copy link
Owner Author

Hi, merging distinct paths from distinct ingress objects and sharing the same hostname is a supported feature and will continue to be. I'll tag a new snapshot within the next few days, as soon as this PR is merged. If you find this functionality broken please file a new issue with reproducible steps.

This is just a rename, without change in the behavior.

TLS auth lists are used to match incoming domains against a list of hostnames with optional or mandatory client certificate. The old names described an expected behavior in the case of an error inspecting the client crt. The new names properly describe the meaning of the list: `TLSAuthList` has hostnames which uses tls client auth, `TLSNeedCrtList` has hostnames whose client crt is mandatory.
A new match type in haproxy is done using a new string match or a new converter derivative. The model configuration is abstract, only need to create a new `MatchType` constant and use it as a path attribute.

Most of the changes are adding new `http-request set-var()` keywords reading a new `map()` converter derivative - `map()` when exact match and `map_dir()` when prefix match. Begin was already implemented and using `map_beg()`, regex was already implemented as well (wildcards and alias-regex annotation) and using `map_reg()`.

Most of the test changes reflect the impact of adding a new acl on `map_beg()` due to the following chain:

- `setvar(v) fetch,map()`
- `setvar(v) fetch,map_dir() if !{ var(v) -m found }`
- `setvar(v) fetch,map_beg() if !{ var(v) -m found }`
- `setvar(v) fetch,map_reg() if !{ var(v) -m found }`

The `map_beg()` line didn't need a `-m found` acl before `map()` and `map_dir()` implementation, because of that his commit does not have new tests - they will be added in a new commit on behalf of visibility of the changes.
Reads and configures path type info from ingress spec, k8s 1.18 and above, and path-type configuration key. Ingress spec has precedente if provided, the default value is `ImplementationSpecific` which will use the annotation value. The default annotation value is `begin`, that matches with the old behavior.
Namespace map was being created despite the var-namespace configuration. This commit changes this behavior and also uses a faster config verifier when building the configuration file.
A PathLink uniquely identifies a hostname and a path. It is used during the configuration building to identify the target(s) of a configuration, which the config file generator uses to build keywords and ACLs.

The old implementation just concatenate hostname and path, which doesn't work well on regex match, so this refactor is a pre requisite to properly implement it.
Change the regex match path type to not use implicit begin-of-line and end-of-line commands, resulting in a more intuitive configuration and compatible with other web servers. The problem of this approach is that hostname and path are concatenated and used together to match domain and path in a single step. The expected behavior was preserved changing the presence or absence of the begin/end of line between hostname and path to a proper regex.
All domain and path validations use the same keywords with almost the same ACLs to analyse input requests. All similar fetches and converters are now declared just once inside a for-loop, resulting in a smarter final configuration and also easier to extend to more match types.
The only match type up to now was begin, which is case insensitive. This was implemented lowering all `host`, `path` and `base` fetch methods as well as the hostnames and paths declared in the ingress object - `map()` converter does not have a case insensitive version.

Now lower is only used in the hostname and map files used to match the `begin` path type, other path types are case sensitive and are compared as is.
@jcmoraisjr jcmoraisjr changed the title WIP: Add support to multiple match types Add support to multiple match types Aug 21, 2020
@jcmoraisjr jcmoraisjr merged commit 7f6b4f1 into master Aug 21, 2020
@jcmoraisjr jcmoraisjr deleted the jm-path-type branch August 21, 2020 01:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants