Skip to content

WIP POC: integrate matching API with router#17037

Closed
snowp wants to merge 4 commits intoenvoyproxy:mainfrom
snowp:router-matcher-poc
Closed

WIP POC: integrate matching API with router#17037
snowp wants to merge 4 commits intoenvoyproxy:mainfrom
snowp:router-matcher-poc

Conversation

@snowp
Copy link
Contributor

@snowp snowp commented Jun 18, 2021

A proof of concept of how we might integrate the matching API with the HTTP route table. Here we accomplish this
by using the envoy::config::route::v3::Route as the match action, which allows configuring a route/direct response/redirect
route and most of the other route table features via the match API.

A few problems appeared while implementing this that will need to be addressed:

  1. There is a proto dependency between matcher.proto and route_components.proto, which means
    that we can't just define a match tree within the vhost, as this leads to a circular dependency. There is no actual circular dependency between the match tree and the route components, but it happens to live in the same package as the MatchPredicate which does have a dependency on the route components via HeaderMatcher. To avoid this problem for now this PR just uses
    an Any proto in the route table to avoid a hard coupling.
  2. The current Route class exposed to filters etc. exposes information about the route and the associated route entry. This would allow a filter to check what kind of match resulted in the route (prefix, exact, etc.), but this becomes nonsensical when using the matching tree.
  3. There is synergy in the current route table entries around rewriting the matched path: this also becomes nonsensical with the new matching tree as there could be multiple match conditions on the path.

Point 1) is a bit of a challenge since I'm not sure if we can break the proto cycle without moving protos to another package, which breaks API compatibility. Another idea would be to implement the match tree as a route entry extension in some form; this would also avoid this hard dependency.
Point 2) and 3) makes me wonder if we perhaps need to start from scratch with a new set of protos for matching via the tree, as otherwise we'll have a hard time support the API surface implied by the existing route protos. For the route information we expose to HTTP filters we might get by with returning null values for data not supported, sufficient to avoid nullptr derefs.

@mattklein123 fyi if you have any thoughts. No need to review this PR in depth, more curious about how to best resolve the API issues.

This sits on top of #17025, as this allows us to pass through the necessary bits from the route construction context to the action factories.

Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]
[Optional API Considerations:]

Snow Pettersen added 4 commits June 17, 2021 13:02
Prior to this PR the MatchTreeFactory requried a ServerFactoryContext in
order to pass this to the action factories. This is not available in all
contexts, and there are other possible use cases where we might need
to pass additional information to the match tree factory (e.g. the
router might need to pass the parent vhost configuration).

This PR introduces a paramterized action context, allowing each usage of
the MatchTree to define its own data that should be presented to the
action factories. This also has the nice benefit of partitioning the
action factories per context: each unique ActionFactoryContext defines a
new factory type, ensuring that actions defined for a HTTP filter
context won't be conflated with actions defined for another context.

Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
Signed-off-by: Snow Pettersen <snowp@lyft.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #17037 was opened by snowp.

see: more, trace.

@markdroth
Copy link
Contributor

  • There is a proto dependency between matcher.proto and route_components.proto, which means
    that we can't just define a match tree within the vhost, as this leads to a circular dependency. There is no actual circular dependency between the match tree and the route components, but it happens to live in the same package as the MatchPredicate which does have a dependency on the route components via HeaderMatcher. To avoid this problem for now this PR just uses
    an Any proto in the route table to avoid a hard coupling.

Can we just move the new matcher proto to its own file, since that is still alpha? It's only the legacy HTTP-specific matchers in that package that depend on route_components.proto. (I'd ideally like to see those legacy matchers go away as we migrate to the new matching tree, but I realize that that won't actually fix anything until the next time we a major version bump, which will be approximately never. :) )

  • The current Route class exposed to filters etc. exposes information about the route and the associated route entry. This would allow a filter to check what kind of match resulted in the route (prefix, exact, etc.), but this becomes nonsensical when using the matching tree.

Are there any filters that actually do that? This seems like it may be more of an accident of the way that the protos are structured than an intentional part of the API exposed to filters. Maybe it's okay to just stop allowing that.

  • There is synergy in the current route table entries around rewriting the matched path: this also becomes nonsensical with the new matching tree as there could be multiple match conditions on the path.

One possible way to solve this might be to plumb some state down through the matching tree that would be available to action plugins, sort of like the way that dynamic metadata is used in filters. That would allow passing this information down to an action that performs a header rewrite.

That having been said, it might just be cleaner to change the rewrite syntax to do its own matching, not associated with the matching that led to the route in the first place.

Point 2) and 3) makes me wonder if we perhaps need to start from scratch with a new set of protos for matching via the tree, as otherwise we'll have a hard time support the API surface implied by the existing route protos.

I think this is going to be a sufficiently invasive change that I would probably be okay with switching to a new set of protos if that will actually help us, but I'm not sure that it will actually buy us much.

At first glance, it looks like the only real structural change we need from the existing protos is that the Route proto would not include the match field when used as the action in the matching tree. It seems like we could handle that by changing the validation rules to make that field optional, although that would admittedly weaken the validation for the old API. If that's not acceptable, we can just create a copy of the Route proto that duplicates everything except the match field, and use that copy as the action in the matching tree.

Are there are other structural changes that would be needed here that I'm not seeing?

@snowp
Copy link
Contributor Author

snowp commented Jun 22, 2021

Can we just move the new matcher proto to its own file, since that is still alpha?

This would be the best thing I agree, but wasn't sure if the API compatibility rules would allow it given that its part of v3, not v3alpha. It's "alpha ness" is coming from a docs tag. If we can move it I'll be happy to do so

Are there any filters that actually do that?

I think the main offending function is pathMatchCriterion, which exposes what match requirement resulted in this match (added here: #2531). We could potentially extend this function to include support for the matching tree, but the tree matching structure doesn't really fit well into the std::string matcher format used by the API.

One possible way to solve this might be to plumb some state down through the matching tree that would be available to action plugins, sort of like the way that dynamic metadata is used in filters. That would allow passing this information down to an action that performs a header rewrite.
That having been said, it might just be cleaner to change the rewrite syntax to do its own matching, not associated with the matching that led to the route in the first place.

Yeah, the challenge is that the regex/prefix write does a transformation on the "matched value", which doesn't make as much sense when you can be matching on the same value multiple times. Reworking how these work with the match tree might be the best option, but there are some API challenges at that point with users potentially configuring these when using a Route with the match tree? Maybe we can just explicitly disallow this?

I'm not sure about any structural issues beyond this, but it's a bit hard to say for sure due to the large API surface of the route.

@markdroth
Copy link
Contributor

This would be the best thing I agree, but wasn't sure if the API compatibility rules would allow it given that its part of v3, not v3alpha. It's "alpha ness" is coming from a docs tag. If we can move it I'll be happy to do so

I thought I remembered that we had an exception carved out for the [#alpha:] tag, but it looks like we only do this for [#not-implemented-hide:...]:

https://github.com/envoyproxy/envoy/blob/main/api/API_VERSIONING.md#backwards-compatibility

@envoyproxy/api-shepherds, anyone object to also carving out an exception for [#alpha:] here?

I think the main offending function is pathMatchCriterion, which exposes what match requirement resulted in this match (added here: #2531). We could potentially extend this function to include support for the matching tree, but the tree matching structure doesn't really fit well into the std::string matcher format used by the API.

It might also be useful to know which filters actually call that function, and what the filters are doing with that information. If we better understand the actual use-cases, we may be able to come up with better alternatives.

Yeah, the challenge is that the regex/prefix write does a transformation on the "matched value", which doesn't make as much sense when you can be matching on the same value multiple times.

Good point. I suppose we could handle this by adding an extension point to allow plugging in a rewriter at each node in a tree. That way, we rewrite as we go through the match tree, each rewrite using the matched value from the matcher at that node. But this approach does seem really complicated; for example, if we wind up not matching at the bottom of the tree, we'd have to back out any rewrites we might have done on the way down the tree. It's probably not worth the complexity.

Reworking how these work with the match tree might be the best option, but there are some API challenges at that point with users potentially configuring these when using a Route with the match tree? Maybe we can just explicitly disallow this?

The clearest thing might be to just replace the current rewriting fields in RouteAction with a new set of fields that do their own matching, rather than relying on the result of the path match. We could just deprecate the existing fields and migrate to the new fields even when people don't use the match tree API. It's admittedly a little less efficient in that case, but if the long-term goal is to move people to the match tree anyway, that might be fine.

@htuch
Copy link
Member

htuch commented Jun 23, 2021

Yeah, you can do whatever you want with alpha, I think maybe update the docs as well.

@mattklein123
Copy link
Member

Thanks very exciting. Directionally this discussion sounds right to me. Per others I would suggest we start by moving the proto to some other place so we can break the dependency issue. I also agree that starting from scratch is probably not going to buy us much. I think targeted deprecation would probably make sense but perhaps we can take a look at that once we have a bit more fleshed out in terms of moving protos and some high level docs? (IMO actually doing either the RST archgdocs or a design doc might be best for this discussion.)

/wait

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Jul 29, 2021
@github-actions
Copy link

github-actions bot commented Aug 5, 2021

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot closed this Aug 5, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api stale stalebot believes this issue/PR has not been touched recently waiting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants