docs: update path front-proxy sandbox#599

Merged
junr03 merged 1 commit into master from
docs-front-sandbox
Mar 20, 2017

Conversation

@junr03
Member

@junr03 junr03 commented Mar 20, 2017

No description provided.

@RomanDzhabarov
Member

+1

@junr03 junr03 merged commit 490cddb into master Mar 20, 2017
@junr03 junr03 deleted the docs-front-sandbox branch March 20, 2017 22:45
mathetake added a commit that referenced this pull request Mar 3, 2026
**Commit Message**

This commit is a relatively large refactoring of internals to make Envoy
AI Gateway's API more aligned with Envoy Gateway's BackendTrafficPolicy
as well as HTTPRoute. Specifically, the main objective here is to allow
failover and retries to work well across multiple AIServiceBackends.

One of the most notable changes in this commit is that we split the
extproc's logic into two phases: one is executed at the normal router
level and selects a route (as opposed to selecting a backend, as it did
previously), and the other runs as an upstream filter that performs auth
and transformation. In other words, Envoy AI Gateway now configures two
external processing filters.

As a result, users are now able to configure failover as well as
retry/fallback using Envoy Gateway's BackendTrafficPolicy attached to
the HTTPRoute generated by Envoy AI Gateway. For example, this allows us
to support the case where the primary backend is Azure OpenAI and, when
it is failing, the AI Gateway falls back to AWS Bedrock using the
standard Envoy Gateway configuration.

**Background**
At the Envoy configuration level, Envoy Gateway translates multiple
backends in a single HTTPRoute rule into a single Envoy cluster whose
endpoints consist of multiple endpoint sets (called
`LocalityLbEndpoints` in the Envoy API [1]), where each set corresponds
to a Backend with a priority configured. For example, very roughly
speaking, the following pseudo HTTPRoute

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: provider-fallback
spec:
  rules:
  - backendRefs:
    - group: gateway.envoyproxy.io
      kind: Backend
      name: primary-backend
    - group: gateway.envoyproxy.io
      kind: Backend
      name: secondary-backend
    matches:
    - path:
        type: PathPrefix
        value: /
```

will be translated as follows when `secondary-backend` is marked as
`fallback: true` in its Backend definition ([2]):

```yaml
- cluster:
  '@type': type.googleapis.com/envoy.config.cluster.v3.Cluster
  loadAssignment:
    clusterName: httproute/default/provider-fallback/rule/0
    endpoints:
    - lbEndpoints:
      - endpoint:
          address:
            socketAddress:
              address: primary.com
              portValue: 443
      priority: 0
    - lbEndpoints:
      - endpoint:
          address:
            socketAddress:
              address: secondary.com
              portValue: 443
      priority: 1
```

where the priority is set to 0 for the primary backend and 1 for the
secondary backend. When retry or passive health checking is configured,
Envoy will retry against, or fall back to, the secondary backend.
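
As a rough illustration of the priority behavior described above, the
following Python sketch picks the healthy endpoint set with the lowest
priority number. It is a deliberately simplified model, not Envoy's
implementation: Envoy's real priority handling spreads load gradually
based on the share of healthy hosts, and the `EndpointSet` class and
`pick_endpoint_set` function are hypothetical names used only here.

```python
from dataclasses import dataclass


@dataclass
class EndpointSet:
    """Simplified stand-in for one LocalityLbEndpoints entry."""
    name: str
    priority: int
    healthy: bool = True


def pick_endpoint_set(sets):
    """Return the healthy set with the lowest priority number, or None."""
    candidates = [s for s in sets if s.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.priority)


sets = [
    EndpointSet("primary.com", priority=0),
    EndpointSet("secondary.com", priority=1),
]
print(pick_endpoint_set(sets).name)  # primary.com

# Pretend passive health checking has ejected the primary.
sets[0].healthy = False
print(pick_endpoint_set(sets).name)  # secondary.com
```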

In our API, transformation as well as upstream authentication must be
performed per Backend, so this logic must be inserted after the
endpoint set (or LocalityLbEndpoints, to be precise) is chosen by Envoy.
For example, primary.com and secondary.com might have different API
schemas, authentication, etc. Envoy has a specific HTTP filter chain,
called "upstream filters", that is executed at this stage. If we insert
an extproc that performs this logic there, we can properly do
authn/z and transformation on each retry attempt made natively by
Envoy.
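
To make the ordering concrete, here is a minimal Python sketch of why
the per-backend step has to run after backend selection and again on
every retry attempt. Everything in it is an assumption for
illustration: the `BACKENDS` table, the placeholder credentials, and
the `upstream_filter` and `send_with_retry` helpers are hypothetical
and do not mirror the actual extproc code.

```python
# Hypothetical per-backend settings; names and values are illustrative only.
BACKENDS = {
    "primary-backend": {
        "schema": "AzureOpenAI",
        "auth_header": ("api-key", "<azure-key>"),
    },
    "secondary-backend": {
        "schema": "AWSBedrock",
        "auth_header": ("authorization", "<sigv4-signature>"),
    },
}


def upstream_filter(request, backend_name):
    """Runs after a backend is chosen: adapt a copy of the request for it."""
    cfg = BACKENDS[backend_name]
    header, value = cfg["auth_header"]
    out = dict(request)
    out["headers"] = {**request["headers"], header: value}
    out["schema"] = cfg["schema"]
    return out


def send_with_retry(request, attempts, send):
    """Try each chosen backend in order, re-running the filter per attempt."""
    for backend_name in attempts:
        upstream_request = upstream_filter(request, backend_name)
        if send(upstream_request):
            return upstream_request
    return None


request = {"headers": {"content-type": "application/json"}, "schema": "OpenAI"}
result = send_with_retry(
    request,
    ["primary-backend", "secondary-backend"],
    send=lambda r: r["schema"] == "AWSBedrock",  # pretend the primary fails
)
print(result["schema"])  # AWSBedrock
```

Because the filter works on a copy of the original request for each
attempt, credentials for the failed primary never leak into the request
sent to the secondary.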

From the perspective of the upstream-filter-level external processor,
it needs to know exactly which backend was chosen by Envoy's cluster
load balancing logic. We add some additional metadata to the endpoint
with EG's extension server so that the extproc can retrieve this
information. We also use the extension server to insert the upstream
extproc filter since EG does not currently support it. This logic in
our extension server can be eliminated when the corresponding
functionality becomes available in EG ([3], [4]).
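
The metadata lookup might look roughly like the Python sketch below.
The commit message does not say which metadata namespace or key the
extension server writes, so `METADATA_NAMESPACE` and `BACKEND_NAME_KEY`
are placeholders; only the `filter_metadata` field name comes from
Envoy's metadata message.

```python
# Assumed namespace and key; the real names are not given in the commit
# message and are used here purely for illustration.
METADATA_NAMESPACE = "example.ai_gateway"
BACKEND_NAME_KEY = "backend_name"


def selected_backend(endpoint_metadata):
    """Return the backend name recorded on the chosen endpoint, or None."""
    filter_metadata = endpoint_metadata.get("filter_metadata", {})
    return filter_metadata.get(METADATA_NAMESPACE, {}).get(BACKEND_NAME_KEY)


chosen = {
    "filter_metadata": {
        METADATA_NAMESPACE: {BACKEND_NAME_KEY: "secondary-backend"},
    }
}
print(selected_backend(chosen))  # secondary-backend
print(selected_backend({}))      # None
```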

**Caveats**
* Due to a limitation of EG's extension server API, an AIServiceBackend
that references a k8s Service cannot be supported, so we have to drop
support for it. Since there is a workaround, this should be fine, and
because EG can be fixed easily, the version after the next release
should be able to restore the support.
* `aigw run` is temporarily disabled until [5] is resolved.
* Inference Extension support is temporarily disabled but will be
restored before the next release.

[1] https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/endpoint/v3/endpoint_components.proto
[2] https://gateway.envoyproxy.io/latest/api/extension_types/#backendspec
[3] envoyproxy/gateway#5523
[4] envoyproxy/gateway#5351
[5] envoyproxy/gateway#5918


**Related Issues/PRs (if applicable)**

Partially resolves the provider level fallbacks for #34

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
mathetake added a commit that referenced this pull request Mar 3, 2026
**Commit Message**

This deprecates the AIServiceBackend.Timeouts configuration, which has
not worked well with the refactored use of HTTPRoute since #599.
Instead, this adds `timeouts` to AIGatewayRouteRule to match the one in
HTTPRoute in GWAPI.

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
mathetake added a commit that referenced this pull request Mar 3, 2026
**Commit Message**

This fixes the `aigw run` command, which has been disabled since the
refactoring in #599. This requires a couple of bug fixes on the Envoy
Gateway side, so this commit includes an upgrade of the EG dependency.

**Related Issues/PRs (if applicable)**

* Closes #607
* Includes envoyproxy/gateway/pull/5984
* Includes envoyproxy/gateway/pull/6020

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
mathetake added a commit that referenced this pull request Mar 3, 2026
**Commit Message**

This commit refactors the internals of how the extproc is deployed.
Specifically, this switches to inserting the extproc container as a
sidecar container of the Envoy pods created by Envoy Gateway. This is
another large refactoring that turned out to be necessary for #599. It
uses a mutating webhook to insert the extproc container into Envoy
pods.

Making the extproc a sidecar means that we now have a one-to-one
mapping between a Gateway and the extproc; hence this naturally resolves
the previously known limitation #509, and users can now attach multiple
AIGatewayRoute(s) to one Gateway.

Implementation note: since volume mounts only work within a single
namespace, user-created secrets (like API keys) cannot be mounted by
the extproc, as it runs in the "envoy-gateway-system" namespace. To
resolve this, the controller now reads the secret and embeds the
credentials into the "extproc secret" (previously known as the
"extproc configmap") together with routing, matching and backend
information. That secret is written in the "envoy-gateway-system"
namespace; hence it can be mounted by the extproc container.
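
The bundling step can be pictured with the short Python sketch below.
It only shows the idea of inlining user credentials next to the routing
data in one payload written to a single namespace; the field names, the
`build_extproc_secret` helper, and the JSON layout are assumptions and
not the controller's actual format.

```python
import json


def build_extproc_secret(routes, backends, user_secrets):
    """Inline each backend's credential next to routing info in one payload.

    `user_secrets` maps a (namespace, name) pair to the credential the
    controller has already read from the user's own namespace.
    """
    bundled = []
    for backend in backends:
        entry = {"name": backend["name"], "schema": backend["schema"]}
        ref = backend.get("secret_ref")
        if ref is not None:
            entry["api_key"] = user_secrets[ref]
        bundled.append(entry)
    return {
        "namespace": "envoy-gateway-system",
        "data": json.dumps({"routes": routes, "backends": bundled}),
    }


secret = build_extproc_secret(
    routes=[{"name": "provider-fallback"}],
    backends=[
        {
            "name": "primary-backend",
            "schema": "AzureOpenAI",
            "secret_ref": ("default", "azure-api-key"),
        }
    ],
    user_secrets={("default", "azure-api-key"): "<azure-key>"},
)
print(secret["namespace"])  # envoy-gateway-system
```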

**Related Issues/PRs (if applicable)**

Resolves #509 
Resolves #621

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
mathetake added a commit that referenced this pull request Mar 3, 2026
**Description**

This commit removes the handwritten header matching code from the
extproc and instead starts using the hardened Envoy native router.

Historically, we had only one giant extproc filter where we did all
the logic, including model name extraction, routing, and then body
transformation & upstream authorization. Since #599, we have split it
into two external processor filters: one sits at the normal HTTP router
and the other is configured as the per-cluster upstream HTTP filter. In
theory, the one at the HTTP router has only one job on the request
path: extracting the model name from the request body. However, for
historical reasons, the handwritten router logic component remained,
and that comes with not only a maintenance cost (forcing a complex
extproc & control plane orchestration) but also a potential security
vulnerability. In fact, handwritten header matching logic can be an easy
attack surface, so where possible we should avoid writing our own
header matching (routing logic) and rely on the battle-tested, hardened
Envoy native router.

With this commit, regex matching is now available, and there is no
difference between HTTPRoute's matching and AIGatewayRoute's matching
implementation. This also opens up the possibility of supporting path
matching in our rules.
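
For illustration, the router-level processor's single remaining job
could be sketched in Python as below: read the model name from the
request body and expose it as a header, leaving all matching to Envoy's
native router. The header name `x-ai-eg-model` and the
`extract_model_header` function are assumptions made for this sketch,
not details taken from the commit.

```python
import json

MODEL_HEADER = "x-ai-eg-model"  # assumed header name, for illustration only


def extract_model_header(body: bytes):
    """Return the header to set so the native router can match on the model."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        raise ValueError("request body has no usable 'model' field")
    return {MODEL_HEADER: model}


body = b'{"model": "gpt-4o-mini", "messages": []}'
print(extract_model_header(body))  # {'x-ai-eg-model': 'gpt-4o-mini'}
```

Keeping this step free of any matching logic is the point: exact,
prefix, or regex comparison against the header is then handled entirely
by the route configuration.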

**Related Issues/PRs (if applicable)**

Ref #612 
Ref #73

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>

3 participants