Add streamable HTTP MCP handler with app auth config#62975
Add streamable HTTP MCP handler with app auth config#62975gabrielcorado wants to merge 14 commits intomasterfrom
Conversation
Is the ref'ed PR really the one about app auth config? I'm asking because I'm the author of that PR and I'm a bit confused 🙂 |
|
@tele-lion The correct PR is #62324. Sorry about that, GitHub auto-completed to a different PR I've updated the PR description. |
| for config, err := range clientutils.Resources(ctx, h.c.AccessPoint.ListAppAuthConfigs) { | ||
| if err != nil { | ||
| return nil, trace.Wrap(err, "unable to retrieve app auth configs") | ||
| } | ||
| convertedLabels := make(types.Labels) | ||
| for k, vs := range label.ToMap(config.Spec.AppLabels) { | ||
| convertedLabels[k] = apiutils.Strings(vs) | ||
| } | ||
|
|
||
| matched, message, err := services.MatchLabelGetter(convertedLabels, app) |
There was a problem hiding this comment.
Have we considered any alternative APIs to retrieve an AppAuthConfig matching a specific App? It seems suboptimal to have to walk all AppAuthConfigs and match each one by labels just to find the one we are looking for.
There was a problem hiding this comment.
Do you mean an alternative to label matching, or a more efficient way to perform it?
Initially, I've placed a short TTL cache on this as well, so that if it receives multiple subsequent requests in a short period, they all use the same app auth config.
I've removed it because I encountered issues during testing when changing the app auth config while requests were ongoing (such as with active MCP sessions). In this situation, it could cause disruptions (for example, if the JWT audience changed or the app auth config became invalid). However, if we see this as valuable, we can consider a better cache expiration strategy.
There was a problem hiding this comment.
No I didn't mean caching the results here. I was more curious of ListAppAuthConfigs followed by label matching was the best way to find a single app that matched app auth configs.
There was a problem hiding this comment.
The original idea is to limit app auth config to a small subset of your apps; that's why we're testing the requested app to see if it can use any app auth config. Worth noting that on the original use-case, the clusters should have a small set of app auth configs configured (usually one per SSO provider on the cluster).
After speaking with Steve, we came up with a solution that would avoid doing this matching (and app auth config listing) at every request and leave it only when starting a new session. Currently, we need to load the config (and perform the app matching) to retrieve the header name configuration (so we can get the JWT token). If we drop this option and always use the default value (Authorization), we don't need to load the configs. The cost of this change would be making the feature less flexible.
If we think this might become a bottleneck, implementing this change should be simple. Let me know your thoughts.
There was a problem hiding this comment.
The original idea is to limit app auth config to a small subset of your apps
Can we assume this will always hold? Should we design for a world where this is no longer the case now?
we came up with a solution that would avoid doing this matching (and app auth config listing) at every request and leave it only when starting a new session
Loading a single time would be better than on every request.
There was a problem hiding this comment.
Each app_auth_config is basically a new SSO connector but just for apps instead of users. I wouldn't imagine there is a world with many of them. That said, we do handle this well for SSO connectors by having a default connector in auth config or for user to explicitly select a connector name.
We could replicate the part for caller to provide a connector name here. That will require significant redesign and will not have the best UX as we have to embed extra info into the token.
What Gabriel and I have discussed is to drop the custom auth header so we always target Authorization. This way we can compute the session ID without the need to grab app_auth_config for existing sessions (so we only grab app_auth_config for new sessions). I like this because it also reduces number of knobs.
An alternative is to use a watcher similar to what Tyler did for database servers where the watcher serves as a cache we can easily loop through:
#63878
@rosstimothy what do you think about the watcher approach?
There was a problem hiding this comment.
What Gabriel and I have discussed is to drop the custom auth header so we always target Authorization. This way we can compute the session ID without the need to grab app_auth_config for existing sessions (so we only grab app_auth_config for new sessions). I like this because it also reduces number of knobs.
This sounds reasonable to me. Is there any reason we didn't pursue this to begin with? What are the downsides to this approach?
There was a problem hiding this comment.
We wanted to provide a custom solution that allows them to use their own set of header names to forward credentials (JWT tokens). Those are not a requirement but play well when you're dealing with multiple tokens, for example, when your app receives an Authorization header, after doing a token exchange, you could use a different naming to avoid confusion, like Teleport-Authorization-JWT. But it turned out those are not hard requirements and we can drop this customization,
There was a problem hiding this comment.
I've updated the implementation to avoid fetching and matching the app auth config for every request. I'll drop the authorization header property in a separate PR as it will require Terraform and other docs changes.
There was a problem hiding this comment.
PR removing the option from config and service: #64350
| func (h *Handler) getSessionWithAppAuth(ctx context.Context, ws types.WebSession, appAuthConfig *appauthconfigv1.AppAuthConfig) (*sessionWithAppAuth, error) { | ||
| // Put the session in the cache so the next request can use it. | ||
| ttl := ws.Expiry().Sub(h.c.Clock.Now()) | ||
| sess, err := utils.FnCacheGetWithTTL(ctx, h.cache, ws.GetName(), ttl, func(ctx context.Context) (*sessionWithAppAuth, error) { |
There was a problem hiding this comment.
Is there any chance that this key conflicts with a traditional session? Should we prefix the key to better differentiate?
There was a problem hiding this comment.
You brought an important point. Since we're using the WebSession name, there shouldn't be any session conflicts, and if there is, FnCacheGetWithTTL would return an error due to a type mismatch (which would prevent using the wrong session type).
However, since the app auth config sessions use the CreateAppSessionForAppAuth to create sessions, we have two ways of defining the WebSession name (ID):
- App auth with JWT config:
sha256the JWT token contents. - Regular sessions, which use a "crypto-strong pseudo-random generator".
There is a chance of a session ID conflict (though it seems very low), and if it happens, the first session would be overwritten (since we use Put).
@rosstimothy @greedy52 Do you think we should handle this case in CreateAppSessionForAppAuth and return an early error when there is a conflict?
There was a problem hiding this comment.
We should avoid conflicts if we can.
Do you think we should handle this case in CreateAppSessionForAppAuth and return an early error when there is a conflict
What kind of UX does this result in? What are end users expected to do to resolve this error? Can we prefix app auth sessions to eliminate the conflict entirely? Prefixing regular sessions is probably not feasible due to backward compatibility.
There was a problem hiding this comment.
+1 for prefix the app auth sessions then the length will be different
There was a problem hiding this comment.
I'll do this in a separate PR (opening it soon) so we can have a proper (and easier) review, and also better manual testing coverage. I'll update this comment once I have the PR up.
|
Could you please update the PR description with a manual test plan. |
| // Results and errors are sent by the handler. | ||
| return nil, nil |
There was a problem hiding this comment.
If we never return an error will the limiter ever kick in?
There was a problem hiding this comment.
It will. The limiter we're using, WithUnauthenticatedLimiter, doesn't care about the handler's return value, since it only relies on the request. Other handlers like getWebConfig, which uses this same handler, also don't use the handler return for sending results.
I also did manual testing on this and confirmed the rate limiter is being applied and working as expected.
| return nil, trace.BadParameter("unable to resolve requested app by name") | ||
| } | ||
|
|
||
| return servers[0], nil |
There was a problem hiding this comment.
Should we randomize the app server instead of returning the first?
There was a problem hiding this comment.
It won't make much difference here, since this app server is only used to retrieve the app configuration (which is later used to retrieve its labels). This follows the same behavior as existent flow with ResolveFQDN.
The HA flow will still construct an appropriate list of healthy app servers that can serve requests. There, we do the shuffling.
teleport/lib/web/app/session.go
Lines 76 to 90 in 3cc7f12
| for config, err := range clientutils.Resources(ctx, h.c.AccessPoint.ListAppAuthConfigs) { | ||
| if err != nil { | ||
| return nil, trace.Wrap(err, "unable to retrieve app auth configs") | ||
| } | ||
| convertedLabels := make(types.Labels) | ||
| for k, vs := range label.ToMap(config.Spec.AppLabels) { | ||
| convertedLabels[k] = apiutils.Strings(vs) | ||
| } | ||
|
|
||
| matched, message, err := services.MatchLabelGetter(convertedLabels, app) |
There was a problem hiding this comment.
The original idea is to limit app auth config to a small subset of your apps
Can we assume this will always hold? Should we design for a world where this is no longer the case now?
we came up with a solution that would avoid doing this matching (and app auth config listing) at every request and leave it only when starting a new session
Loading a single time would be better than on every request.
| for config, err := range clientutils.Resources(ctx, h.c.AccessPoint.ListAppAuthConfigs) { | ||
| if err != nil { | ||
| return nil, trace.Wrap(err, "unable to retrieve app auth configs") | ||
| } | ||
| convertedLabels := make(types.Labels) | ||
| for k, vs := range label.ToMap(config.Spec.AppLabels) { | ||
| convertedLabels[k] = apiutils.Strings(vs) | ||
| } | ||
|
|
||
| matched, message, err := services.MatchLabelGetter(convertedLabels, app) |
There was a problem hiding this comment.
Each app_auth_config is basically a new SSO connector but just for apps instead of users. I wouldn't imagine there is a world with many of them. That said, we do handle this well for SSO connectors by having a default connector in auth config or for user to explicitly select a connector name.
We could replicate the part for caller to provide a connector name here. That will require significant redesign and will not have the best UX as we have to embed extra info into the token.
What Gabriel and I have discussed is to drop the custom auth header so we always target Authorization. This way we can compute the session ID without the need to grab app_auth_config for existing sessions (so we only grab app_auth_config for new sessions). I like this because it also reduces number of knobs.
An alternative is to use a watcher similar to what Tyler did for database servers where the watcher serves as a cache we can easily loop through:
#63878
@rosstimothy what do you think about the watcher approach?
| func (h *Handler) getSessionWithAppAuth(ctx context.Context, ws types.WebSession, appAuthConfig *appauthconfigv1.AppAuthConfig) (*sessionWithAppAuth, error) { | ||
| // Put the session in the cache so the next request can use it. | ||
| ttl := ws.Expiry().Sub(h.c.Clock.Now()) | ||
| sess, err := utils.FnCacheGetWithTTL(ctx, h.cache, ws.GetName(), ttl, func(ctx context.Context) (*sessionWithAppAuth, error) { |
There was a problem hiding this comment.
+1 for prefix the app auth sessions then the length will be different
Related to RFD 0030e.
Adds the web API endpoints that will accept/handle streamable HTTP MCP connections with app authentication (managed/done by appauthconfig service, introduced at #62324).
Manual testing
tsh app login)tsh mcp connect).tsh mcp connect).Setup
To test the app auth config JWT, we need a JWT token issuer with JSON Web Key Set (JWKS). For this, I used Teleport itself.
First, start your Streamble MCP server. We're using
mcp-everything:Second, configure the
app_service:With this running, we can grab the dumper address and JWT that we'll use for accessing the MCP:
Now, set up the app auth config:
With everything setup you can now use the MCP inspector to connect to the server.
https://your-proxy-addr/mcp/apps/mcp-everythingaddress.Authorizationheader withBearer xxxvalue using the JWT response from the dumper request.