Skip to content

Conversation

@dimpavloff
Copy link
Contributor

@dimpavloff dimpavloff commented Aug 22, 2025

Part two for grpc/proposal#492 (A97), following #8431 .

What this PR does is:

  • update internal/xds/bootstrap with support for loading multiple PerRPCCallCredentials specifed in a new call_creds field in the boostrap file as per A97
  • adjust xds/internal/xdsclient/clientimpl.goto use the call credentials when constructing the client
  • update xds/bootstrap to register the jwtcreds call credentials and make them available if GRPC_EXPERIMENTAL_XDS_BOOTSTRAP_CALL_CREDS is enabled

I have added DialOptionsWithCallCredsForTransport because, even though current and future call credentials are likely to all expect secure transport, I thought it would be safer to check of insecure transport just in case. If you prefer, I can just update DialOptions to use all call credentials regardless of the transport.

Relates to istio/istio#53532

RELEASE NOTES:

  • xds bootstrap: add support for loading a JWT from file and use it as Call Credentials (A97). This is guarded by GRPC_EXPERIMENTAL_XDS_BOOTSTRAP_CALL_CREDS

Copy link
Contributor Author

@dimpavloff dimpavloff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review @easwars ; I think it's ready for another pass

// CallCreds contains the call credentials configuration for individual RPCs.
// It implements gRFC A97 call credentials structure.
type CallCreds struct {
// Type contains a unique name identifying the call credentials type.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, you're right, it doesn't. I was influenced by the ChannelCreds.Type but there it makes more sense given that only one gets picked up. I've updated the comment

}

// CallCreds returns the call credentials configuration for this server.
func (sc *ServerConfig) CallCreds() []CallCreds {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to keep symmetry with ChannelCreds and SelectedChannelCreds. However, SelectedCallCreds now returns []credentials.PerRPCCredentials rather than the config, still, which breaks this symmetry.

Realistically, I'm not sure even if the public methods are needed -- the PerRPCCredentials are added to the ServerConfig.extraDialOptions and used via ServerConfig.DialOptions()

WDYT, shall I remove the public methods? I've renamed them in the meantime. I've used CallCredsConfigs, plural, because I also renamed the Go type to CallCredsConfig and the method returns a collection of those. I've also renamed the relevant fields in ServerConfig (and ServerConfigTestingOptions) in similar way.

// Skip unsupported call credential types (don't fail bootstrap).
continue
}
callCred, cancel, err := c.Build(callCredConfig.Config)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, yes, this did confuse me a bit when starting. Thanks for spotting this

Comment on lines 75 to 80
if err != nil {
t.Fatalf("NewCallCredentials failed: %v", err)
}
if callCreds == nil {
t.Fatal("Expected non-nil bundle")
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, much better, thanks!

@dimpavloff dimpavloff requested a review from easwars October 3, 2025 13:39
@dimpavloff dimpavloff removed their assignment Oct 3, 2025
@easwars easwars self-assigned this Oct 8, 2025
Comment on lines 265 to 269
originalJWTEnabled := envconfig.XDSBootstrapCallCredsEnabled
envconfig.XDSBootstrapCallCredsEnabled = true
defer func() {
envconfig.XDSBootstrapCallCredsEnabled = originalJWTEnabled
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a helper function for this:

func SetEnvConfig[T any](t *testing.T, variable *T, value T) {

Comment on lines +32 to +36
// NewCallCredentials returns a new JWT token based call credentials. The input
// config must match the structure specified in gRFC A97. The caller is expected
// to invoke the second return value when they are done using the returned call creds.
// The cancel function is idempotent.
func NewCallCredentials(configJSON json.RawMessage) (credentials.PerRPCCredentials, func(), error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding named return values to help with the documentation.

// NewCallCredentials creates and returns JWT token-based call credentials
// using the input config (which must match gRFC A97).
//
// The caller must invoke the returned cancel function (which is idempotent),
// when they are done using the returned call credentials.
func NewCallCredentials(configJSON json.RawMessage) (c credentials.PerRPCCredentials, cancel func(), err error) {

"google.golang.org/grpc/internal/grpctest"
)

const defaultCtxTimeout = 5 * time.Second
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/defaultCtxTimeout/defaultTestTimeout/

callCreds, cleanup, err := NewCallCredentials(json.RawMessage(tt.config))

if err == nil {
t.Fatal("NewCallCredentials: expected error, got nil")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider adding the config to the error message:
t.Fatal("NewCallCredentials(%s) succeeded when expected to fail", tt.config)

t.Fatal("NewCallCredentials: expected error, got nil")
}
if callCreds != nil {
t.Error("NewCallCredentials: Expected nil bundle to be returned")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: there is no bundle anymore.
t.Error("NewCallCredentials(%s) returned non-nil credentials", tt.config)

Comment on lines 93 to 97
if cleanup == nil {
t.Error("NewCallCredentials: Expected non-nil cleanup function to be returned")
} else {
defer cleanup()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nicer to always return a non-nil cancel func (even in error cases), to avoid such checks at callsites.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, updated. Although I think it might be good not to document it as then it might sound like the caller should call the cancel function even when an error is returned which would be an unnecessary ritual. OTOH, in this case, the test will be asserting implementation detail..
Or I can keep it as is. What do you prefer?

}

// Test that call credentials get used
ctx, cancel := context.WithTimeout(context.Background(), defaultCtxTimeout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need a 5s timeout here? And this const is only used here, you could remove the global and just inline the value here. Maybe a second at best should be good enough?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely don't need anywhere near 5 seconds.

func writeTempFile(t *testing.T, content string) string {
t.Helper()
tempDir := t.TempDir()
filePath := filepath.Join(tempDir, "tempfile")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably doesn't matter. But could be rename the file to something more appropriate. Maybe "jwt_token"?

if cc.Config == nil {
return cc.Type
}
// We do not expect the Marshal call to fail since we wrote to cc.Config
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Here and in a few other places, please terminate comment lines with periods. https://google.github.io/styleguide/go/decisions#comment-sentences

channelCreds []ChannelCreds
serverFeatures []string
serverURI string
channelCreds []ChannelCreds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are here, could you please add a TODO to rename ChannelCreds to ChannelCredsConfig to match what we have with the call creds? Thanks.

return cc.Type + "-" + string(b)
}

// CallCredsConfig contains the call credentials configuration for individual RPCs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to clarify in this docstring that these call creds are used when talking to the xDS management server. The docstring on the ChannelCreds type does that. So, what do you think about something like:

CallCredsConfig contains the call credentials configuration to be used on 
RPCs to the management server.

I think the part about A97 is not that important and could be skipped. Thanks.

case !slices.Equal(sc.serverFeatures, other.serverFeatures):
return false
case !sc.selectedCreds.Equal(other.selectedCreds):
case !sc.selectedChannelCreds.Equal(other.selectedChannelCreds):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is existing code, but I'm wondering if we can remove this field from the equality check for a couple of reasons:

  • we are already checking the channel credentials configuration (from which we selected this credentials) for equality
  • this type is supposed to semantically represent configuration corresponding to an xDS server and not any runtime state. We are already abusing it a little (or optimizing) by storing the selected credentials in it, so that we don't have to parse the configuration again when we actually want to use the credentials.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - because we're already comparing sc.channelCreds and other.channelCreds and because the config doesn't get dynamically updated, it makes sense.
Shall I just remove it in this PR or send a follow-up PR in case we're missing something?

}
features := strings.Join(sc.serverFeatures, "-")
return strings.Join([]string{sc.serverURI, sc.selectedCreds.String(), features}, "-")
return strings.Join([]string{sc.serverURI, sc.selectedChannelCreds.String(), features, sc.CallCredsConfigs().String()}, "-")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember you asking me if we want to include the channel creds here and I said yes. But thinking about this now, I'm wondering if the returned string is going to be too verbose, especially since the String() method of the call credentials config returns a JSON (as does the channel credentials). Have you see what this string actually looks like in any of your testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a bit verbose and why I was asking about it (I should have been clearer). The returned value looks like this xds-server:443-tls-{}-jwt_token_file-{"jwt_token_file":"/token.jwt"}

Comment on lines 239 to 244
// CallCreds returns the built call credentials that are ready to use.
// These are the credentials that were successfully built from the call_creds
// configuration.
func (sc *ServerConfig) CallCreds() []credentials.PerRPCCredentials {
return sc.selectedCallCreds
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this method at all (and the selectedCallCreds field)? We are currently storing the selected call credentials directly as dial options in the extraDialOptions field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't, other than for testing; I had it for symmetry with ChannelCreds.
I've assumed that you'd prefer that we delete it so I have done so and adjusted the unit tests a bit (in bootstrap_test.go and clientimpl_test.go). A downside is that DialOpts feels a bit distant/opaque when unit testing the call credentials config functionality.

Comment on lines 594 to 598
original := envconfig.XDSBootstrapCallCredsEnabled
envconfig.XDSBootstrapCallCredsEnabled = true
defer func() {
envconfig.XDSBootstrapCallCredsEnabled = original
}()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could use the helper function here as well testutils.SetEnvConfig.

Comment on lines +76 to +79
// Build returns a PerRPCCredentials created from the provided
// configuration, and a function to clean up any additional resources
// associated with them when they are no longer needed.
Build(config json.RawMessage) (credentials.PerRPCCredentials, func(), error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize that I added a comment about possibly using named return values for this method on the JWT call creds implementation. But it looks like we haven't done that for the existing channel credentials interface which has a similar set of return values. So, I'm fine with leaving this as is. Thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've kept it without for consistency.

builder Credentials
typename string
builder ChannelCredentials
minimumRequiredConfig json.RawMessage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this new field when it is always set to nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because I had the Channel and Call creds together initially. Removed.

if bundle == nil {
t.Errorf("%T.Build returned nil bundle, expected non-nil", test.builder)
}
stop()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we defer this call to stop right after the line where we call Build?

if bundle == nil {
t.Errorf("%T.Build returned nil bundle, expected non-nil", test.builder)
}
stop()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

}

func TestJwtCallCredentials_DisabledIfFeatureNotEnabled(t *testing.T) {
builder := GetCallCredentials("jwt_call_creds")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we explicitly set the value of the env var to false instead of running this test under the assumption that it is false (which it currently is). But if we explicitly set it here, then we wont have to change the test when we change the default value of the env var to true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that eventually the variable would just be removed and the behaviour – turned on. So my intention here is asserting that the feature is off by default rather than that it can be controlled by the env variable.
When the default changes, the test would then rightly fail this assertion and in this case it would make sense to just delete it.

Happy to adjust it if you still prefer.

@easwars easwars assigned dimpavloff and unassigned easwars Oct 10, 2025
@easwars
Copy link
Contributor

easwars commented Oct 10, 2025

@dimpavloff This PR is mostly looking good. Most of the comments in this pass are nits. We should be able to resolve them quickly and move on. Thank you very much!

@github-actions
Copy link

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions bot added the stale label Oct 19, 2025
Copy link
Contributor Author

@dimpavloff dimpavloff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback and apologies for the delay, @easwars , it's been a hectic week.
The PR is ready for a another look, PTAL.

Comment on lines 239 to 244
// CallCreds returns the built call credentials that are ready to use.
// These are the credentials that were successfully built from the call_creds
// configuration.
func (sc *ServerConfig) CallCreds() []credentials.PerRPCCredentials {
return sc.selectedCallCreds
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't, other than for testing; I had it for symmetry with ChannelCreds.
I've assumed that you'd prefer that we delete it so I have done so and adjusted the unit tests a bit (in bootstrap_test.go and clientimpl_test.go). A downside is that DialOpts feels a bit distant/opaque when unit testing the call credentials config functionality.

}
features := strings.Join(sc.serverFeatures, "-")
return strings.Join([]string{sc.serverURI, sc.selectedCreds.String(), features}, "-")
return strings.Join([]string{sc.serverURI, sc.selectedChannelCreds.String(), features, sc.CallCredsConfigs().String()}, "-")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's a bit verbose and why I was asking about it (I should have been clearer). The returned value looks like this xds-server:443-tls-{}-jwt_token_file-{"jwt_token_file":"/token.jwt"}

Comment on lines +76 to +79
// Build returns a PerRPCCredentials created from the provided
// configuration, and a function to clean up any additional resources
// associated with them when they are no longer needed.
Build(config json.RawMessage) (credentials.PerRPCCredentials, func(), error)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've kept it without for consistency.

}

// Test that call credentials get used
ctx, cancel := context.WithTimeout(context.Background(), defaultCtxTimeout)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely don't need anywhere near 5 seconds.

Comment on lines 93 to 97
if cleanup == nil {
t.Error("NewCallCredentials: Expected non-nil cleanup function to be returned")
} else {
defer cleanup()
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, updated. Although I think it might be good not to document it as then it might sound like the caller should call the cancel function even when an error is returned which would be an unnecessary ritual. OTOH, in this case, the test will be asserting implementation detail..
Or I can keep it as is. What do you prefer?

}

func TestJwtCallCredentials_DisabledIfFeatureNotEnabled(t *testing.T) {
builder := GetCallCredentials("jwt_call_creds")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that eventually the variable would just be removed and the behaviour – turned on. So my intention here is asserting that the feature is off by default rather than that it can be controlled by the env variable.
When the default changes, the test would then rightly fail this assertion and in this case it would make sense to just delete it.

Happy to adjust it if you still prefer.

builder Credentials
typename string
builder ChannelCredentials
minimumRequiredConfig json.RawMessage
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's because I had the Channel and Call creds together initially. Removed.

case !slices.Equal(sc.serverFeatures, other.serverFeatures):
return false
case !sc.selectedCreds.Equal(other.selectedCreds):
case !sc.selectedChannelCreds.Equal(other.selectedChannelCreds):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - because we're already comparing sc.channelCreds and other.channelCreds and because the config doesn't get dynamically updated, it makes sense.
Shall I just remove it in this PR or send a follow-up PR in case we're missing something?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: xDS Includes everything xDS related, including LB policies used with xDS. Status: Requires Reporter Clarification Type: Feature New features or improvements in behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants