Bazel 5.2 Google Cloud's Workload identity federation auth seems broken #15639

Closed
bazaglia opened this issue Jun 8, 2022 · 17 comments
Labels
team-Remote-Exec (Issues and PRs for the Execution (Remote) team), type: bug

Comments

@bazaglia

bazaglia commented Jun 8, 2022

Description of the bug:

Bazel 5.2 upgraded the Google Auth library to a version that supports Workload Identity Federation, which is useful for keyless authentication from CI pipelines. This can be verified in #15383. However, when I provide the credentials file through the --google_credentials flag:

bazel build //... \
  --remote_cache <cache-url> \
  --google_credentials=${{ steps.auth.outputs.credentials_file_path }}

Bazel just throws an error:

Caused by: java.lang.IllegalArgumentException: Can not set java.util.List field com.google.api.client.http.HttpHeaders.authorization to java.lang.String
	at java.base/jdk.internal.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(Unknown Source)
	at java.base/jdk.internal.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(Unknown Source)
	at java.base/jdk.internal.reflect.UnsafeObjectFieldAccessorImpl.set(Unknown Source)
	at java.base/java.lang.reflect.Field.set(Unknown Source)
	at com.google.api.client.util.FieldInfo.setFieldValue(FieldInfo.java:245)
	at com.google.api.client.util.FieldInfo.setValue(FieldInfo.java:206)
	at com.google.api.client.util.GenericData.set(GenericData.java:125)
	at com.google.api.client.http.HttpHeaders.set(HttpHeaders.java:175)
	at com.google.api.client.http.HttpHeaders.set(HttpHeaders.java:58)
	at com.google.api.client.util.GenericData.putAll(GenericData.java:138)
	at com.google.auth.oauth2.IdentityPoolCredentials.getSubjectTokenFromMetadataServer(IdentityPoolCredentials.java:233)
	at com.google.auth.oauth2.IdentityPoolCredentials.retrieveSubjectToken(IdentityPoolCredentials.java:188)
	at com.google.auth.oauth2.IdentityPoolCredentials.refreshAccessToken(IdentityPoolCredentials.java:169)
	at com.google.auth.oauth2.OAuth2Credentials$1.call(OAuth2Credentials.java:257)
	at com.google.auth.oauth2.OAuth2Credentials$1.call(OAuth2Credentials.java:254)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
	at com.google.auth.oauth2.OAuth2Credentials$AsyncRefreshResult.executeIfNew(OAuth2Credentials.java:580)
	at com.google.auth.oauth2.OAuth2Credentials.asyncFetch(OAuth2Credentials.java:220)
	at com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:170)
	at com.google.auth.oauth2.ExternalAccountCredentials.getRequestMetadata(ExternalAccountCredentials.java:292)
	at com.google.devtools.build.lib.remote.http.AbstractHttpHandler.addCredentialHeaders(AbstractHttpHandler.java:73)
	at com.google.devtools.build.lib.remote.http.HttpDownloadHandler.write(HttpDownloadHandler.java:141)
	at io.netty.channel.AbstractChannelHandlerContext.invokeWrite0(AbstractChannelHandlerContext.java:717)
	at io.netty.channel.AbstractChannelHandlerContext.invokeWriteAndFlush(AbstractChannelHandlerContext.java:764)
	at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:790)
	at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:758)
	at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:808)
	at io.netty.channel.DefaultChannelPipeline.writeAndFlush(DefaultChannelPipeline.java:1025)
	at io.netty.channel.AbstractChannel.writeAndFlush(AbstractChannel.java:306)
	at com.google.devtools.build.lib.remote.http.HttpCacheClient.lambda$get$6(HttpCacheClient.java:496)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:578)
...

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

bazel build //... \
  --remote_cache <cache-url> \
  --google_credentials=${{ steps.auth.outputs.credentials_file_path }}

Which operating system are you running Bazel on?

Linux on GitHub Actions

What is the output of bazel info release?

5.2.0

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

No response

Have you found anything relevant by searching the web?

#14278

Any other information, logs, or outputs that you want to share?

No response

@sgowroji added the type: bug, untriaged, and team-Remote-Exec (Issues and PRs for the Execution (Remote) team) labels on Jun 8, 2022
@bazaglia
Author

bazaglia commented Jun 8, 2022

@coeuvre I see you made the cherry-pick adding the related PR to Bazel 5.2 in the first place. Maybe you have a clue about what is wrong?

@tjgq
Contributor

tjgq commented Aug 9, 2022

@bazaglia I'd like to look into this, but reproducing it seems to be quite involved. Do you have a repro that does not require setting up a GitHub action? I wonder if a fake credentials file (i.e., with sensitive data replaced by random strings) is sufficient to trigger the issue.
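
For anyone who wants to try the fake-credentials idea, here is a minimal sketch that writes a placeholder credentials file. It assumes the "external_account" JSON layout that Google documents for workload identity federation with a URL-based credential source. The class name, pool and provider names, token URL, and bearer value are all made up, and nothing in this thread confirms that such a file is enough to trigger the crash.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class FakeWifCredentials {
    /** Builds a placeholder external_account config; every value below is fake. */
    public static String json() {
        return String.join("\n",
            "{",
            "  \"type\": \"external_account\",",
            "  \"audience\": \"//iam.googleapis.com/projects/000000000000/locations/global/workloadIdentityPools/fake-pool/providers/fake-provider\",",
            "  \"subject_token_type\": \"urn:ietf:params:oauth:token-type:jwt\",",
            "  \"token_url\": \"https://sts.googleapis.com/v1/token\",",
            "  \"credential_source\": {",
            // Assumption: a URL source with an Authorization header, as in the GitHub Actions setup.
            "    \"url\": \"https://example.invalid/id-token\",",
            "    \"headers\": { \"Authorization\": \"Bearer fake-request-token\" },",
            "    \"format\": { \"type\": \"json\", \"subject_token_field_name\": \"value\" }",
            "  }",
            "}");
    }

    /** Writes the placeholder config to a temporary file and returns its path. */
    public static Path write() throws IOException {
        Path file = Files.createTempFile("fake-wif-credentials", ".json");
        Files.write(file, json().getBytes(StandardCharsets.UTF_8));
        return file;
    }

    public static void main(String[] args) throws IOException {
        // Prints the path, which could then be passed to --google_credentials.
        System.out.println(write());
    }
}
```

The printed path can be passed to Bazel via --google_credentials in place of the file generated by the GitHub action.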

@russellhaering

@tjgq if you're interested I can set you up a GH repository to reproduce this pretty easily.

@kylekurz

kylekurz commented Aug 9, 2022

I was also able to reproduce this today on my production repo. I believe my instructions from #14278 will still reproduce it with minimal effort.

@tjgq
Contributor

tjgq commented Aug 10, 2022

The difficult part for me isn't setting up the GitHub repository; it's configuring the GCP workload identity provider. The google.com GCP org policy forbids me from using https://token.actions.githubusercontent.com as the issuer URI. I'd probably need to set up a separate GCP org, but that's going to require a lot more steps that I'm not familiar with.

I do have a working theory, though: in #15176 we upgraded google-auth-library-oauth2-http to 1.6.0, but its dependencies google-http-client and google-http-client-gson were kept at 1.22.0. According to Maven, the required version is 1.41.4 (which in turn requires an additional dependency on opencensus-contrib-http-util 0.31.0). This is consistent with the stack trace above, which fails inside google-http-client's HttpHeaders.
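
One generic way to check a theory like this is to ask the JVM which jar a class was actually loaded from. The sketch below uses only standard reflection and is not part of Bazel; the class name JarOrigin is made up, and whether the google-http-client jar declares an Implementation-Version in its manifest is an assumption, so the version may come back as "not declared".

```java
import java.security.CodeSource;

public class JarOrigin {
    /** Describes where a class was loaded from and the version its jar manifest declares, if any. */
    public static String describe(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes from the JDK itself typically have no code source.
        String location = (source == null || source.getLocation() == null)
                ? "<bootstrap or unknown>"
                : source.getLocation().toString();
        Package pkg = type.getPackage();
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return type.getName() + " from " + location
                + " (version: " + (version == null ? "not declared" : version) + ")";
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a class name such as com.google.api.client.http.HttpHeaders
        // when that jar is on the classpath; defaults to a JDK class otherwise.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(describe(Class.forName(name)));
    }
}
```

Run with the jars in question on the classpath, this shows whether the older or the newer google-http-client is the one being picked up.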

@kylekurz Are you able to build Bazel with PR #16082 and let me know if you can still repro?

@kylekurz

@tjgq I will give this a shot. Might not get to it until tomorrow though.

@kylekurz

kylekurz commented Aug 16, 2022

@tjgq sorry for the delay, been fighting migraines for a week. It doesn't look like that branch fixes this:

	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:646)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:382)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: java.lang.IllegalArgumentException: Can not set java.util.List field com.google.api.client.http.HttpHeaders.authorization to java.lang.String
	at java.base/jdk.internal.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(Unknown Source)
	at java.base/jdk.internal.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(Unknown Source)
	at java.base/jdk.internal.reflect.UnsafeObjectFieldAccessorImpl.set(Unknown Source)
	at java.base/java.lang.reflect.Field.set(Unknown Source)

Let me know if I can get you more information. I built off the tip of your branch this morning.

EDIT: I did do a second run and dumped the credentials file, so I'm not passing a broken path to Bazel.

tjgq added a commit to tjgq/google-auth-library-java that referenced this issue Aug 19, 2022
HttpHeaders.putAll uses reflective access. Well-known headers such as
Content-Type or Authorization have dedicated fields of type List<String>,
while remaining headers go into a Map<String, Object> grab bag. The
IdentityPoolCredentials#getSubjectTokenFromMetadataServer method attempts
to set every header to a String, which causes a crash for well-known headers.

See bazelbuild/bazel#15639 for where this issue
was first noticed.
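
The failure mode described in this commit message can be sketched with plain reflection, without any Google libraries. FakeHeaders below is a hypothetical stand-in for a headers class with a dedicated List<String> field; it is not the real HttpHeaders, and the sketch only illustrates the kind of exception seen at the top of the stack trace, not the root cause.

```java
import java.lang.reflect.Field;
import java.util.Collections;
import java.util.List;

public class HeaderFieldDemo {
    // Hypothetical stand-in: a well-known header stored in a typed field.
    static class FakeHeaders {
        public List<String> authorization;
    }

    /** Reflectively assigns value to the field; returns the error message, or null on success. */
    public static String trySet(Object value) {
        FakeHeaders headers = new FakeHeaders();
        try {
            Field field = FakeHeaders.class.getDeclaredField("authorization");
            field.set(headers, value);
            return null;
        } catch (IllegalArgumentException e) {
            // Thrown when the value's runtime type does not match the field type.
            return e.getMessage();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A bare String is rejected because the field is a List<String>.
        System.out.println(trySet("Bearer token"));
        // Wrapping the value in a list succeeds, so this prints null.
        System.out.println(trySet(Collections.singletonList("Bearer token")));
    }
}
```

The first call fails with an IllegalArgumentException of the same kind as the one in the report, while the second succeeds because the value matches the field type.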
@tjgq
Contributor

tjgq commented Aug 19, 2022

I was able to repro this today. It looks like there's a bug in the google-auth-library-oauth2-http library. I've sent googleapis/google-auth-library-java#984 to fix it.

tjgq added a commit to tjgq/google-auth-library-java that referenced this issue Aug 19, 2022
IdentityPoolCredentials#getSubjectTokenFromMetadataServer calls
HttpHeaders.putAll to set request headers. The latter sets its fields through
reflective access: well-known headers such as Content-Type or Authorization
have dedicated fields of type List<String>, while remaining headers go into a
Map<String, Object> grab bag. However, we attempt to set every header to a
String, which causes a crash for well-known headers.

See bazelbuild/bazel#15639 for where this issue
was first noticed.
tjgq added a commit to tjgq/google-auth-library-java that referenced this issue Aug 25, 2022
IdentityPoolCredentials#getSubjectTokenFromMetadataServer calls
HttpHeaders.putAll to set request headers. The latter sets its fields through
reflective access: well-known headers such as Content-Type or Authorization
have dedicated fields of type List<String>, while remaining headers go into a
Map<String, Object> grab bag. However, we attempt to set every header to a
String, which causes a crash for well-known headers.

See bazelbuild/bazel#15639 for where this issue
was first noticed.
@tjgq
Contributor

tjgq commented Aug 26, 2022

I'm no longer convinced there's a bug in google-auth-library-oauth2-http. The test case I added in googleapis/google-auth-library-java#984 passes even without the fix (as the maintainer pointed out).

I'm fairly sure PR #16082 was the right fix all along. I've just managed to run a GitHub action successfully with WIF using a Bazel built at that PR.

tjgq added a commit to tjgq/bazel that referenced this issue Aug 26, 2022
In bazelbuild#15176 we upgraded google-auth-library-oauth2-http to 1.6.0, but didn't
upgrade its dependencies accordingly; Maven claims 1.41.4 is needed [1].
In turn, a new transitive dependency on opencensus-contrib-http-util 0.31.0
also becomes necessary [2].

Fixes bazelbuild#15639.

[1] https://mvnrepository.com/artifact/com.google.auth/google-auth-library-oauth2-http/1.6.0
[2] https://mvnrepository.com/artifact/com.google.http-client/google-http-client/1.41.4
@kylekurz

@tjgq does that mean my build of your branch was wrong? I definitely didn't get a successful WIF run using that, but I can try again if you'd like.

@tjgq
Contributor

tjgq commented Aug 26, 2022

How exactly are you building and running Bazel? In particular, how does the built Bazel make it into the GitHub action execution environment?

@kylekurz

I have a GHA runner that I manage in GCP so I can have a local cache for some runs. I built the binary on that machine and called it directly from there instead of using the bazelisk wrapper.

@tjgq
Contributor

tjgq commented Aug 26, 2022

Ok, so here's how I verified that it works for me:

I've also confirmed that I get the reported crash if I check in a Bazel binary built without the changes in my PR.

One thing you might want to try is to grab the credentials JSON file and run the Bazel binary locally (to take some complexity out of the equation). I'm not sure that these credentials can be reused across build requests, but you should at least get Bazel to report a different error (I got something like a 401 Unauthorized when I tried).

@jbms

jbms commented Sep 1, 2022

Is this going to be included in a release soon?

@tjgq
Contributor

tjgq commented Sep 2, 2022

It will definitely be included in 6.0, but I'm reluctant to backport it to 5.3.1. There's a lot of complexity in the interaction between Bazel and the OAuth2 support libraries, and we could very easily introduce other bugs.

@kylekurz

@tjgq I'm still not entirely sure what I did wrong building your branch, but I agree that your fix works. I took the binary from your test repo, put it on my CI machine, and ran a job with it; it worked perfectly. Thanks for your research here. I will be watching for when this lands in a released version of Bazel!

aiuto pushed a commit to aiuto/bazel that referenced this issue Oct 12, 2022
In bazelbuild#15176 we upgraded google-auth-library-oauth2-http to 1.6.0, but didn't
upgrade its dependencies accordingly; Maven claims 1.41.4 is needed [1].
In turn, a new transitive dependency on opencensus-contrib-http-util 0.31.0
also becomes necessary [2].

Fixes bazelbuild#15639.

[1] https://mvnrepository.com/artifact/com.google.auth/google-auth-library-oauth2-http/1.6.0
[2] https://mvnrepository.com/artifact/com.google.http-client/google-http-client/1.41.4

Partial commit for third_party/*, see bazelbuild#16082.

Signed-off-by: Sunil Gowroji <[email protected]>
tjgq added a commit to tjgq/bazel that referenced this issue Nov 11, 2022
In bazelbuild#15176 we upgraded google-auth-library-oauth2-http to 1.6.0, but didn't
upgrade its dependencies accordingly; Maven claims 1.41.4 is needed [1].
In turn, a new transitive dependency on opencensus-contrib-http-util 0.31.0
also becomes necessary [2].

Fixes bazelbuild#15639.

[1] https://mvnrepository.com/artifact/com.google.auth/google-auth-library-oauth2-http/1.6.0
[2] https://mvnrepository.com/artifact/com.google.http-client/google-http-client/1.41.4

Partial commit for third_party/*, see bazelbuild#16082.

Signed-off-by: Sunil Gowroji <[email protected]>
@tjgq
Contributor

tjgq commented Nov 11, 2022

FYI, I'm going to backport this into 5.4.0 because I got a report of another user running into an issue related to this.

ShreeM01 pushed a commit that referenced this issue Nov 15, 2022
In #15176 we upgraded google-auth-library-oauth2-http to 1.6.0, but didn't
upgrade its dependencies accordingly; Maven claims 1.41.4 is needed [1].
In turn, a new transitive dependency on opencensus-contrib-http-util 0.31.0
also becomes necessary [2].

Fixes #15639.

[1] https://mvnrepository.com/artifact/com.google.auth/google-auth-library-oauth2-http/1.6.0
[2] https://mvnrepository.com/artifact/com.google.http-client/google-http-client/1.41.4

Partial commit for third_party/*, see #16082.

Signed-off-by: Sunil Gowroji <[email protected]>

7 participants