AWS SDK seems to have failed to poison S3 connections after today's outage #827

Closed
benesch opened this issue Jun 14, 2023 · 7 comments
Labels: bug (this issue is a bug), needs-reproduction (this issue needs reproduction), p2 (standard priority)

benesch commented Jun 14, 2023

Describe the bug

At @MaterializeInc we run a number of clusterd processes that continually read from and write to S3. (We make a streaming database that's in the business of reading data from S3, transforming it, and writing it back to S3.)

During today's AWS outage, these clusterd processes all experienced an outage. Almost all of the clusterd processes recovered, except for two that continually produced error messages like the following:

clusterd no loader was set :-/
clusterd {"timestamp":"2023-06-14T02:21:42.407205Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.407274Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd no loader was set :-/
clusterd no loader was set :-/
clusterd {"timestamp":"2023-06-14T02:21:42.407345Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.407427Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.409267Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-14T02:21:42.409623Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 16s: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.413165Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-14T02:21:42.413439Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 16s: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.416520Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-14T02:21:42.416801Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 16s: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.473353Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-14T02:21:42.473688Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 16s: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd no loader was set :-/
clusterd {"timestamp":"2023-06-14T02:21:42.475366Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.475470Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd no loader was set :-/
clusterd no loader was set :-/
clusterd {"timestamp":"2023-06-14T02:21:42.475532Z","level":"INFO","fields":{"message":"external operation rollup::set failed, retrying in 16s: indeterminate: request has timed out"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.476895Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-14T02:21:42.477315Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 16s: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-14T02:21:42.480570Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-14T02:21:42.480894Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 16s: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}

AIUI, rollup::set is attempting to write to S3, while s3 get meta is attempting to read from S3. The trailing error messages ("request has timed out" and "failed to construct request") come straight from the AWS SDK, AFAICT.
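For context on the "retrying in 16s" part of those messages: the logs in this thread show retry delays of 32ms, 64ms, and 16s, which is consistent with a doubling backoff that is capped. The sketch below is only an illustration of that pattern using the standard library. The function name, the 32ms base, and the 16s cap are assumptions read off the log lines, not the actual mz_persist_client retry code.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before retry number `attempt` (0-based).
/// The delay doubles from `base` on each attempt and never exceeds `cap`.
fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    // 2^attempt, saturating so that very large attempt counts cannot overflow.
    let factor = 2u32.saturating_pow(attempt);
    // Scale the base delay, again saturating, then clamp to the cap.
    base.saturating_mul(factor).min(cap)
}
```

With an assumed base of 32ms and cap of 16s, attempts 0 and 1 wait 32ms and 64ms, and every attempt from the tenth onward waits the full 16s. A process that logs "retrying in 16s" over and over has therefore been failing for many attempts in a row.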

Expected Behavior

We expected our S3 reads/writes to eventually succeed on retry.

Current Behavior

The S3 reads/writes continued failing forever.

We validated that the two affected containers did in fact have access to S3—logging in to the containers and manually issuing S3 requests was successful at the time shown in the S3 logs.

Reproduction Steps

So sorry, but we have no reproduction of this. We run chaos tests in our CI that interrupt network connections but we've never seen anything like this. I think it's unlikely we'll see this again until another wide AWS outage.

I wonder if it's something specific about the way the AWS outage interrupted the IAM connections. We don't test that kind of chaos extensively in our CI.

Possible Solution

I wanted to file this issue because of this println that we're seeing in the output:

https://github.com/awslabs/smithy-rs/blob/312d190535b1c77625d662d18313b90af64cb448/rust-runtime/aws-smithy-http/src/connection.rs#L85

This looks like a stray debugging println. It was added in smithy-lang/smithy-rs#2445. I'm just spitballing, but I'm wondering if this println is related to the issue. Perhaps connections in this process weren't getting poisoned properly because the connection metadata wasn't available?

I've never seen this println in our logs while debugging before. Unfortunately, I can't say that we've truly never seen this output on the unaffected processes, because a println, unlike a tracing log, doesn't get picked up by our logging infrastructure. It's possible that this println is actually a normal occurrence when there are S3 connectivity issues, and not indicative of a failure to poison broken connections.

If nothing else, seems like the debugging println ought to be removed!
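To make the hypothesis above concrete, here is a simplified, stdlib-only model of how a missing loader could prevent poisoning. This is not the smithy-rs code: every type and function name here is hypothetical, and the model assumes only what the speculation above assumes, namely that poisoning needs connection metadata and that the metadata comes from a loader that may not have been set.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for per-connection metadata with a poison hook.
#[derive(Clone)]
struct ConnectionMetadata {
    poisoned: Arc<AtomicBool>,
}

impl ConnectionMetadata {
    /// Mark the underlying connection so a pool would stop reusing it.
    fn poison(&self) {
        self.poisoned.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical callback that yields metadata for the connection in use.
type Loader = Box<dyn Fn() -> Option<ConnectionMetadata> + Send + Sync>;

/// Hypothetical capture slot. In this model the HTTP layer is expected to
/// install a loader once a connection has been picked for the request.
#[derive(Default)]
struct ConnectionCapture {
    loader: Mutex<Option<Loader>>,
}

impl ConnectionCapture {
    fn set_loader(&self, loader: Loader) {
        *self.loader.lock().unwrap() = Some(loader);
    }

    /// Returns None when no loader was ever installed.
    fn get_connection(&self) -> Option<ConnectionMetadata> {
        match self.loader.lock().unwrap().as_ref() {
            Some(loader) => loader(),
            None => None,
        }
    }
}

/// Called after a transient failure such as a timeout.
/// Returns true only if a connection was actually poisoned.
fn poison_on_failure(capture: &ConnectionCapture) -> bool {
    match capture.get_connection() {
        Some(conn) => {
            conn.poison();
            true
        }
        // No metadata available: the broken connection is silently left alone.
        None => false,
    }
}
```

In this model, calling poison_on_failure on a capture whose loader was never set returns false and leaves the connection flag untouched, so the same broken connection could be reused by every retry. Whether the real SDK reaches that state during an outage is exactly the open question in this issue.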

Additional Information/Context

No response

Version

│       ├── aws-credential-types v0.55.1
│       │   ├── aws-smithy-async v0.55.2
│       │   ├── aws-smithy-types v0.55.2
│       ├── aws-sdk-sts v0.26.0
│       │   ├── aws-credential-types v0.55.1 (*)
│       │   ├── aws-endpoint v0.55.1
│       │   │   ├── aws-smithy-http v0.55.2
│       │   │   │   ├── aws-smithy-eventstream v0.55.2
│       │   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   │   ├── aws-types v0.55.1
│       │   │   │   ├── aws-credential-types v0.55.1 (*)
│       │   │   │   ├── aws-smithy-async v0.55.2 (*)
│       │   │   │   ├── aws-smithy-client v0.55.2
│       │   │   │   │   ├── aws-smithy-async v0.55.2 (*)
│       │   │   │   │   ├── aws-smithy-http v0.55.2 (*)
│       │   │   │   │   ├── aws-smithy-http-tower v0.55.2
│       │   │   │   │   │   ├── aws-smithy-http v0.55.2 (*)
│       │   │   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   │   │   ├── aws-smithy-http v0.55.2 (*)
│       │   │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   ├── aws-http v0.55.1
│       │   │   ├── aws-credential-types v0.55.1 (*)
│       │   │   ├── aws-smithy-http v0.55.2 (*)
│       │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   │   ├── aws-types v0.55.1 (*)
│       │   ├── aws-sig-auth v0.55.1
│       │   │   ├── aws-credential-types v0.55.1 (*)
│       │   │   ├── aws-sigv4 v0.55.1
│       │   │   │   ├── aws-smithy-eventstream v0.55.2 (*)
│       │   │   │   ├── aws-smithy-http v0.55.2 (*)
│       │   │   ├── aws-smithy-eventstream v0.55.2 (*)
│       │   │   ├── aws-smithy-http v0.55.2 (*)
│       │   │   ├── aws-types v0.55.1 (*)
│       │   ├── aws-smithy-async v0.55.2 (*)
│       │   ├── aws-smithy-client v0.55.2 (*)
│       │   ├── aws-smithy-http v0.55.2 (*)
│       │   ├── aws-smithy-http-tower v0.55.2 (*)
│       │   ├── aws-smithy-json v0.55.2
│       │   │   └── aws-smithy-types v0.55.2 (*)
│       │   ├── aws-smithy-query v0.55.2
│       │   │   ├── aws-smithy-types v0.55.2 (*)
│       │   ├── aws-smithy-types v0.55.2 (*)
│       │   ├── aws-smithy-xml v0.55.2
│       │   ├── aws-types v0.55.1 (*)
│       ├── aws-sig-auth v0.55.1 (*)
│       ├── aws-sigv4 v0.55.1 (*)
│       ├── aws-smithy-http v0.55.2 (*)
│       ├── aws-credential-types v0.55.1 (*)
│       ├── aws-sdk-sts v0.26.0 (*)
│       ├── aws-sig-auth v0.55.1 (*)
│       ├── aws-sigv4 v0.55.1 (*)
│       ├── aws-smithy-http v0.55.2 (*)
│   │   │   ├── aws-config v0.55.1
│   │   │   │   ├── aws-credential-types v0.55.1 (*)
│   │   │   │   ├── aws-http v0.55.1 (*)
│   │   │   │   ├── aws-sdk-sso v0.26.0
│   │   │   │   │   ├── aws-credential-types v0.55.1 (*)
│   │   │   │   │   ├── aws-endpoint v0.55.1 (*)
│   │   │   │   │   ├── aws-http v0.55.1 (*)
│   │   │   │   │   ├── aws-sig-auth v0.55.1 (*)
│   │   │   │   │   ├── aws-smithy-async v0.55.2 (*)
│   │   │   │   │   ├── aws-smithy-client v0.55.2 (*)
│   │   │   │   │   ├── aws-smithy-http v0.55.2 (*)
│   │   │   │   │   ├── aws-smithy-http-tower v0.55.2 (*)
│   │   │   │   │   ├── aws-smithy-json v0.55.2 (*)
│   │   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│   │   │   │   │   ├── aws-types v0.55.1 (*)
│   │   │   │   ├── aws-sdk-sts v0.26.0 (*)
│   │   │   │   ├── aws-smithy-async v0.55.2 (*)
│   │   │   │   ├── aws-smithy-client v0.55.2 (*)
│   │   │   │   ├── aws-smithy-http v0.55.2 (*)
│   │   │   │   ├── aws-smithy-http-tower v0.55.2 (*)
│   │   │   │   ├── aws-smithy-json v0.55.2 (*)
│   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│   │   │   │   ├── aws-types v0.55.1 (*)
│   │   │   ├── aws-credential-types v0.55.1 (*)
│   │   │   ├── aws-sdk-s3 v0.26.0
│   │   │   │   ├── aws-credential-types v0.55.1 (*)
│   │   │   │   ├── aws-endpoint v0.55.1 (*)
│   │   │   │   ├── aws-http v0.55.1 (*)
│   │   │   │   ├── aws-sig-auth v0.55.1 (*)
│   │   │   │   ├── aws-sigv4 v0.55.1 (*)
│   │   │   │   ├── aws-smithy-async v0.55.2 (*)
│   │   │   │   ├── aws-smithy-checksums v0.55.2
│   │   │   │   │   ├── aws-smithy-http v0.55.2 (*)
│   │   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│   │   │   │   ├── aws-smithy-client v0.55.2 (*)
│   │   │   │   ├── aws-smithy-eventstream v0.55.2 (*)
│   │   │   │   ├── aws-smithy-http v0.55.2 (*)
│   │   │   │   ├── aws-smithy-http-tower v0.55.2 (*)
│   │   │   │   ├── aws-smithy-json v0.55.2 (*)
│   │   │   │   ├── aws-smithy-types v0.55.2 (*)
│   │   │   │   ├── aws-smithy-xml v0.55.2 (*)
│   │   │   │   ├── aws-types v0.55.1 (*)
│   │   │   ├── aws-types v0.55.1 (*)
│   │   │   ├── mz-aws-s3-util v0.0.0 (/Users/benesch/Sites/materialize/materialize/src/aws-s3-util)
│   │   │   │   ├── aws-sdk-s3 v0.26.0 (*)
│   │   │   │   ├── aws-types v0.55.1 (*)
│   │   ├── aws-config v0.55.1 (*)
│   │   ├── aws-credential-types v0.55.1 (*)
│   │   ├── aws-types v0.55.1 (*)
│   ├── aws-sdk-sts v0.26.0 (*)
mz-aws-s3-util v0.0.0 (/Users/benesch/Sites/materialize/materialize/src/aws-s3-util) (*)
│   ├── mz-aws-s3-util v0.0.0 (/Users/benesch/Sites/materialize/materialize/src/aws-s3-util) (*)
├── aws-config v0.55.1 (*)
├── aws-sdk-s3 v0.26.0 (*)
├── mz-aws-s3-util v0.0.0 (/Users/benesch/Sites/materialize/materialize/src/aws-s3-util) (*)
├── aws-config v0.55.1 (*)
├── aws-credential-types v0.55.1 (*)
├── aws-sdk-sts v0.26.0 (*)
├── aws-types v0.55.1 (*)
├── mz-aws-s3-util v0.0.0 (/Users/benesch/Sites/materialize/materialize/src/aws-s3-util) (*)

Environment details (OS name and version, etc.)

Linux 5.10.178-162.673.amzn2.x86_64

Logs

No response

@benesch benesch added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Jun 14, 2023

benesch commented Jun 14, 2023

Eyeballing the output of some unaffected clusterd processes by hand, though, I'm really not seeing that debugging println on any of those unaffected processes. We see STS errors around 4pm ET (as expected) that continue for a bit and then clear right up:

clusterd {"timestamp":"2023-06-13T20:09:02.084937Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:02.774907Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:02.802298Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:02 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:02.802611Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:02 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:02.815222Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:03.383098Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.116729Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.117057Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.138912Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.228762Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:04 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.229078Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:04 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.229290Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 32ms: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-13T20:09:04.266399Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.290968Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.291272Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.291716Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 64ms: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-13T20:09:04.364927Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.394375Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.394685Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.394893Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 128ms: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-13T20:09:04.533256Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.560945Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.561244Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.561451Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 256ms: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-13T20:09:04.813752Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.840466Z","level":"WARN","fields":{"message":"STS returned an error assuming web identity role","error":"service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }))"},"target":"aws_config::web_identity_token","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.840796Z","level":"WARN","fields":{"message":"provider failed to provide credentials","provider":"WebIdentityToken","error":"an error occurred while loading credentials: service error: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements: InvalidIdentityTokenException: Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements (ProviderError(ProviderError { source: ServiceError(ServiceError { source: InvalidIdentityTokenException(InvalidIdentityTokenException { message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), meta: ErrorMetadata { code: Some(\"InvalidIdentityToken\"), message: Some(\"Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements\"), extras: Some({\"aws_request_id\": \" 00000000-0000-0000-0000-000000000000\"}) } }), raw: Response { inner: Response { status: 400, version: HTTP/1.1, headers: {\"x-amzn-requestid\": \" 00000000-0000-0000-0000-000000000000\", \"content-type\": \"text/xml\", \"content-length\": \"390\", \"date\": \"Tue, 13 Jun 2023 20:09:03 GMT\", \"connection\": \"close\"}, body: SdkBody { inner: Once(Some(b\"<ErrorResponse xmlns=\\\"https://sts.amazonaws.com/doc/2011-06-15/\\\">\\n  <Error>\\n    <Type>Sender</Type>\\n    <Code>InvalidIdentityToken</Code>\\n    <Message>Couldn't retrieve verification key from your identity provider,  please reference AssumeRoleWithWebIdentity documentation for requirements</Message>\\n  </Error>\\n  <RequestId> 00000000-0000-0000-0000-000000000000</RequestId>\\n</ErrorResponse>\\n\")), retryable: true } }, properties: SharedPropertyBag(Mutex { data: PropertyBag, poisoned: false, .. 
}) } }) }))"},"target":"aws_config::meta::credentials::chain","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:04.841011Z","level":"INFO","fields":{"message":"external operation fetch_batch::get failed, retrying in 512ms: indeterminate: s3 get meta err: failed to construct request"},"target":"mz_persist_client::internal::machine"}
clusterd {"timestamp":"2023-06-13T20:09:05.324239Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T20:09:05.363141Z","level":"INFO","fields":{"message":"credentials cache miss occurred; added new AWS credentials (took 43.184745ms)"},"target":"aws_credential_types::cache::lazy_caching","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T21:09:29.655483Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T21:09:29.691265Z","level":"INFO","fields":{"message":"credentials cache miss occurred; added new AWS credentials (took 39.945721ms)"},"target":"aws_credential_types::cache::lazy_caching","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T22:09:39.545786Z","level":"INFO","fields":{"message":"credentials cache returned CredentialsNotLoaded, ignoring"},"target":"aws_http::auth","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T22:09:39.714404Z","level":"INFO","fields":{"message":"credentials cache miss occurred; added new AWS credentials (took 172.900715ms)"},"target":"aws_credential_types::cache::lazy_caching","span":{"name":"lazy_load_credentials"},"spans":[{"name":"lazy_load_credentials"}]}
clusterd {"timestamp":"2023-06-13T23:07:49.226265Z","level":"INFO","fields":{"message":"poisoning connection: SmithyConnection { is_proxied: false, remote_addr: Some(54.231.134.18:443) }"},"target":"aws_smithy_client::poison"}
clusterd {"timestamp":"2023-06-13T23:07:49.226334Z","level":"INFO","fields":{"message":"smithy connection was poisoned"},"target":"aws_smithy_http::connection"}
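For readers skimming the log excerpt above: the `fetch_batch::get` retry delays double on each attempt (64ms, 128ms, 256ms, 512ms). Below is a minimal sketch of that schedule, assuming plain doubling with an upper bound. The function name `backoff_delay` and the 16s cap are illustrative assumptions (16s is the delay seen in the later `rollup::set` log lines), not Materialize's or the SDK's actual code.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before retry number `attempt` (0-based),
/// doubling from `base` and never exceeding `cap`.
fn backoff_delay(base: Duration, cap: Duration, attempt: u32) -> Duration {
    // Saturate instead of overflowing for very large attempt counts.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).map_or(cap, |d| d.min(cap))
}

fn main() {
    let base = Duration::from_millis(64); // first delay seen in the logs
    let cap = Duration::from_secs(16); // assumed cap, see note above
    for attempt in 0..10 {
        println!(
            "attempt {attempt}: retrying in {:?}",
            backoff_delay(base, cap, attempt)
        );
    }
}
```

With these assumed values the delay reaches the cap on the ninth attempt and stays there.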

@ysaito1001
Collaborator

Hi @benesch, thank you for reporting this issue and providing an analysis in the description. We'll add this to our backlog. @rcoh may have some insights into this. It's conceivable that the outage exposed a code execution path that was not considered when PR2445 was created. In the meantime, if you happen to discover a simpler reproduction step, kindly share it with us.

@ysaito1001 ysaito1001 removed the needs-triage This issue or PR still needs to be triaged. label Jun 19, 2023
@rcoh
Contributor

rcoh commented Jun 19, 2023

Are you running with non-stock connectors of some kind? That could cause connection metadata to fail to work 🤔

@benesch
Contributor Author

benesch commented Jun 20, 2023

> Are you running with non-stock connectors of some kind? That could cause connection metadata to fail to work 🤔

Not as far as I know! This is our configuration: https://github.com/MaterializeInc/materialize/blob/bfbebe04a7d181f1bac91f01785309a52cf2de34/src/persist/src/s3.rs#L111-L180

> In the meantime, if you happen to discover a simpler reproduction step, kindly share it with us.

Will do, but I'm afraid it'll be pretty unlikely that we do. To be honest, if the only outcome of this issue is the removal of the debugging println!, that's pretty much what I expect! I just figured I'd file in case it sparked insight for someone—e.g. about some multi-threaded credential connection cache or something like that.

@rcoh rcoh added needs-reproduction This issue needs reproduction. p2 This is a standard priority issue labels Aug 8, 2023
@rcoh
Contributor

rcoh commented Sep 5, 2023

Coming back here as I prepare to close this ticket: it seems that in most of your pods, connection poisoning worked as intended and threw out the bad connections, but in one pod a race condition of some kind may have prevented the poisoning from working. Since no loader was set, we never even got as far as trying to send the request with Hyper. Maybe the timeout hit while waiting for a retry, or while poll_ready on Hyper was pending?

In any case, the println has been replaced with a debug log.
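
To illustrate the hypothesis above, here is a simplified model of why a missing loader leaves nothing to poison. It is only a sketch: `Connection`, `ConnectionCapture`, `set_loader`, and `poison` are hypothetical names chosen for this example and are not the SDK's real types or API. The assumption is that a per-request slot is filled in only once the HTTP client hands the request a connection, so a timeout that fires before that point finds the slot empty.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a pooled connection.
#[derive(Debug, Default)]
struct Connection {
    poisoned: bool,
}

/// Hypothetical per-request slot. The "loader" is filled in only once the
/// HTTP client actually attaches a connection to the request.
#[derive(Clone, Default)]
struct ConnectionCapture {
    loader: Arc<Mutex<Option<Arc<Mutex<Connection>>>>>,
}

impl ConnectionCapture {
    /// Called when a connection has been attached to the request.
    fn set_loader(&self, conn: Arc<Mutex<Connection>>) {
        *self.loader.lock().unwrap() = Some(conn);
    }

    /// Called on timeout. Returns true if a connection was poisoned,
    /// false if no connection had been attached yet.
    fn poison(&self) -> bool {
        match self.loader.lock().unwrap().as_ref() {
            Some(conn) => {
                conn.lock().unwrap().poisoned = true;
                true
            }
            None => false, // corresponds to the "no loader was set" case
        }
    }
}

fn main() {
    let conn = Arc::new(Mutex::new(Connection::default()));

    // Case 1: the request reached a connection and then timed out.
    let sent = ConnectionCapture::default();
    sent.set_loader(conn.clone());
    assert!(sent.poison());
    assert!(conn.lock().unwrap().poisoned);

    // Case 2: the timeout fired before any connection was attached,
    // so there is nothing to poison and the pool is left untouched.
    let never_sent = ConnectionCapture::default();
    assert!(!never_sent.poison());
}
```

In this model a request that times out before a connection is attached never marks anything as poisoned, which would be consistent with the symptom reported here, though the real cause was not confirmed in this thread.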

@rcoh rcoh closed this as completed Sep 5, 2023
@github-actions

github-actions bot commented Sep 5, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

@benesch
Contributor Author

benesch commented Sep 5, 2023

> Since no loader was set, it means we never actually made it to even trying to send the request with Hyper. Maybe the timeout hit during waiting for a retry or poll_ready on Hyper was pending?

Yeah, could be. In any case, thanks for removing that println!!
