AWS SDK seems to have failed to poison S3 connections after today's outage #827
Comments
Eyeballing the output of some unaffected …
Hi @benesch, thank you for reporting this issue and providing an analysis in the description. We'll add this to our backlog. @rcoh may have some insights into this. It's conceivable that the outage use case may have exposed a code execution path that was not originally considered when PR2445 was created. In the meantime, if you happen to discover a simpler reproduction step, kindly share it with us.
Are you running with non-stock connectors of some kind? That could cause connection metadata to fail to work 🤔
Not as far as I know! This is our configuration: https://github.com/MaterializeInc/materialize/blob/bfbebe04a7d181f1bac91f01785309a52cf2de34/src/persist/src/s3.rs#L111-L180
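For readers following the "non-stock connector" question: below is a rough, generic sketch of what a stock Rust SDK setup (no custom connector) looks like, with explicit timeouts and retries layered on the default hyper connector. This is an illustration only, not the Materialize configuration linked above, and builder names can differ between SDK versions.

```rust
use std::time::Duration;

use aws_config::retry::RetryConfig;
use aws_config::timeout::TimeoutConfig;

#[tokio::main]
async fn main() {
    // Stock setup: default (hyper) connector, with explicit timeouts and retries.
    let sdk_config = aws_config::from_env()
        .timeout_config(
            TimeoutConfig::builder()
                .connect_timeout(Duration::from_secs(5))
                // Bound each attempt so a hung connection surfaces as a retryable error.
                .operation_attempt_timeout(Duration::from_secs(30))
                .build(),
        )
        .retry_config(RetryConfig::standard().with_max_attempts(5))
        .load()
        .await;

    let s3 = aws_sdk_s3::Client::new(&sdk_config);

    // Every S3 call then goes through the SDK's retry and connection-handling machinery.
    match s3.list_buckets().send().await {
        Ok(_) => println!("S3 reachable"),
        Err(err) => eprintln!("S3 call failed: {err}"),
    }
}
```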
Will do, but I'm afraid it's pretty unlikely that we will. To be honest, if the only outcome of this issue is the removal of the debugging `println`…
Coming back here as I prepare to close this ticket: it seems like in most of your pods, connection poisoning worked as intended and threw out the bad connections, but in one pod a race condition of some kind may have caused us to fail to get the poisoning to work. Since no loader was set, it means we never actually made it to even trying to send the request with Hyper; maybe the timeout hit while waiting for a retry, or … In any case, the `println` has been replaced with a `tracing` log.
Yeah, could be. In any case, thanks for removing that `println`.
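Since the fix mentioned above swaps the `println` for a `tracing` event, here is a minimal, generic illustration of why that matters for log collection. The function name and messages are invented for illustration and are not the actual smithy-rs code.

```rust
// A `tracing` event flows through whatever subscriber the application installed,
// so it is filterable and lands in the same pipeline as the rest of the logs.
// A bare `println!` writes straight to stdout and bypasses all of that.
fn report_missing_connection_metadata() {
    // Hypothetical message: invisible to most logging pipelines, impossible to
    // filter, and not attributable to a module or target.
    println!("no connection metadata loader was set");

    // Same information as a structured event: captured by the subscriber,
    // filterable via level/target directives, and carries its origin.
    tracing::debug!("no connection metadata loader was set; skipping poisoning");
}

fn main() {
    // The application decides where library events go by installing a subscriber.
    tracing_subscriber::fmt()
        .with_max_level(tracing::Level::DEBUG)
        .init();

    report_missing_connection_metadata();
}
```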
Describe the bug
At @MaterializeInc we run a number of `clusterd` processes that continually read and write from S3. (We make a streaming database that's in the business of reading data from S3, transforming it, and writing it back to S3.) During today's AWS outage, these `clusterd` processes all experienced an outage. Almost all of the `clusterd` processes recovered, except for two that continually produced error messages like the following:

AIUI, `rollup::set` is attempting to write to S3, while `s3 get meta` is attempting to read from S3. The error messages (`request has timed out`, `failed to construct request`) come straight from the AWS SDK, AFAICT.

Expected Behavior
We expected our S3 reads/writes to eventually succeed on retry.
Current Behavior
The S3 reads/writes continued failing forever.
We validated that the two affected containers did in fact have access to S3—logging in to the containers and manually issuing S3 requests was successful at the time shown in the S3 logs.
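For context on what "eventually succeed on retry" means in practice, here is a minimal, generic sketch (not the Materialize code) of an outer retry loop with capped backoff. When the SDK keeps handing out a poisoned connection, every iteration of a loop like this fails the same way, which is the behavior we observed.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::time::Duration;

/// Retry a fallible async operation with capped exponential backoff, on the
/// assumption that failures are transient.
async fn retry_forever<T, E, F, Fut>(mut op: F) -> T
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
    E: std::fmt::Display,
{
    let mut backoff = Duration::from_millis(100);
    loop {
        match op().await {
            Ok(value) => return value,
            Err(err) => {
                eprintln!("operation failed, retrying in {backoff:?}: {err}");
                tokio::time::sleep(backoff).await;
                backoff = (backoff * 2).min(Duration::from_secs(30));
            }
        }
    }
}

#[tokio::main]
async fn main() {
    // Stand-in for an S3 read/write: fails twice, then succeeds. With a healthy
    // connection pool this is the shape we expected during the outage; with a
    // broken connection that never gets evicted, the loop would run forever.
    let attempts = Arc::new(AtomicU32::new(0));
    let op_attempts = Arc::clone(&attempts);
    let result = retry_forever(move || {
        let attempts = Arc::clone(&op_attempts);
        async move {
            let n = attempts.fetch_add(1, Ordering::SeqCst) + 1;
            if n < 3 {
                Err("request has timed out")
            } else {
                Ok("object contents")
            }
        }
    })
    .await;
    assert_eq!(result, "object contents");
}
```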
Reproduction Steps
So sorry, but we have no reproduction of this. We run chaos tests in our CI that interrupt network connections but we've never seen anything like this. I think it's unlikely we'll see this again until another wide AWS outage.
I wonder if it's something specific about interrupting the IAM connections in the way that the AWS outage did. We don't test that kind of chaos extensively in our CI.
Possible Solution
The reason I wanted to file this issue is this `println` that we're seeing in the output: https://github.com/awslabs/smithy-rs/blob/312d190535b1c77625d662d18313b90af64cb448/rust-runtime/aws-smithy-http/src/connection.rs#L85

This looks like a stray debugging `println`. It was added in smithy-lang/smithy-rs#2445. I'm just spitballing, but I'm wondering if this `println` is related to the issue. Perhaps connections in this process weren't getting poisoned properly because the connection metadata wasn't available?

I've never seen this `println` in our logs while debugging before. Unfortunately, I can't say that we've truly never seen this log on the unaffected processes, because as a `println` rather than a `tracing` log it doesn't get picked up by our logging infrastructure. It's possible that this `println` is actually a normal occurrence when there are S3 connectivity issues, and not indicative of a failure to poison broken connections.
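To make that hypothesis concrete, here is a deliberately simplified, hypothetical sketch of the shape of connection poisoning being described. All type and function names are invented for illustration and are not the smithy-rs implementation; the point is the silent no-op path: if no loader ever captured the connection metadata, nothing gets evicted and the broken connection keeps being reused.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the connection metadata the HTTP layer would capture.
#[derive(Debug)]
struct ConnectionHandle {
    id: u64,
}

/// Hypothetical capture slot: the HTTP connector is expected to install a
/// "loader" that can report which pooled connection served the request.
#[derive(Clone, Default)]
struct CaptureConnection {
    loader: Arc<Mutex<Option<Box<dyn Fn() -> Option<ConnectionHandle> + Send>>>>,
}

impl CaptureConnection {
    fn set_loader(&self, loader: impl Fn() -> Option<ConnectionHandle> + Send + 'static) {
        *self.loader.lock().unwrap() = Some(Box::new(loader));
    }

    fn get(&self) -> Option<ConnectionHandle> {
        match &*self.loader.lock().unwrap() {
            Some(loader) => loader(),
            None => {
                // The interesting path: if the attempt failed before the HTTP
                // layer installed a loader (e.g. a timeout while waiting on a
                // retry or on credentials), there is nothing to poison.
                tracing::debug!("no loader was set; cannot identify the connection");
                None
            }
        }
    }
}

/// On a retryable transport error, evict ("poison") the connection that served
/// the failed attempt. If no metadata was captured, this silently does nothing.
fn poison_on_transient_error(capture: &CaptureConnection, poisoned: &mut Vec<u64>) {
    if let Some(conn) = capture.get() {
        poisoned.push(conn.id);
    }
}

fn main() {
    let mut poisoned = Vec::new();

    // Normal case: the connector installed a loader, so poisoning works.
    let capture = CaptureConnection::default();
    capture.set_loader(|| Some(ConnectionHandle { id: 42 }));
    poison_on_transient_error(&capture, &mut poisoned);

    // Failure mode hypothesized here: no loader was set, so the bad connection
    // is never evicted from the pool.
    let no_metadata = CaptureConnection::default();
    poison_on_transient_error(&no_metadata, &mut poisoned);

    assert_eq!(poisoned, vec![42]);
}
```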
If nothing else, it seems like the debugging `println` ought to be removed!

Additional Information/Context
No response
Version
Environment details (OS name and version, etc.)
Linux 5.10.178-162.673.amzn2.x86_64
Logs
No response