Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: refine error msg of aws sdk #11288

Merged
merged 5 commits into from
Jul 31, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions src/connector/src/source/kinesis/source/reader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ use std::time::Duration;

use anyhow::{anyhow, Result};
use async_trait::async_trait;
use aws_sdk_kinesis::error::SdkError;
use aws_sdk_kinesis::error::{DisplayErrorContext, SdkError};
use aws_sdk_kinesis::operation::get_records::{GetRecordsError, GetRecordsOutput};
use aws_sdk_kinesis::types::ShardIteratorType;
use aws_sdk_kinesis::Client as KinesisClient;
Expand Down Expand Up @@ -180,7 +180,7 @@ impl KinesisSplitReader {
self.new_shard_iter().await?;
continue;
}
Err(e) => return Err(anyhow!(e)),
Err(e) => return Err(anyhow!(DisplayErrorContext(e))),
}
}
}
Expand Down
7 changes: 6 additions & 1 deletion src/meta/src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
use std::backtrace::Backtrace;
use std::sync::Arc;

use aws_sdk_ec2::error::DisplayErrorContext;
use risingwave_common::error::BoxedError;
use risingwave_connector::sink::SinkError;
use risingwave_pb::PbFieldNotFound;
use risingwave_rpc_client::error::RpcError;
Expand Down Expand Up @@ -71,6 +73,9 @@ enum MetaErrorInner {
#[error("Sink error: {0}")]
Sink(SinkError),

#[error("AWS SDK error: {}", DisplayErrorContext(&**.0))]
Aws(BoxedError),

#[error(transparent)]
Internal(anyhow::Error),
}
Expand Down Expand Up @@ -187,7 +192,7 @@ where
E: std::error::Error + Sync + Send + 'static,
{
fn from(e: aws_sdk_ec2::error::SdkError<E>) -> Self {
MetaErrorInner::Internal(e.into()).into()
MetaErrorInner::Aws(e.into()).into()
}
}

Expand Down
11 changes: 1 addition & 10 deletions src/meta/src/hummock/manager/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2736,16 +2736,7 @@ async fn write_exclusive_cluster_id(

let cluster_id_dir = format!("{}/{}/", state_store_dir, CLUSTER_ID_DIR);
let cluster_id_full_path = format!("{}{}", cluster_id_dir, CLUSTER_ID_NAME);
let metadata = match object_store.list(&cluster_id_dir).await {
Ok(metadata) => metadata,
Err(_) => {
return Err(ObjectError::internal(
"Fail to access remote object storage,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we want to see this error message every time meta fails to access object storage, and not only aws s3, but also gcs, oss...
So maybe we can not limite it to only aws error.

Copy link
Member Author

@xxchan xxchan Jul 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not limited to AWS error. Here we just propagate the error, so the message is completely decided by the error itself.

The problem is AWS error's Display, which is fixed by this PR.

e.g., this is the panic msg of gcs (opendal)

thread 'risingwave-main' panicked at 'called `Result::unwrap()` on an `Err` value: ObjectStore(Unexpected (temporary) at Pager::next, context: { called: reqsign::LoadCredential, service: gcs, path: hummock_001/cluster_id/ } => loading credential to sign http request, source: error sending request for url (http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https://www.googleapis.com/auth/devstorage.read_write): error trying to connect: dns error: failed to lookup address information: nodename nor servname provided, or not known
  backtrace of `ObjectError`:

BTW, opendal even has other logging

2023-07-28T12:56:18.140486+02:00 ERROR opendal::services: service=gcs operation=list path=hummock_001/cluster_id/ -> failed: Unexpected (temporary) at Pager::next => loading credential to sign http request

Context:
    called: reqsign::LoadCredential
    service: gcs
    path: hummock_001/cluster_id/

Source: error sending request for url (http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https://www.googleapis.com/auth/devstorage.read_write): error trying to connect: dns error: failed to lookup address information: nodename nor servname provided, or not known

Caused by:
    0: error trying to connect: dns error: failed to lookup address information: nodename nor servname provided, or not known
    1: dns error: failed to lookup address information: nodename nor servname provided, or not known
    2: failed to lookup address information: nodename nor servname provided, or not known

Stack backtrace:
   0: std::backtrace_rs::backtrace::libunwind::trace

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also comfortable to add more error message here, or panic directly. But we should at least also provide the original error message, which turns out to be very informative.

And personally I think "please check if your Access Key and Secret Key are configured correctly" is not always true and can be misleading, so I removed it.

please check if your Access Key and Secret Key are configured correctly. ",
)
.into())
}
};
let metadata = object_store.list(&cluster_id_dir).await?;

if metadata.is_empty() {
object_store
Expand Down
3 changes: 2 additions & 1 deletion src/object_store/src/object/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ use std::backtrace::Backtrace;
use std::io;
use std::marker::{Send, Sync};

use aws_sdk_s3::error::DisplayErrorContext;
use aws_sdk_s3::operation::get_object::GetObjectError;
use aws_sdk_s3::operation::head_object::HeadObjectError;
use aws_sdk_s3::primitives::ByteStreamError;
Expand All @@ -27,7 +28,7 @@ use crate::object::Error;

#[derive(Error, Debug)]
enum ObjectErrorInner {
#[error(transparent)]
#[error("s3 error: {}", DisplayErrorContext(&**.0))]
S3(BoxedError),
#[error("disk error: {msg}")]
Disk {
Expand Down