-
-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate from rusoto to aws-sdk-rust #849
Conversation
I could not get a new Region struct to be created with v as its value, kept running into lifetime issues with value.
Perhaps something like this is needed to get HTTP status codes, as with S3WithCode(StatusCode, String).
Codecov ReportBase: 64.91% // Head: 64.80% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #849 +/- ##
==========================================
- Coverage 64.91% 64.80% -0.11%
==========================================
Files 68 68
Lines 11790 11755 -35
==========================================
- Hits 7653 7618 -35
Misses 4137 4137
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for picking this up again!
#[error("S3 error code: {1} (http status: {0})")] | ||
S3WithCode(StatusCode, String), | ||
#[error("aws-sdk: failed to fetch data from S3")] | ||
S3SDK(#[from] aws_smithy_http::byte_stream::Error), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should be S3Sdk
? Doesn't clippy complain about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any complaints from Clippy. It's only used here. Do you want me to rename it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, let's rename this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed.
self.create_s3_client(provider, region) | ||
let provider = LazyCachingCredentialsProvider::builder() | ||
.load(provide_credentials_fn(|| async { | ||
aws_config::ecs::EcsCredentialsProvider::builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should check with the folks who added the AutoRefreshingProvider
and ask them to test this to see if it works equivalently. I recall at some point they had an outage around this. git blame should point in the right direction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll investigate this. Once I have this branch working well enough, I can also test this myself to some extent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still pending me looking into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am keeping an eye on the seemingly-related awslabs/aws-sdk-rust#629.
tracing::debug!("Skipping response from s3://{}/{}: {}", bucket, &key, err0); | ||
return match &err0 { | ||
ConstructionFailure(err) => { | ||
println!("ERROR: ConstructionFailure: {:?}", err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like you're still debugging all this. But you could capture all in a single match arm and maybe capture it as a sentry error before throwing cancelled up the chain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like a good idea but I am not sure how to go about that, any pointers? Doesn't it require access to a Sentry installation and corresponding API keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking for some guidance here as I know nothing about Sentry. SOme of these can probably grouped into a single response but the mapping from Rusoto errors to AWS SDK errors isn't clear to me (although granted I have not spent much time investigating this yet).
Also, the Rusoto code does some XML parsing of the AWS responce, I wonder how much of that is still required with the AWS SDK.
crates/symbolicator/src/sources.rs
Outdated
let pos = AWS_REGIONS.iter().position(|&x| x == value); | ||
match pos { | ||
Some(_) => Ok(Region::new(String::from(value))), | ||
None => Err(E::custom(format!("unknown region: {}", value))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this means we need to updated our code every time they add a region? That might not be ideal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated this code to use the EC2 describe_regions
function and cache the result. Things I don't like about my solution:
- I'm calling
tokio::runtime::Runtime::new()
to grab a runtime so I can callblock_on
on it. This is needed because thedescribe_regions
call uses anasync
send
wrapper. Would it not be more appropriate to use one of the existing runtimes created inserver.rs
? - I'm creating a
shared_config
usingaws_config::from_env().load().await
; not even sure this will work, given that it should really use thecredentials_provider
(Container
orStatic
) associated with the current source. Thoughts?
Also, does the use of the #[cached]
macro look OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, this does indeed break the unit tests. Have to rethink this, because we need the region list during serde
config parsing yet we won't have the credentials until after parsing. And there's no way to get the region list anonymously beforehand so we can pre-populate the list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you just accept any string in serde and just deal with the error once you try and use an invalid region?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll do that.
By the way, Rusoto currently also hardcodes the list of known regions, so if a new region is added, a recompile is needed anyway once Rusoto is updated to include it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about I add a simple test at startup to see if sts get-caller-identity
works and if it fails, error out with an appropriate error message?
crates/symbolicator/src/sources.rs
Outdated
|
||
fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result { | ||
formatter.write_str("string or tuple") | ||
formatter.write_str("string") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this type change going to be a problem for some configurations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; custom regions, which use tuples per the docs, don't currently work, as the AWS Rust SDK doesn't support them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ouch, that's quite a regression. this custom region support was also contributed by someone who needed it so it this would probably break for them. there is no way at all to support this? the exact way of configuring doesn't necessarily have to stay the same, but somehow supporting it would be needed i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is in-scope for the AWS SDK, at any rate it doesn't support it today so if we want to offer this functionality we'll have to add it ourselves. :-/
I yes, we introduced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does look very good, thanks for sticking with it for so long!
The only missing thing would be to correctly map the errors. But I wouldn’t spend too much effort on that.
If we correctly flag connection timeout and notfound, we should be good. everything else can be an opaque S3Error.
assert_eq!(cfg.source_key.access_key, "the-access-key"); | ||
assert_eq!(cfg.source_key.secret_key, "the-secret-key"); | ||
} | ||
_ => unreachable!(), | ||
} | ||
} | ||
|
||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably continue supporting custom regions. I commented above
.bucket(bucket.clone()) | ||
.key(key.clone()) | ||
.send(); | ||
|
||
let source = RemoteDif::from(file_source.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe if you clean up all the println!
s below, you can avoid these clone
s as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed by converting to clones to references. I'll deal with the println!
s separately.
To get rid of the file_source.clone()
I implemented a trait, please let me know if I should just keep the existing approach.
@@ -264,6 +257,7 @@ impl S3Downloader { | |||
} | |||
} | |||
|
|||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you can leave these tests in, they should be noops in case our credentials are not set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have to spend some time to convert the tests from rusoto to aws-sdk-rust, please stay tuned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests now compile.
If the tests are giving troubles due to credentials, I can also pull this into a local branch manually and merge from there, that way it should have the right test credentials. |
…te a new string which is expensive Thanks @Swatinem
Rusoto hardcoded the available regions, aws-sdk-rust does not. But there's no way to obtain the current list of valid regions without making API calls, which means having working credentials and either defaulting to some region (like us-east-1) or having the user specify a valid region anyway.
At this point, as far as I know the major issues are:
Anything else you can think of? |
I believe its okay to be as lenient as possible. People will get NotFound / NoPermission if they provide an invalid region. |
BTW by "custom region support" I mean this:
My patch took away support for this because unlike Rusoto, the Rust AWS SDK doesn't support a Custom Region variant. Perhaps this can be implemented by wrapping the AWS SDK |
Hm, I don’t think we really support those. But I could be wrong. First time I read about it. |
Unlike Rusoto, the Rust AWS SDK doesn't hardcode the currently known regions, and no validation is provided. API requests with invalid region names will fail.
This discussion seems relevant; the answer given suggests that a custom endpoint resolver can be used, as I guess the region is ultimately used to synthesize the URI of the wanted service endpoint. |
Ahhh, now I understand what this is about. I should have read the comment more closely (which needs updating btw :-)
That enum variant indeed consists of name + endpoint: https://docs.rs/rusoto_core/latest/rusoto_core/enum.Region.html#variant.Custom I think this is where my confusion was coming from. I believe in this case we can mimic that behavior and have a: enum Region {
Builtin(String),
Custom {
name: String,
endpoint: String,
}
} Then on usage one would have to match to then create the appropriate endpoint. |
Fixes #575. Posting this for feedback.