-
Notifications
You must be signed in to change notification settings - Fork 619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding credential caching for ECR #523
Conversation
* Also adding testify to LICENSE * Also adding mocks for async.Cache
* Adding benchmark target * Updating factory * Updating GoDeps
|
||
func (client *ecrClient) GetAuthorizationToken(registryId string) (*ecrapi.AuthorizationData, error) { | ||
cachedAuthData, ok := client.tokenCache.Get(registryId) | ||
if ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we still need to verify if the token hasn't expired here, yes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if the current token has a shorter expiration than the TTL, we'll have to check to see if it's still valid here. Will add the check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems ExpiresAt
in response is 12 hours.
There is delay to post the Token to ECR.
It needs to check if the token will not expire in few seconds.
Before using a cached token, we validate that it's not yet expired. This also early expires to handle request time and clock skew, as well adds some jitter to ensure that they don't request new tokens all at once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Some minor comments.
aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowWithJitter()).After(time.Now()) | ||
} | ||
|
||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can you move this to the top of the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good - wasn't sure what we prefer for style on agent (const close to use or const at top of file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the top!
|
||
// Ensure token is still within it's expiration window. We early expire to allow for timing in calls and add jitter to avoid | ||
// refreshing all of the tokens at once. | ||
func (client *ecrClient) isTokenValid(authData *ecrapi.AuthorizationData) bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also add unit tests just around this method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
) | ||
|
||
// returns a random expiration duration between 1 and 5 minutes | ||
func (client *ecrClient) expirationWindowWithJitter() time.Duration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get rid of this and just use utils.AddJitter
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well that's much more clever. Will use that 😄
log.Debugf("Token found, but expires at %s", aws.TimeValue(cachedAuthData.ExpiresAt)) | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be good to add a log.Debug
here showing that it's actually making a call to GetAuthorizationToken
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense - will add
} | ||
|
||
const ( | ||
maxTokenExpirationWindowInSeconds = 300 // 5 minutes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you only want to cut 5 minutes off this? Might be worthwhile to set this closer to 15 or 20 minutes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was worried a bit about shorter lived tokens. I think the minimum for an MFA or STS token is 15 minutes. How about 10 minutes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ECS agent only runs on EC2 and typically uses the EC2 instance profile for credentials. While credentials can be specified through environment variables as well, those credentials cannot change for the lifetime of the agent. If you really want to be clever, you can detect whether the agent is using the EC2 instance profile and examine when those credentials expire by looking at the instance metadata service and checking the expiration time for those credentials (curl 169.254.169.254/latest/meta-data-/iam/security-credentials/$PROFILE_NAME
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense - I think I'm okay giving a much larger window in that case, and we can change if we end up with use cases that are not running using the EC2 instance profile. I've moved to renewing within 30-60 minutes of credentials expiring (seems pretty reasonable).
* Using AddJitter * Adding test cases for IsTokenValid (also moved to public for testing) * Updated window to be from 30-60 minutes
I think I've addressed the outstanding feedback. Will likely create new PR with ECR integ tests - let me know if this looks good! 😄 |
} | ||
} | ||
|
||
log.Debugf("Calling GetAuthorizationToken for %q", registryId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any particular reason you picked %q
instead of %s
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor preference - sometimes I find it more clear with quoted strings if they happen to be empty. Unlikely to happy, but I don't mind the extra quotes when there are there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, no need to change it (unless someone else objects).
aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowWithJitter()).After(time.Now()) | ||
} | ||
|
||
func (client *ecrClient) expirationWindowWithJitter() time.Duration { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name feels inaccurate to me since it doesn't include the existing expiration window. Can you change it to expirationWindowJitter
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, will do
func (client *ecrClient) IsTokenValid(authData *ecrapi.AuthorizationData) bool { | ||
return authData != nil && | ||
authData.ExpiresAt != nil && | ||
aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowWithJitter()).After(time.Now()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about rewriting this so it's a bit more verbose (and maybe more readable)?
refreshTime := aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowJitter())
return time.Now().Before(refreshTime)
(I also like using Before
instead of After
since that feels like a more natural question; you're not asking whether the expiration time is after now but whether now is before the expiration time.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can go either way on before/after. I really also wish time.Sub() took a time.Duration and not a time.Time 😞 . I agree on breaking it out to a multi-line for readability.
) | ||
|
||
const ( | ||
minimumExpirationDuration = 30 * time.Minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are minimum/maximum jitter, not expiration duration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking "minimum" and "maximum" expiration window (as in, the earliest and latest credentials could expire. I can change to "minimum/maximumJitterDuration" if that's clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer either minimumJitterDuration
or minimumExpirationJitter
over minimumExpirationDuration
.
{-1 * time.Minute, false}, | ||
{time.Duration(0), false}, | ||
{1 * time.Minute, false}, | ||
{30 * time.Minute, false}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use the minimumExpirationJitter
and maximumExpirationJitter
constants here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those aren't exported (and I have to use ecr_test because of the mock circular dependency). Will export & use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks fine. Just noticed some minor copyright header changes that i'd like to see.
@@ -0,0 +1,16 @@ | |||
// Copyright 2014-2015 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be 2014-2016
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fix
@@ -0,0 +1,62 @@ | |||
// Copyright 2015-2016 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be an issue with the mock gen script. This should be 2014-2016
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this is set in mockgen.sh
// Copyright 2015-${year} Amazon.com, Inc. or its affiliates. All Rights Reserved. |
I can update mockgen.sh if you want - will mean I'm updating the older mocks as well to back date them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think 2015-${year}
is right, since that's when we started doing mockgen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, 2015 seems right for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks a lot for doing this!
@@ -0,0 +1,62 @@ | |||
// Copyright 2015-2016 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, 2015 seems right for this.
🎉 Should I merge or squash/merge? |
I usually 'squash and merge'. |
Summary
Adding ECR credential caching. This avoids having to get fetch new authorization tokens on each ECR image pull.
Implementation details
This adds an LRU expiring cache for each ECR Client. This also adds a new client wrapper for ECR client vs. using the SDK directly. This does not yet address early expiration for shorter use tokens (still TODO).
Testing
make release
)make short-test
) passmake test
) passmake run-functional-tests
) passNew tests cover the changes: unit & integration, but no functional tests yet. Need to add setup for ECR testing.
Description for the changelog
Licensing
This contribution is under the terms of the Apache 2.0 License: yes