Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding credential caching for ECR #523

Merged
merged 7 commits into from
Sep 21, 2016
Merged

Conversation

sentientmonkey
Copy link
Contributor

Summary

Adding ECR credential caching. This avoids having to get fetch new authorization tokens on each ECR image pull.

Implementation details

This adds an LRU expiring cache for each ECR Client. This also adds a new client wrapper for ECR client vs. using the SDK directly. This does not yet address early expiration for shorter use tokens (still TODO).

Testing

  • Builds (make release)
  • Unit tests (make short-test) pass
  • Integration tests (make test) pass
  • Functional tests (make run-functional-tests) pass

New tests cover the changes: unit & integration, but no functional tests yet. Need to add setup for ECR testing.

Description for the changelog

  • Enhancement - Add credential caching for ECR

Licensing

This contribution is under the terms of the Apache 2.0 License: yes

* Also adding testify to LICENSE
* Also adding mocks for async.Cache
* Adding benchmark target
* Updating factory
* Updating GoDeps

func (client *ecrClient) GetAuthorizationToken(registryId string) (*ecrapi.AuthorizationData, error) {
cachedAuthData, ok := client.tokenCache.Get(registryId)
if ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we still need to verify if the token hasn't expired here, yes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if the current token has a shorter expiration than the TTL, we'll have to check to see if it's still valid here. Will add the check.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems ExpiresAt in response is 12 hours.

There is delay to post the Token to ECR.
It needs to check if the token will not expire in few seconds.

@samuelkarp samuelkarp mentioned this pull request Sep 12, 2016
4 tasks
Before using a cached token, we validate that it's not yet expired. This also early expires
to handle request time and clock skew, as well adds some jitter to ensure that they don't
request new tokens all at once.
@bobzoller bobzoller mentioned this pull request Sep 14, 2016
Copy link
Contributor

@aaithal aaithal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Some minor comments.

aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowWithJitter()).After(time.Now())
}

const (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can you move this to the top of the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good - wasn't sure what we prefer for style on agent (const close to use or const at top of file)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the top!


// Ensure token is still within it's expiration window. We early expire to allow for timing in calls and add jitter to avoid
// refreshing all of the tokens at once.
func (client *ecrClient) isTokenValid(authData *ecrapi.AuthorizationData) bool {
Copy link
Contributor

@aaithal aaithal Sep 15, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add unit tests just around this method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

)

// returns a random expiration duration between 1 and 5 minutes
func (client *ecrClient) expirationWindowWithJitter() time.Duration {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get rid of this and just use utils.AddJitter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's much more clever. Will use that 😄

log.Debugf("Token found, but expires at %s", aws.TimeValue(cachedAuthData.ExpiresAt))
}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be good to add a log.Debug here showing that it's actually making a call to GetAuthorizationToken.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense - will add

}

const (
maxTokenExpirationWindowInSeconds = 300 // 5 minutes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you only want to cut 5 minutes off this? Might be worthwhile to set this closer to 15 or 20 minutes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was worried a bit about shorter lived tokens. I think the minimum for an MFA or STS token is 15 minutes. How about 10 minutes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ECS agent only runs on EC2 and typically uses the EC2 instance profile for credentials. While credentials can be specified through environment variables as well, those credentials cannot change for the lifetime of the agent. If you really want to be clever, you can detect whether the agent is using the EC2 instance profile and examine when those credentials expire by looking at the instance metadata service and checking the expiration time for those credentials (curl 169.254.169.254/latest/meta-data-/iam/security-credentials/$PROFILE_NAME).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sense - I think I'm okay giving a much larger window in that case, and we can change if we end up with use cases that are not running using the EC2 instance profile. I've moved to renewing within 30-60 minutes of credentials expiring (seems pretty reasonable).

* Using AddJitter
* Adding test cases for IsTokenValid (also moved to public for testing)
* Updated window to be from 30-60 minutes
@sentientmonkey
Copy link
Contributor Author

I think I've addressed the outstanding feedback. Will likely create new PR with ECR integ tests - let me know if this looks good! 😄

}
}

log.Debugf("Calling GetAuthorizationToken for %q", registryId)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason you picked %q instead of %s?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor preference - sometimes I find it more clear with quoted strings if they happen to be empty. Unlikely to happy, but I don't mind the extra quotes when there are there

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, no need to change it (unless someone else objects).

aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowWithJitter()).After(time.Now())
}

func (client *ecrClient) expirationWindowWithJitter() time.Duration {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name feels inaccurate to me since it doesn't include the existing expiration window. Can you change it to expirationWindowJitter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do

func (client *ecrClient) IsTokenValid(authData *ecrapi.AuthorizationData) bool {
return authData != nil &&
authData.ExpiresAt != nil &&
aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowWithJitter()).After(time.Now())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about rewriting this so it's a bit more verbose (and maybe more readable)?

refreshTime := aws.TimeValue(authData.ExpiresAt).Add(-1*client.expirationWindowJitter())
return time.Now().Before(refreshTime)

(I also like using Before instead of After since that feels like a more natural question; you're not asking whether the expiration time is after now but whether now is before the expiration time.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can go either way on before/after. I really also wish time.Sub() took a time.Duration and not a time.Time 😞 . I agree on breaking it out to a multi-line for readability.

)

const (
minimumExpirationDuration = 30 * time.Minute
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are minimum/maximum jitter, not expiration duration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking "minimum" and "maximum" expiration window (as in, the earliest and latest credentials could expire. I can change to "minimum/maximumJitterDuration" if that's clearer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer either minimumJitterDuration or minimumExpirationJitter over minimumExpirationDuration.

{-1 * time.Minute, false},
{time.Duration(0), false},
{1 * time.Minute, false},
{30 * time.Minute, false},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use the minimumExpirationJitter and maximumExpirationJitter constants here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those aren't exported (and I have to use ecr_test because of the mock circular dependency). Will export & use.

Copy link
Contributor

@aaithal aaithal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks fine. Just noticed some minor copyright header changes that i'd like to see.

@@ -0,0 +1,16 @@
// Copyright 2014-2015 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be 2014-2016

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix

@@ -0,0 +1,62 @@
// Copyright 2015-2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be an issue with the mock gen script. This should be 2014-2016.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this is set in mockgen.sh

// Copyright 2015-${year} Amazon.com, Inc. or its affiliates. All Rights Reserved.

I can update mockgen.sh if you want - will mean I'm updating the older mocks as well to back date them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 2015-${year} is right, since that's when we started doing mockgen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, 2015 seems right for this.

Copy link
Contributor

@aaithal aaithal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: thanks a lot for doing this!

@@ -0,0 +1,62 @@
// Copyright 2015-2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, 2015 seems right for this.

@sentientmonkey
Copy link
Contributor Author

🎉 Should I merge or squash/merge?

@aaithal
Copy link
Contributor

aaithal commented Sep 20, 2016

I usually 'squash and merge'.

@sentientmonkey sentientmonkey merged commit dba442c into aws:dev Sep 21, 2016
@samuelkarp samuelkarp added this to the 1.13.0 milestone Sep 27, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants