Skip to content

remove cached secret immediately after stream close#11160

Merged
hklai merged 4 commits intoistio:release-1.1from
quanjielin:quanlinrmcache0122
Jan 23, 2019
Merged

remove cached secret immediately after stream close#11160
hklai merged 4 commits intoistio:release-1.1from
quanjielin:quanlinrmcache0122

Conversation

@quanjielin
Copy link
Contributor

@quanjielin quanjielin commented Jan 22, 2019

#9035

Problem
remove cached item immediately rather than wait the auto-eviction to happen, this cause the cert rotation issue in vault.

Root cause:
Nodeagent uses cache to lookup(version + token) to decide whether a SDS request is ack request.
We used to assume envoy will reconnect with a new version number, which is wrong; envoy new request(regardless of ack or new, uses version got from last server response).
Citadel/vault case envoy uses normal k8s jwt(which is the same all the time) to nodeagent; together with same version number, nodeagent treat new request as ack request and didn't response, which result in cert rotation didn't happen.

This problem didn't repro for google CA because envoy sends either oauth token or trustworthy jwt, which got updated for each request(token is different); and cache has an eviction job to remove staled items.

}()

_, err := sc.GenerateSecret(context.Background(), "proxy1-id", testResourceName, "jwtToken1")
testProxyId := "proxy1-id"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var testProxyId should be testProxyID (from golint)

defer removeConn(key)
defer func() {
removeConn(key)
// Remove the secret from cache.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more informational to comment why we do this instead of what this is doing :)

Maybe something like "To avoid duplicated notifications of same secret expiry"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added more comments

Copy link
Contributor

@wattli wattli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

Copy link
Member

@venilnoronha venilnoronha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

ProxyID: testProxyID,
ResourceName: testResourceName,
}
if _, ok := sc.secrets.Load(key); ok != true {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok != true -> !ok ? Or, rename the variable to found?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

}

sc.DeleteSecret(testProxyID, testResourceName)
if _, ok := sc.secrets.Load(key); ok != false {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok != false -> ok ? Or, rename the variable to found?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

// The ID of proxy from which the connection comes from.
proxyID string

// The ResourceName of SDS request.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of the

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

removeConn(key)
// Remove the secret from cache.
s.st.DeleteSecret(con.proxyID, con.ResourceName)
}()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be kept simple, like so:

defer s.st.DeleteSecret(con.proxyID, con.ResourceName)
defer removeConn(key)

ProxyID: proxyID,
ResourceName: req.ResourceNames[0],
}
if _, ok := st.secrets.Load(key); ok != true {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

if len(sdsClients) != 0 {
t.Errorf("sdsClients, got %d, expected 0", len(sdsClients))
}
if _, ok := st.secrets.Load(key); ok != false {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments as above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

renamed

ResourceName: req.ResourceNames[0],
}
if _, ok := st.secrets.Load(key); ok != true {
t.Errorf("failed to find cached secret")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/failed/Failed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error message should start with lowercase, to align with rest of code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

t.Errorf("sdsClients, got %d, expected 0", len(sdsClients))
}
if _, ok := st.secrets.Load(key); ok != false {
t.Errorf("found cached secret after stream close, expected non-exist")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/found/Found

t.Errorf("sdsClients, got %d, expected 0", len(sdsClients))
}
if _, ok := st.secrets.Load(key); ok != false {
t.Errorf("found cached secret after stream close, expected non-exist")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: expected non-exist -> expected the secret to not exist

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated/

@quanjielin
Copy link
Contributor Author

/assign @duderino

ProxyID: testProxyID,
ResourceName: testResourceName,
}
if _, found := sc.secrets.Load(key); found != true {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop the != true and != false in the if conditionals. Replace them with !found and found respectively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@venilnoronha venilnoronha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@istio-testing
Copy link
Collaborator

@quanjielin: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
prow/istio-integ-k8s-tests.sh ee51121 link /test istio-integ-k8s-tests
prow/istio-pilot-multicluster-e2e.sh ee51121 link /test istio-pilot-multicluster-e2e
Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@quanjielin quanjielin added this to the 1.1 milestone Jan 23, 2019
@quanjielin
Copy link
Contributor Author

/assign @wenchenglu

@wenchenglu
Copy link
Contributor

/approve

@wenchenglu
Copy link
Contributor

/lgtm
/approve

@istio-testing
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: quanjielin, venilnoronha, wattli, wenchenglu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hklai hklai merged commit 2be98a5 into istio:release-1.1 Jan 23, 2019
@quanjielin quanjielin deleted the quanlinrmcache0122 branch January 23, 2019 19:10
hklai pushed a commit to hklai/istio that referenced this pull request Jan 27, 2019
* remove cached secret immediately after stream close

* lint

* cleanup

* clean up
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants