remove cached secret immediately after stream close #11160
hklai merged 4 commits into istio:release-1.1
| }() | ||
| _, err := sc.GenerateSecret(context.Background(), "proxy1-id", testResourceName, "jwtToken1") | ||
| testProxyId := "proxy1-id" |
var testProxyId should be testProxyID (from golint)
| defer removeConn(key) | ||
| defer func() { | ||
| removeConn(key) | ||
| // Remove the secret from cache. |
It would be more informative to comment on why we do this rather than what it does :)
Maybe something like "To avoid duplicate notifications of the same secret expiry"
added more comments
| ProxyID: testProxyID, | ||
| ResourceName: testResourceName, | ||
| } | ||
| if _, ok := sc.secrets.Load(key); ok != true { |
ok != true -> !ok ? Or, rename the variable to found?
| } | ||
| sc.DeleteSecret(testProxyID, testResourceName) | ||
| if _, ok := sc.secrets.Load(key); ok != false { |
ok != false -> ok ? Or, rename the variable to found?
| // The ID of proxy from which the connection comes from. | ||
| proxyID string | ||
| // The ResourceName of SDS request. |
| removeConn(key) | ||
| // Remove the secret from cache. | ||
| s.st.DeleteSecret(con.proxyID, con.ResourceName) | ||
| }() |
Can this be kept simple, like so:
defer s.st.DeleteSecret(con.proxyID, con.ResourceName)
defer removeConn(key)
| ProxyID: proxyID, | ||
| ResourceName: req.ResourceNames[0], | ||
| } | ||
| if _, ok := st.secrets.Load(key); ok != true { |
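The two-`defer` form suggested in the review comment above depends on two properties of Go's `defer`: deferred calls run in last-in, first-out order, and the arguments of a deferred call are evaluated when the `defer` statement executes, not when the function returns. The sketch below illustrates both. The `conn` type and the function names are hypothetical stand-ins, not the PR's code; whether `con.proxyID` is already populated at the point of the `defer` in the real handler is not shown in this thread and would need checking.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for the SDS connection struct.
type conn struct{ proxyID string }

// cleanupOrder records the order in which two deferred steps run.
func cleanupOrder() []string {
	var order []string
	func() {
		// Deferred calls run last-in, first-out: removeConn runs first,
		// DeleteSecret second, the same order as the single closure.
		defer func() { order = append(order, "DeleteSecret") }()
		defer func() { order = append(order, "removeConn") }()
	}()
	return order
}

// deferredArgs shows when the argument of a deferred call is read.
func deferredArgs() (direct, viaClosure string) {
	con := &conn{}
	func() {
		// Argument evaluated at the defer statement: sees the empty field.
		defer func(id string) { direct = id }(con.proxyID)
		// Field read inside the closure when it runs: sees the later value.
		defer func() { viaClosure = con.proxyID }()
		con.proxyID = "proxy1-id"
	}()
	return direct, viaClosure
}

func main() {
	fmt.Println(cleanupOrder()) // [removeConn DeleteSecret]
	direct, viaClosure := deferredArgs()
	fmt.Printf("%q %q\n", direct, viaClosure) // "" "proxy1-id"
}
```

So the two forms are equivalent in ordering, but only interchangeable if the fields passed as arguments already hold their final values when the `defer` statement is reached.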
| if len(sdsClients) != 0 { | ||
| t.Errorf("sdsClients, got %d, expected 0", len(sdsClients)) | ||
| } | ||
| if _, ok := st.secrets.Load(key); ok != false { |
| ResourceName: req.ResourceNames[0], | ||
| } | ||
| if _, ok := st.secrets.Load(key); ok != true { | ||
| t.Errorf("failed to find cached secret") |
error message should start with lowercase, to align with rest of code
Okay, there are a couple of other places where it's capitalized: See https://github.com/istio/istio/pull/11160/files/9ce00ddeb28f956fdf3f4a3180b01655f779cea8#diff-5230137c1e069ce6fd20fab17cbf88acR204.
| t.Errorf("sdsClients, got %d, expected 0", len(sdsClients)) | ||
| } | ||
| if _, ok := st.secrets.Load(key); ok != false { | ||
| t.Errorf("found cached secret after stream close, expected non-exist") |
nit: expected non-exist -> expected the secret to not exist
/assign @duderino
| ProxyID: testProxyID, | ||
| ResourceName: testResourceName, | ||
| } | ||
| if _, found := sc.secrets.Load(key); found != true { |
You can drop the != true and != false in the if conditionals. Replace them with !found and found respectively.
/assign @wenchenglu
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: quanjielin, venilnoronha, wattli, wenchenglu
* remove cached secret immediately after stream close
* lint
* cleanup
* clean up
#9035
Problem
Remove the cached item immediately rather than waiting for auto-eviction to happen; waiting causes the cert rotation issue in Vault.
Root cause:
Nodeagent uses a cache lookup (version + token) to decide whether an SDS request is an ack request.
We used to assume Envoy would reconnect with a new version number, which is wrong; a new Envoy request (whether an ack or a fresh request) uses the version it got from the last server response.
In the Citadel/Vault case, Envoy sends a normal k8s JWT (which is the same every time) to nodeagent; combined with the same version number, nodeagent treated the new request as an ack request and didn't respond, so cert rotation didn't happen.
This problem didn't reproduce with Google CA because Envoy sends either an OAuth token or a trustworthy JWT, which is updated for each request (the token is different), and the cache has an eviction job that removes stale items.
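The root cause can be illustrated with a simplified model. This is only a sketch: `ackCache`, `cacheKey`, and the method names are hypothetical, and the real cache in this PR is keyed and cleared differently (`DeleteSecret` takes a proxy ID and resource name). It shows only why an unchanged version plus an unchanged token looks like an ack until the entry is removed.

```go
package main

import "fmt"

// cacheKey and ackCache are hypothetical names for this illustration.
type cacheKey struct{ version, token string }

type ackCache map[cacheKey]bool

// isAck reports whether a request with this version and token was already
// answered, in which case it would be treated as an ack and get no response.
func (c ackCache) isAck(version, token string) bool {
	return c[cacheKey{version, token}]
}

// remember records that a response was sent for this version and token.
func (c ackCache) remember(version, token string) {
	c[cacheKey{version, token}] = true
}

// forget removes the entry, modelling removal on stream close.
func (c ackCache) forget(version, token string) {
	delete(c, cacheKey{version, token})
}

func main() {
	cache := ackCache{}
	cache.remember("v1", "k8s-jwt") // the server answered the first request

	// Envoy reconnects with the version from the last response and the same JWT.
	fmt.Println(cache.isAck("v1", "k8s-jwt")) // true: wrongly treated as an ack

	// With the entry removed at stream close, the reconnect is a new request.
	cache.forget("v1", "k8s-jwt")
	fmt.Println(cache.isAck("v1", "k8s-jwt")) // false
}
```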