Skip to content

Configure upstream TLS context with peer root certs#13321

Merged
freddygv merged 6 commits intomainfrom
peering/upstream-tls
Jun 1, 2022
Merged

Configure upstream TLS context with peer root certs#13321
freddygv merged 6 commits intomainfrom
peering/upstream-tls

Conversation

@freddygv
Copy link
Contributor

@freddygv freddygv commented Jun 1, 2022

Description

For mTLS to work between two proxies in peered clusters with different root CAs,
proxies need to configure their outbound listener to use different root certificates
for validation.

Up until peering was introduced proxies would only ever use one set of root certificates
to validate all mesh traffic, both inbound and outbound. Now an upstream proxy
may have a leaf certificate signed by a CA that's different from the dialing proxy's.

This PR makes changes to proxycfg and xds so that the upstream TLS validation
uses different root certificates depending on which cluster is being dialed.

Changes by pkg

proxycfg

When an explicit upstream is in a peered cluster, we watch the trust bundle
for that cluster. These trust bundles then get stored in a map keyed on peer name.

xds

When generating upstream clusters for discovery chains we use the RootPEMs
from the appropriate peer trust bundle if the upstream is in another cluster.

Testing & Reproduction steps

These changes have unit tests, but have not been validated against running Envoy proxies yet. That will happen after this PR and the public listener changes have been merged.

PR Checklist

  • updated test coverage
  • external facing docs updated
  • not a security concern
  • checklist folder consulted

@freddygv freddygv requested review from a team, acpana, kisunji and rboyer and removed request for a team June 1, 2022 14:49
@github-actions github-actions bot added the theme/envoy/xds Related to Envoy support label Jun 1, 2022
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that the trust domain here didn't match the one from L54. It needs to match because both upstreams are in the same peered "cloud" cluster.

@freddygv freddygv force-pushed the peering/upstream-tls branch 2 times, most recently from d6198cf to d36b72b Compare June 1, 2022 14:58
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, this needed to be in the same trust domain as the other upstream, since they're in the same peered cluster

Comment on lines 112 to 150
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to keep auto-generated types/methods in a different file with the commands to regenerate if necessary (I like //go:generate mockery blah... at the top of the file cause Goland can run it)

Comment on lines 257 to 260
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we intentionally not write to map if resp.Bundle is nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, because the output isn't usable

Comment on lines 544 to 553
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could use strings.Builder here to save memory in case there's a non-trivial number of roots

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should only ever be two at most, for the rotation case

Comment on lines 135 to 136
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a package design issue. Is it this cycle?

pbpeering -> connect/ca -> connect -> pbpeering

OOS for this PR but sounds annoying

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proto/<anything> is structurally in a similar position as agent/structs or api in that it's pretty leaf-y when it comes to dependencies by design.

You could simply move ca.EnsureTrailingNewLine to lib instead since it really just does string manipulation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the function to lib.

The cycle was:

pbpeering -> connect/ca -> agent/consul/state

@freddygv freddygv force-pushed the peering/upstream-tls branch from d36b72b to a7c7110 Compare June 1, 2022 19:29
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need the garbage collection logic here, or would a TODO to cash-out at tproxy time be sufficient?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a tproxy concern. Will add it to Asana.

@freddygv freddygv added the pr/no-changelog PR does not need a corresponding .changelog entry label Jun 1, 2022
freddygv added 6 commits June 1, 2022 15:45
This commit modifies proxycfg for connect proxies so that when an
explicit upstream is in a peered cluster, we watch the trust bundle
for that cluster.

These trust bundles then get stored in a map keyed on peer name. This
map is used to only spin up one watch per peer, since there may be
multiple upstreams in a peered cluster.
  * Skip creating cluster for upstreams in peered clusters that we do
  not have a trust bundle for yet.
  * If the upstream has a peer name specified, use the trust bundle for
    the peered cluster, rather than our local roots.
  * Fixup whitespace usage in test bundles for peers.
 * Add go:generate command for new cache-type mock
 * Move ca.EnsureNewLine to lib pkg
 * Fixup assertion expectation in proxycfg test
@freddygv freddygv force-pushed the peering/upstream-tls branch from 53e9279 to 9732269 Compare June 1, 2022 21:45
@freddygv freddygv merged commit 74ca640 into main Jun 1, 2022
@freddygv freddygv deleted the peering/upstream-tls branch June 1, 2022 21:53
@kisunji kisunji added the theme/cluster-peering Related to Consul's cluster peering feature label Jul 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr/no-changelog PR does not need a corresponding .changelog entry theme/cluster-peering Related to Consul's cluster peering feature theme/envoy/xds Related to Envoy support

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants