Refactor lib/teleterm gateway by greedy52 · Pull Request #27685 · gravitational/teleport

greedy52 · 2023-06-09T17:13:42Z

Prep for:

TLS Routing behind ALB: Teleport Connect Kube Proxy Support #26836

Changes:

Move construction of DbcmdCLICommandProvider from lib/teleterm/daemon to lib/teleterm/clusters
Refactor cert reissue logic
- clusters.Cluster hides gateway types from daemon.GatewayCertReissuer
- gateway.Gateway takes tls.Certficate instead of cert paths, and gets a tls.Certificate directly on reissue callback

smallinsky

First pass. Seems to be reasonable. I left some nit comments.

smallinsky · 2023-06-13T09:40:19Z

+	if err := c.reissueDBCerts(ctx, routeToDatabase); err != nil {
+		return tls.Certificate{}, trace.Wrap(err)
+	}
+	tlsCert, err := c.loadDBCert(routeToDatabase)


the LoadDBCert is always called after reissueDBCerts since the reissueDBCerts uses the same information to issue a new cert at first glance the routeToDatabase.CheckCertSubject validation inside loadDBCert function is not needed.

nit: Also instead of loadingCert from disc directly we can use the introduce a helper function for our local agent c.clusterClient.LocalAgent() that will be responsible for loading the cert.

Checking the cert subject after calling reissueDBCerts was originally added because reissueDBCerts saves the cert to disk without returning it from the function. As such, we need to load the cert then and there's a small chance that between saving and loading, the cert on disk might be modified by e.g. tsh db login being ran from a separate shell session. See #18740.

After we switch from using ReissueUserCerts to IssueUserCertsWithMFA (which returns the cert), checking the cert subject will become unnecessary.

smallinsky · 2023-06-13T09:42:54Z

+
+// GatewayCertReissuer defines an interface of a helper that manages the
+// process of reissuing certificates.
+type GatewayCertReissuer interface {


seems that is used in scope of this package so we can make it enexported:

Suggested change

type GatewayCertReissuer interface {

type gatewayCertReissuer interface {

smallinsky · 2023-06-13T09:43:35Z

 	CLICommandProvider gateway.CLICommandProvider
-	TCPPortAllocator   gateway.TCPPortAllocator
-	OnExpiredCert      gateway.OnExpiredCertFunc
+	CertReissuer       GatewayCertReissuer


nit: go docs.

ravicious · 2023-06-14T09:40:51Z

+	if err := c.reissueDBCerts(ctx, routeToDatabase); err != nil {
+		return tls.Certificate{}, trace.Wrap(err)
+	}
+	tlsCert, err := c.loadDBCert(routeToDatabase)


Checking the cert subject after calling reissueDBCerts was originally added because reissueDBCerts saves the cert to disk without returning it from the function. As such, we need to load the cert then and there's a small chance that between saving and loading, the cert on disk might be modified by e.g. tsh db login being ran from a separate shell session. See #18740.

After we switch from using ReissueUserCerts to IssueUserCertsWithMFA (which returns the cert), checking the cert subject will become unnecessary.

ravicious · 2023-06-14T10:02:21Z

 }

 func (d DbcmdCLICommandProvider) GetCommand(gateway *gateway.Gateway) (*exec.Cmd, error) {
-	cluster, err := d.storage.GetByResourceURI(gateway.TargetURI())


I think I originally made it so that DbcmdCLICommandProvider holds storage instead of cluster because I wanted to guarantee that the cluster.clusterClient passed to dbcmd.NewCmdBuilder will be a fresh client constructed from the state of the profile on disk, which is what storage.GetByResourceURI returns.

Do you foresee any problems with essentially "caching" it in a long-lived object such as the gateway? I'm not totally sure how TeleportClient handles scenarios where the cert expires after TeleportClient is instantiated. It might be that there are no problems with that, I've just never checked if that's the case.

Same goes for cluster.status which holds things such as Roles and Username. If I start a gateway, the user cert expires, Connect asks me to relogin and I log in again, will DbcmdCLICommandProvider have access to fresh profile status?

When connecting to the local proxy (tunnel mode), DbcmdCLICommandProvider only needs tc.SiteName (for things like psql profile name, which is probably not required either) and maybe tc.InsecureSkipVerify. TeleportClient and cluster.status are used more when NOT connecting to the local proxy (it has to know how to load the cert and where to send to, which both not required for local proxy), or connecting to local proxy with TLS cert.

One exception to above is Oracle where profile.OracleWalletDir is called by DbcmdCLICommandProvider. But I don't think Oracle works anyway in Connect since custom logic is needed to create that OracleWalletDir in the first place.

So maybe I can refactor DbcmdCLICommandProvider to decouple it from TeleportClient and ProfileStatus. What do guys think?

dbcmd depending on TeleportClient instead of on a more specific data structure feels like an oversight which just naturally happened over time. I think it might be another side effect of tsh not having to care about long-lived TeleportClient instances – why make dbcmd depend on something else than TeleportClient when you can just read stuff from tc and you can be sure that it's fresh on each tsh call.

This might be another argument for returning a fresh TeleportClient in dbcmd command builder, so that the semantics of using TeleportClient in Connect resemble those of using TeleportClient in tsh. I don't have a horse in this race though so feel free to pick what works best.

ravicious · 2023-06-14T10:12:27Z

 		TargetSubresourceName: params.TargetSubresourceName,
 		LocalPort:             params.LocalPort,
-		CLICommandProvider:    cliCommandProvider,
-		TCPPortAllocator:      s.cfg.TCPPortAllocator,


Could you fill me in on how you got rid of passing TCPPortAllocator here?

I think the only place which actually set TCPPortAllocator on daemon's config was daemon_test.go, but that same place mocks the gateway creator so I see you were able to "inject" the mocked TCP port allocator exactly where it's needed vs passing it through the daemon.

I know it was all quite convoluted, so I appreciate the effort to clean this up!

I believe I passed gateway.TCPPortAllocator to mockGatewayCreator now. We probably can even completely get rid of it from gateway.Config if gateway is an interface.

ravicious · 2023-06-14T10:42:44Z

 // CreateGateway creates a gateway
 func (c *Cluster) CreateGateway(ctx context.Context, params CreateGatewayParams) (*gateway.Gateway, error) {
+	if params.CLICommandProvider == nil {
+		params.CLICommandProvider = NewDbcmdCLICommandProvider(c, dbcmd.SystemExecer{})


How does moving DbcmdCLICommandProvider initialization from the daemon to the cluster help with adding kube gateways?

My original comment pointed that the creation of the CLI command provider should be moved up to outer layers, not further down.

Instead of creating a new struct for every gateway, we'd have only two CLI command provider structs, say DbcmdCLICommandProvider and KubeCLICommandProvider, both initialized before calling daemon.New and passed to that function as a field on Config. Then when CreateGateway on the daemon is called, we inspect the target URI of the gateway and pass a reference to an appropriate provider.

This goes back to how dependencies are typically injected in Go and how we should approach mocking them out in tests. I'm no expert on this, so by all means please push back on this if you disagree!

I'm not sure if any of this matters to be honest. Because of how clusters.Cluster is written, it is usually mocked out wholesale anyway. So with how things are right now, there won't be a situation where I want to use clusters.Cluster or daemon.Service but I want the CLICommandProvider to be mocked out – CLICommandProvider always passes through clusters.Cluster. As I mentiode, clusters.Cluster has to be mocked out itself due to how it's designed.

This is clearly demonstrated by daemon_test.go where you were able to shift passing a mocked out TCP port allocator to the mock gateway creator.

Now that I think about it, DbcmdCLICommandProvider and KubeCLICommandProvider could be initialized within daemon.New. If DbcmdCLICommandProvider needs clusters.Storage for initialization, then daemon.New would simply pass the Config.Storage.

I asked for some opinions on Slack regarding passing the deps up. https://gravitational.slack.com/archives/C08HF8E9F/p1686740351896489

In my opinion, daemon.Service has to own CLICommandProvider only if the db command provider has to depend on storage so a fresh cluster is required. That may not be the case. Let me see what I can do on the DbcmdCLICommandProvider side to decouple the need for TeleportClient and ProfileStatus. If that's possible, cluster can be the one creating it and there is really no need for daemon.Service to know anything about the command provider.

But I think the same problem applies to the way I am doing the reissue cert callback. Do we need the storage to get a fresh Cluster for reissuing the database cert?

Seems so right? so the current refactor on issue cert won't work. Let me rethink through this.

wonder if we can just have a Cluster.Reload() function after relogin, or recreate the gateways after relogin.

I'm not sure if a fresh cluster is needed for the db cert, but if we're to have Cluster.Reload, I wonder if it wouldn't be better to just pass the storage there. 🤔

AFAIK, TeleportClient was originally made for tsh which doesn't really have the concept of long-running processes, maybe other than proxies. Actually, I'd check how this those TeleportClient instances are managed in the current local proxy middlewares (or if TeleportClient is used there in the first place).

I'm not sure if a fresh cluster is needed for the db cert, but if we're to have Cluster.Reload, I wonder if it wouldn't be better to just pass the storage there. 🤔

I think it's safer to get a fresh cluster for reissuing db cert the same way a fresh cluster is used for DBCliCommandProvider. If so, I would say the current implemented way is better so refactoring is probably not needed at all =D.

Thinking through this, I found a problem with this change I made earlier:

teleport/lib/teleterm/gateway/config.go

Lines 77 to 82 in 0bd2517

// TLSRoutingConnUpgradeRequired indicates that ALPN connection upgrades

// are required for making TLS routing requests.

TLSRoutingConnUpgradeRequired bool

// RootClusterCACertPoolFunc is callback function to fetch Root cluster CAs

// when ALPN connection upgrade is required.

RootClusterCACertPoolFunc alpnproxy.GetClusterCACertPoolFunc

So the gateway is caching TLSRoutingConnUpgradeRequired and RootClusterCACertPoolFunc. It's rare they change but when they do change after login, gateway has no way to reload these values, at least not supported in the current local proxy implementation. Might be cleaner to recreate the gateway by the new cluster after relogin (which will break active connections though).

Refactor lib/teleterm gateway

04eb3da

greedy52 added database-access Database access related issues and PRs teleport-connect Issues related to Teleport Connect. labels Jun 9, 2023

greedy52 self-assigned this Jun 9, 2023

greedy52 mentioned this pull request Jun 9, 2023

Teleport Connect Kube gateway POC #27287

Closed

ravicious self-requested a review June 12, 2023 09:51

smallinsky self-requested a review June 12, 2023 13:12

smallinsky reviewed Jun 13, 2023

View reviewed changes

ravicious reviewed Jun 14, 2023

View reviewed changes

smallinsky self-requested a review June 14, 2023 14:58

greedy52 closed this Jun 20, 2023

greedy52 mentioned this pull request Jun 26, 2023

Connect Kube gateway part 1: lib/teleterm/gateway #28312

Merged

greedy52 deleted the STeve/28636_lib_teleterm_refactor branch June 28, 2023 13:51

	type GatewayCertReissuer interface {
	type gatewayCertReissuer interface {

	// TLSRoutingConnUpgradeRequired indicates that ALPN connection upgrades
	// are required for making TLS routing requests.
	TLSRoutingConnUpgradeRequired bool
	// RootClusterCACertPoolFunc is callback function to fetch Root cluster CAs
	// when ALPN connection upgrade is required.
	RootClusterCACertPoolFunc alpnproxy.GetClusterCACertPoolFunc

Conversation

greedy52 commented Jun 9, 2023

Uh oh!

smallinsky left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greedy52 Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greedy52 Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greedy52 Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greedy52 Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greedy52 Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

greedy52 Jun 14, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

greedy52 Jun 14, 2023 •

edited

Loading

greedy52 Jun 14, 2023 •

edited

Loading

greedy52 Jun 14, 2023 •

edited

Loading

greedy52 Jun 14, 2023 •

edited

Loading

greedy52 Jun 14, 2023 •

edited

Loading

greedy52 Jun 14, 2023 •

edited

Loading