Skip to content

Set up ClientStore when adding cluster in Connect#20263

Merged
ravicious merged 7 commits intomasterfrom
ravicious/fix-adding-cluster
Jan 17, 2023
Merged

Set up ClientStore when adding cluster in Connect#20263
ravicious merged 7 commits intomasterfrom
ravicious/fix-adding-cluster

Conversation

@ravicious
Copy link
Copy Markdown
Member

@ravicious ravicious commented Jan 16, 2023

It looks like #19420 broke adding a new cluster in Connect:

Stacktrace
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x28 pc=0x102859d54]

goroutine 83 [running]:
github.com/gravitational/teleport/lib/client.(*Config).SaveProfile(0x140002fc700, 0x10?)
   github.com/gravitational/teleport/lib/client/api.go:634 +0x314
github.com/gravitational/teleport/lib/teleterm/clusters.(*Storage).addCluster(0x14000c89260, {0x103ac0c10, 0x1400095c840}, {0x1400005632e?, 0x14000b16120?}, {0x14000b16138, 0x13})
   github.com/gravitational/teleport/lib/teleterm/clusters/storage.go:164 +0x1a0
github.com/gravitational/teleport/lib/teleterm/clusters.(*Storage).Add(0x14000c89260, {0x103ac0c10, 0x1400095c840}, {0x14000b16138, 0x13})
   github.com/gravitational/teleport/lib/teleterm/clusters/storage.go:122 +0xe8
github.com/gravitational/teleport/lib/teleterm/daemon.(*Service).AddCluster(0x140006af8d8?, {0x103ac0c10?, 0x1400095c840?}, {0x14000b16138?, 0x5000?})
   github.com/gravitational/teleport/lib/teleterm/daemon/daemon.go:99 +0x30
github.com/gravitational/teleport/lib/teleterm/apiserver/handler.(*Handler).AddCluster(0x15?, {0x103ac0c10?, 0x1400095c840?}, 0x100c9600c?)
   github.com/gravitational/teleport/lib/teleterm/apiserver/handler/handler_clusters.go:60 +0x34
github.com/gravitational/teleport/lib/teleterm/api/protogen/golang/v1._TerminalService_AddCluster_Handler.func1({0x103ac0c10, 0x1400095c840}, {0x10379fee0?, 0x14000b18180})
   github.com/gravitational/teleport/lib/teleterm/api/protogen/golang/v1/service_grpc.pb.go:1025 +0x74
github.com/gravitational/teleport/lib/teleterm/apiserver.withErrorHandling.func1({0x103ac0c10?, 0x1400095c840?}, {0x10379fee0?, 0x14000b18180?}, 0x140002c59f8?, 0x1011c2254?)
   github.com/gravitational/teleport/lib/teleterm/apiserver/middleware.go:36 +0x44
github.com/gravitational/teleport/lib/teleterm/api/protogen/golang/v1._TerminalService_AddCluster_Handler({0x103a111e0?, 0x14000881400}, {0x103ac0c10, 0x1400095c840}, 0x1400097e000, 0x140008904b0)
   github.com/gravitational/teleport/lib/teleterm/api/protogen/golang/v1/service_grpc.pb.go:1027 +0x138
google.golang.org/grpc.(*Server).processUnaryRPC(0x14000a62000, {0x103acc740, 0x140002e4000}, 0x14000926240, 0x14000175200, 0x10530f670, 0x0)
   google.golang.org/grpc@v1.51.0/server.go:1340 +0xb7c
google.golang.org/grpc.(*Server).handleStream(0x14000a62000, {0x103acc740, 0x140002e4000}, 0x14000926240, 0x0)
   google.golang.org/grpc@v1.51.0/server.go:1713 +0x82c
google.golang.org/grpc.(*Server).serveStreams.func1.2()
   google.golang.org/grpc@v1.51.0/server.go:965 +0x84
created by google.golang.org/grpc.(*Server).serveStreams.func1
   google.golang.org/grpc@v1.51.0/server.go:963 +0x290

This PR fixes this by making sure that ClientStore is set up before calling SaveProfile.

The existing integration tests didn't catch this issue because they largely skip the usual login procedure in favor of using test helpers to generate valid certs. The test user is created with a random password which we can't access. I added another integration test which actually catches the issue with missing ClientStore. Adding the test turned out to be harder than expected, I created a separate issue for that (#20273) to tackle that in the coming days and clean up this part of the codebase as well.


Unlike in tsh, logging in to a new cluster is Connect is broken into two steps. First you provide the URL of a proxy. Connect then pings the proxy and saves the profile to the home dir (clusters.Storage.addCluster). At this point, the cluster is added to the list of clusters in the app but the user is not logged in yet – the actual login process happens in the next step.

This is a remnant of times when logging out of a cluster and removing the cluster profile were two separate steps.

It keeps trying to Connect to proxy under port 3080 for some reason
cfg.WebProxyAddr = webProxyAddress
cfg.HomePath = s.Dir
cfg.KeysDir = s.Dir
cfg.ClientStore = client.NewFSClientStore(s.Dir)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I see, client.NewClient(cfg) returns the configured ClientStore, so we could have

cfg.ClientStore = clusterClient.ClientStore

Wouldn't it be better to use it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clusterClient doesn't exist at this point though. Ideally we wouldn't call SaveProfile on Config in the first place, I'm not sure why we're doing that – I left a comment to investigate this in the future.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but I mean yeah, we could first create Config, then TeleportClient, then set the ClientStore on Config to that of TeleportClient.

It's a bit convoluted though IMHO. In case of ClientStore, the choice is between an in-memory implementation or an implementation that writes to disk. The whole reason we call SaveProfile is to save that to disk so that the subsequent calls to tshd see that cluster too. So I don't think it's bad that we're duplicating the logic a little bit here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I realized that it would create some sort of a circlar dependency between these functions :/

@ravicious
Copy link
Copy Markdown
Member Author

The integration test fails because the cluster set up in tests reports "localhost" as SSH.PublicAddr in the ping response which we then attempt to use as WebProxyAddr on TeleportClient.

Since there's no port specified, we default to 3080. I need to figure out why it's reported as "localhost" and not the actual address.

@ravicious
Copy link
Copy Markdown
Member Author

It might take me a while to get back to fixing the integration test, so for now I've removed everything but the immediate fix.

If I don't manage to finish the integration test within an hour, I'll come back to it tomorrow or on Thursday, but I'd like to merge the fix in the meantime to fix broken master.

@ravicious
Copy link
Copy Markdown
Member Author

@xacrimon @probakowski Ping, Connect is broken on master and I'd like to fix it with this one line change.

@ravicious ravicious enabled auto-merge (squash) January 17, 2023 16:00
@ravicious ravicious merged commit 613bc9a into master Jan 17, 2023
@ravicious ravicious deleted the ravicious/fix-adding-cluster branch January 17, 2023 16:48
ravicious added a commit that referenced this pull request Jan 18, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
ravicious added a commit that referenced this pull request Jan 18, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
ravicious added a commit that referenced this pull request Jan 20, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
github-actions Bot pushed a commit that referenced this pull request Jan 20, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
github-actions Bot pushed a commit that referenced this pull request Jan 20, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
ravicious added a commit that referenced this pull request Jan 20, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
ravicious added a commit that referenced this pull request Jan 23, 2023
i.Web is also used as tconf.Proxy.WebAddr.Addr a couple of lines below.

When pinging the proxy, TeleportClient takes the response and updates a
couple of values based on that response. If proxySettings.SSH.PublicAddr
is not empty, it tries to parse the address and then set it as
tc.WebProxyAddr. [1]

If it cannot parse the port number, it uses the default (3080).

What is getting returned as `proxySettings.SSH.PublicAddr`? That's determined
by setProxyPublicAddressesSettings in lib/service/proxy_settings.go. [2]
It uses the first element from Proxy.PublicAddrs.

Now, the integration test helpers set up the Teleport instance in such
way that the first Proxy.PublicAddrs is set to i.Hostname which resolves
to just "localhost" in tests. So if any test pings the proxy first and then
tries to make another request with the same TeleportClient, the subsequent
request will try to reach out to localhost:3080. This happened to me when
trying to add a new test. [3]

This PR fixes this by making sure that the first element of Proxy.PublicAddrs
actually points at the address of the web UI.

See also the message from Marek about backwards compatibility of
SSHProxyHostPort. [4] [5]

[1] https://github.com/gravitational/teleport/blob/885d7397ab3746154342712bec86bcdb3ea93eab/lib/client/api.go#L3666-L3673
[2] https://github.com/gravitational/teleport/blob/806a568ada7c640a64eb60f718e7d56be34049ad/lib/service/proxy_settings.go#L112
[3] #20263
[4] https://gravitational.slack.com/archives/C0DF0TPMY/p1673895327794379?thread_ts=1673891288.249809&cid=C0DF0TPMY
[5] https://github.com/gravitational/teleport/blob/db7fdff8097bcc883af8d9dde6a271d07e418550/api/client/webclient/webclient.go#L490
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants