Skip to content

fix: data race in vnet.TestSSH#55980

Merged
nklaassen merged 2 commits intomasterfrom
nklaassen/fix-testssh-race
Jun 22, 2025
Merged

fix: data race in vnet.TestSSH#55980
nklaassen merged 2 commits intomasterfrom
nklaassen/fix-testssh-race

Conversation

@nklaassen
Copy link
Copy Markdown
Contributor

Fixes #55979

Flaky test detector caught a data race in VNet's TestSSH

==================
WARNING: DATA RACE
Write at 0x00c000f92870 by goroutine 393:
  github.com/gravitational/teleport/lib/vnet.(*sshAgent).setSessionKey()
      /__w/teleport.e/teleport.e/lib/vnet/ssh_agent.go:55 +0x99c
  github.com/gravitational/teleport/lib/vnet.(*sshProvider).sessionSSHConfig()
      /__w/teleport.e/teleport.e/lib/vnet/ssh_provider.go:219 +0x8be
  github.com/gravitational/teleport/lib/vnet.(*clientApplicationServiceClient).SessionSSHConfig()
      /__w/teleport.e/teleport.e/lib/vnet/client_application_service_client.go:193 +0x2f9
...

Previous read at 0x00c000f92870 by goroutine 405:
  github.com/gravitational/teleport/lib/vnet.(*sshAgent).List()
      /__w/teleport.e/teleport.e/lib/vnet/ssh_agent.go:62 +0x84
  github.com/gravitational/teleport/lib/vnet.(*forwardedAgents).forwarded()
      /__w/teleport.e/teleport.e/lib/vnet/vnet_test.go:1794 +0x168
  github.com/gravitational/teleport/lib/vnet.mustStartFakeWebProxy.func2.2()
      /__w/teleport.e/teleport.e/lib/vnet/vnet_test.go:1687 +0xd6

The changes in ssh_agent.go here fix the issue by adding a mutex around access to sshAgent.signer. The mutex isn't strictly necessary in real product code because the signer is always set before it could be read, but the test keeps a collection of sshAgents and may read them at any time, so the added mutex should resolve it.

The changes in vnet_test.go don't directly address the data race, but they were necessary for me to get TestSSH to pass at all under the race detector on my local PC. The race detector makes gVisor code extremely slow. The difference is the test no longer sets a 100ms timeout on gVisor DNS lookups and TCP dials that are required to succeed for the test to pass.

@nklaassen nklaassen added the no-changelog Indicates that a PR does not require a changelog entry label Jun 22, 2025
@nklaassen nklaassen added this pull request to the merge queue Jun 22, 2025
Merged via the queue into master with commit c0ee645 Jun 22, 2025
40 checks passed
@nklaassen nklaassen deleted the nklaassen/fix-testssh-race branch June 22, 2025 18:44
nklaassen added a commit that referenced this pull request Jun 23, 2025
nklaassen added a commit that referenced this pull request Jun 23, 2025
github-merge-queue Bot pushed a commit that referenced this pull request Jul 22, 2025
* [v17][vnet] feat: TCP dial to SSH targets

Backport #55087 to branch/v17

* [v17][vnet] feat: accept incoming SSH connections

Backport #55155 to branch/v17

* [v17][vnet] feat: forward SSH connections to target

Backport #55156 to branch/v17

* [v17][vnet] feat: write VNet SSH keys to TELEPORT_HOME

Backport #55228 to branch/v17

* [v17][vnet] feat: write OpenSSH-compatible config file for VNet SSH

Backport #55239 to branch/v17

* [v17][vnet] fix: support <hostname>.<leaf-cluster> for VNet SSH

Backport #55688 to branch/v17

* fix BlockUntil API for backport

* [v17][vnet] feat: add "Connect with VNet" button to SSH servers

Backport #55623 to branch/v17

* [v17][vnet] feat: support VNet SSH when cluster name does not match proxy public addr

Backport #55655 to branch/v17

* [v17][vnet] feat: add SSH configuration diagnostic

Backport #55594 to branch/v17

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>

* [v17][vnet] feat: show SSH status in VNet slider

Backport #55755 to branch/v17

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>

* [v17][vnet] feat: support proxy recording mode with VNet SSH

Backport #55788 to branch/v17

* [v17][vnet] feat: support diag checks on windows

Backport #55856 to branch/v17

* [v17] fix: data race in vnet.TestSSH

Backport #55980 to branch/v17

* [v17][vnet] feat: mention SSH on VNet info page

Backport #55973 to branch/v17

* [v17][vnet] feat: serve DNS on IPv4

Backport #55539 to branch/v17

* [v17][vnet] fix: close proxied channel only after data and requests are complete

Backport #56020 to branch/v17

* [v17][vnet] feat: automatic SSH client configuration

Backport #55923 to branch/v17

* VNet docs: Provide clear instructions for getting debug logs (#56068)

* VNet diag notification: Do not show button to open report if there's no workspace selected (#56067)

* VNet diag report: Don't show button in notification if there's no workspace

* Replace deprecated MutableRefObject with RefObject

* Make openReport not depend on value of rootClusterUri

Otherwise the effect that uses setInterval re-runs whenever the user
switches to another workspace.

* [v17][vnet] feat: automatic SSH client configuration in Connect

Backport #55924 to branch/v17

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>
Co-authored-by: Grzegorz Zdunek <grzegorz.zdunek@goteleport.com>

* [v17][vnet] fix: avoid empty host matchers in generated SSH config

Backport #56103 to branch/v17

* avoid t.Context() pre go1.24

* fix cspell lint

* [v17][docs] VNet SSH

Backport #56147 to branch/v17

* [v17][vnet] feat: SSH usage reporting

Backport #56537 to branch/v17

* [v17][vnet] fix: mask default IP route on windows

Backport #56957 to branch/v17

---------

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>
Co-authored-by: Grzegorz Zdunek <grzegorz.zdunek@goteleport.com>
github-merge-queue Bot pushed a commit that referenced this pull request Jul 22, 2025
* [v18][vnet] feat: TCP dial to SSH targets

Backport #55087 to branch/v18

* [v18][vnet] feat: accept incoming SSH connections

Backport #55155 to branch/v18

* [v18][vnet] feat: forward SSH connections to target

Backport #55156 to branch/v18

* [v18][vnet] feat: write VNet SSH keys to TELEPORT_HOME

Backport #55228 to branch/v18

* [v18][vnet] feat: write OpenSSH-compatible config file for VNet SSH

Backport #55239 to branch/v18

* [v18][vnet] fix: support <hostname>.<leaf-cluster> for VNet SSH

Backport #55688 to branch/v18

* [v18][vnet] feat: add "Connect with VNet" button to SSH servers

Backport #55623 to branch/v18

* fix test in backport

* [v18][vnet] feat: support VNet SSH when cluster name does not match proxy public addr

Backport #55655 to branch/v18

* [v18][vnet] feat: add SSH configuration diagnostic

Backport #55594 to branch/v18

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>

* [v18][vnet] feat: show SSH status in VNet slider

Backport #55755 to branch/v18

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>

* [v18][vnet] feat: support proxy recording mode with VNet SSH

Backport #55788 to branch/v18

* [v18][vnet] feat: support diag checks on windows

Backport #55856 to branch/v18

* [v18] fix: data race in vnet.TestSSH

Backport #55980 to branch/v18

* [v18][vnet] feat: mention SSH on VNet info page

Backport #55973 to branch/v18

* [v18][vnet] feat: serve DNS on IPv4

Backport #55539 to branch/v18

* [v18][vnet] fix: close proxied channel only after data and requests are complete

Backport #56020 to branch/v18

* [v18][vnet] feat: automatic SSH client configuration

Backport #55923 to branch/v18

* VNet diag notification: Do not show button to open report if there's no workspace selected (#56067)

* VNet diag report: Don't show button in notification if there's no workspace

* Replace deprecated MutableRefObject with RefObject

* Make openReport not depend on value of rootClusterUri

Otherwise the effect that uses setInterval re-runs whenever the user
switches to another workspace.

* [v18][vnet] feat: automatic SSH client configuration in Connect

Backport #55924 to branch/v18

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>
Co-authored-by: Grzegorz Zdunek <grzegorz.zdunek@goteleport.com>

* [v18][vnet] fix: avoid empty host matchers in generated SSH config

Backport #56103 to branch/v18

* [v18][docs] VNet SSH

Backport #56147 to branch/v18

* [v18][docs] add VNet warnings

Backport #56601 to branch/v18

* [v18][vnet] feat: SSH usage reporting

Backport #56537 to branch/v18

* [v18][vnet] fix: mask default IP route on windows

Backport #56957 to branch/v18

---------

Co-authored-by: Rafał Cieślak <rafal.cieslak@goteleport.com>
Co-authored-by: Grzegorz Zdunek <grzegorz.zdunek@goteleport.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport-required no-changelog Indicates that a PR does not require a changelog entry size/sm vnet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

TestSSH flakiness

3 participants