Fix panic in app dialing#42719
Conversation
#35501 incorrectly checked the length on the local servers variable instead of `t.c.servers` which could lead to panics like the one below. ```bash panic: runtime error: slice bounds out of range [1:0] goroutine 6558252 [running]: github.com/gravitational/teleport/lib/web/app.(*transport).DialContext(0xc00296d5f0, {0xae8bbd8, 0xc002596a20}, {0x42c525?, 0xc001dc4d80?}, {0x120?, 0x118?}) github.com/gravitational/teleport/lib/web/app/transport.go:264 +0x5dc net/http.(*Transport).dial(0xc002596a20?, {0xae8bbd8?, 0xc002596a20?}, {0x92ff0a2?, 0x4e4fc13?}, {0xc000b8dfc0?, 0xc00226f000?}) net/http/transport.go:1183 +0xd2 net/http.(*Transport).dialConn(0xc002f9cb40, {0xae8bbd8, 0xc002596a20}, {{}, 0x0, {0x9301a85, 0x5}, {0xc000b8dfc0, 0x1a}, 0x0}) net/http/transport.go:1625 +0x7e8 net/http.(*Transport).dialConnFor(0xae8bc10?, 0xc003340370) net/http/transport.go:1467 +0x9f created by net/http.(*Transport).queueForDial in goroutine 6562349 net/http/transport.go:1436 +0x3cb ```
| // eliminate any servers from the head of the list that were unreachable | ||
| t.mu.Lock() |
There was a problem hiding this comment.
What if some other concurrent call to DialContext already eliminated some servers from the head of t.c.servers? Is there some external synchronization to prevent that?
There was a problem hiding this comment.
I think that's likely a problem this code would run into with or without this change though. I could add some check that if the length of the two servers slices didn't match to just abort modifying things this time around.
There was a problem hiding this comment.
I can't tell if t.c.servers is ever externally modified; if not, I guess that a length check would work?
Is t.c.servers ordered in some way? Does it have some uniqueness guarantee?
There was a problem hiding this comment.
Each session populates t.c.servers a single time via https://github.com/gravitational/teleport/blob/master/lib/web/app/match.go#L47-L65. Any matching servers for the session are randomly shuffled. The only place t.c.servers should be modified after the session has been created is when removing them here.
|
@rosstimothy See the table below for backport results.
|
#35501 incorrectly checked the length on the local servers variable instead of
t.c.serverswhich could lead to panics like the one below.changelog: Prevent a panic in the Proxy when accessing an offline application