Add HTTP proxy support for WebSocket leaf node connections#7242
Add HTTP proxy support for WebSocket leaf node connections#7242neilalexander merged 1 commit intonats-io:mainfrom
Conversation
kozlovic
left a comment
There was a problem hiding this comment.
Thank you for the contribution. However, this PR cannot be accepted as-is. We would need configuration file parsing, options validation and some sort of test. But more importantly, I have a concern about how this would work. Suppose that you configure the server to have a remote with a non TLS connection, but the proxy uses TLS to communicate with the HUB that requires TLS, then this would not work because the INFO protocol received from the HUB would indicate that TLS is required and the leafnode side of things would not be able to handle that. The same if situation is reversed (leaf wants TLS but proxy connects to HUB without).
I have made changes based on your feedback
|
|
@kozlovic I have addressed your feedback ready for another review, Thanks Dan |
|
@kozlovic thanks for your review. I have made another round of changes ready for another review. Thanks Dan |
|
@kozlovic thanks for review. I have made changes suggested ready for review. Thanks Dan |
kozlovic
left a comment
There was a problem hiding this comment.
I have enabled the workflow run and the linter/checks spotted few things. I would recommend that you rebase and squash all your commits into 1 and you need to have a signoff (see https://github.com/nats-io/nats-server/blob/main/CONTRIBUTING.md)
70f4c3e to
a2b4599
Compare
@kozlovic I have made changes and squashed into one signed commit. Thanks Dan |
a2b4599 to
bdec980
Compare
|
@kozlovic I noticed an issue with my squash commit. Now corrected. Thanks Dan |
kozlovic
left a comment
There was a problem hiding this comment.
Sorry about the requested changes again. The main issue is that we need to ensure that the check for the proxy response is not consuming data from the remote server, which we do right now which would cause some of the tests to fail.
We need to make those tests robust in that you should delay the closing of the connection as to not cause the read to fail, but also set some deadlines so that if there is an issue we don't block there forever.
Naming your tests with the same prefix would make it easier to run them all (with -count=10 -failfast or something like that to detect flappers). I would recommend that you run them locally with -count more than 1 to see if you catch flappers like I did.
Thank you for your patience!
|
@danbailey1000 Also, have you tried it with a real http proxy and a NATS Server hub that requires TLS? I wonder if it will even work since most TLS actors will fail if the remote is sending plain text first, which by default NATS Server does (sends the INFO protocol in plain text when getting a TCP connection). We did add the ability to do "TLS first" (see |
|
usually when putting a proxy in front of nats it should work by using the TLS first / This setup works for client connections 4222 at least. |
@kozlovic I have added some tests that uses real nats hub and leaf servers configured via proxy and using TLS and using the tlshandshakefirst and confirmed it is working as expected. |
kozlovic
left a comment
There was a problem hiding this comment.
That went too far ;-) By real tests, I meant that you have an actual HTTP proxy and you verified that it was working as expected. Please undo some of the changes that you've made in these last commits (remove all added tests files and the mockup server) and I will review again. Note that the use of proxyTunnelConn is similar to our tlsMixConn. Maybe we could consolidate or you could use tlsMixConn as-is.
Running your tests I got:
=== RUN TestLeafNodeHttpProxyWithoutTLSHandshakeFirstFailure
==================
WARNING: DATA RACE
Read at 0x00c0007e3383 by goroutine 1762:
testing.(*common).logDepth()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1053 +0x8c
testing.(*common).log()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1046 +0x80
testing.(*common).Logf()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1097 +0x58
github.com/nats-io/nats-server/v2/server.(*mockTLSFirstNATSServer).handleTLSConnection()
/Users/ik/dev/go/src/github.com/nats-io/nats-server/server/leafnode_proxy_test.go:1129 +0x3a8
github.com/nats-io/nats-server/v2/server.(*mockTLSFirstNATSServer).handleConnection()
/Users/ik/dev/go/src/github.com/nats-io/nats-server/server/leafnode_proxy_test.go:1099 +0x208
github.com/nats-io/nats-server/v2/server.(*mockTLSFirstNATSServer).acceptConnections.gowrap1()
/Users/ik/dev/go/src/github.com/nats-io/nats-server/server/leafnode_proxy_test.go:1074 +0x58
Previous write at 0x00c0007e3383 by goroutine 1755:
testing.tRunner.func1()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1779 +0x5dc
runtime.deferreturn()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/runtime/panic.go:610 +0x5c
testing.(*T).Run.gowrap1()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1851 +0x40
Goroutine 1762 (running) created at:
github.com/nats-io/nats-server/v2/server.(*mockTLSFirstNATSServer).acceptConnections()
/Users/ik/dev/go/src/github.com/nats-io/nats-server/server/leafnode_proxy_test.go:1074 +0x34
github.com/nats-io/nats-server/v2/server.createMockTLSFirstNATSServer.gowrap1()
/Users/ik/dev/go/src/github.com/nats-io/nats-server/server/leafnode_proxy_test.go:1059 +0x40
Goroutine 1755 (finished) created at:
testing.(*T).Run()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1851 +0x684
testing.runTests.func1()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:2279 +0x7c
testing.tRunner()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:1792 +0x180
testing.runTests()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:2277 +0x77c
testing.(*M).Run()
/Users/ik/dev/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.7.darwin-arm64/src/testing/testing.go:2142 +0xb68
github.com/nats-io/nats-server/v2/server.TestMain()
/Users/ik/dev/go/src/github.com/nats-io/nats-server/server/sublist_test.go:1607 +0x374
main.main()
_testmain.go:5533 +0x114
==================
=== RUN TestLeafNodeHttpProxyWithoutTLSHandshakeFirstFailure/connection_fails_without_TLS_handshake_first
=== NAME TestLeafNodeHttpProxyWithoutTLSHandshakeFirstFailure
testing.go:1490: race detected during execution of test
=== NAME TestLeafNodeHttpProxyWithoutTLSHandshakeFirstFailure/connection_fails_without_TLS_handshake_first
leafnode_proxy_test.go:1031: ✅ Connection correctly failed without TLSHandshakeFirst option
--- FAIL: TestLeafNodeHttpProxyWithoutTLSHandshakeFirstFailure (0.00s)
--- PASS: TestLeafNodeHttpProxyWithoutTLSHandshakeFirstFailure/connection_fails_without_TLS_handshake_first (0.00s)
Not asking to fix, instead remove those tests...
|
@kozlovic tests now removed. Thx
|
kozlovic
left a comment
There was a problem hiding this comment.
Please review and apply the proposed fix for the tests. Also, you would need to squash all your commits (and possibly rebase to take advantage of the latest changes in main - if any) because each commit needs to be signed off. If I were to trigger the CI tests, it would fail because of that. Thanks again and appreciate your patience.
64aefb4 to
d89e2cf
Compare
@kozlovic Patch now applied (thanks for that :)). Squashed changes into one commit. |
kozlovic
left a comment
There was a problem hiding this comment.
LGTM. Thank you for your contribution!
|
Will need to hold merging for now as we're in a feature freeze ahead of 2.12. Will come back to this afterwards. Thanks! |
|
@danbailey1000 I have made some small changes to your PR and tested using a proxy with a regular tcp nats connection which was successful. AFAIK an HTTP proxy which implements the CONNECT method should not care whether the protocol on top is HTTP or raw TCP (nats). PR is up here: https://github.com/danbailey1000/nats-server/pull/1/files |
@acormier-maia thanks a lot. Ideally we can also include the tcp support. This PR has been approved already so I am not sure the best approach now. Maybe wait until this PR is merged, and then we can open another PR to add the tcp support. @kozlovic what would you recommend? thx |
|
@danbailey1000 Yes, maybe wait for it to be merged and then another PR can be submitted. |
|
Would you mind rebasing on top of latest |
d89e2cf to
b45b35f
Compare
@neilalexander now done. Thanks |
| const sharedSysAccDelay = 250 * time.Millisecond | ||
|
|
||
| // establishHTTPProxyTunnel establishes an HTTP CONNECT tunnel through a proxy server | ||
| func establishHTTPProxyTunnel(proxyURL, targetHost string, timeout time.Duration, username, password string) (net.Conn, error) { |
There was a problem hiding this comment.
I think I would really rather us not build and parse HTTP requests/responses by hand here when the Go standard library has the ability to do this for us, i.e. something like this:
conn, err := natsDialTimeout("tcp", proxyAddr.Host, timeout)
if err != nil {
return nil, fmt.Errorf("failed to connect to proxy: %w", err)
}
req := &http.Request{
Method: http.MethodConnect,
URL: &url.URL{Opaque: targetHost}, // Opaque is required here
Host: targetHost,
Header: make(http.Header),
}
// do we need auth here?
req.Header.Set("Proxy-Authorization", "Basic " + ...)
if err := req.Write(conn); err != nil {
conn.Close()
return nil, fmt.Errorf("req.Write: %w", err)
}
resp, err := http.ReadResponse(bufio.NewReader(conn), req)
if err != nil {
conn.Close()
return nil, fmt.Errorf("http.ReadResponse: %w", err)
}
if resp.StatusCode != http.StatusOK {
conn.Close()
return nil, fmt.Errorf("status code: %s", resp.Status)
}
// now use "conn"
We already have net/http imported so this is fine to use.
There was a problem hiding this comment.
@neilalexander I have changed this to now use the standard http library. Thx
Enables NATS leaf node servers to connect through corporate HTTP proxies to NATS Hub servers. Configuration options: - proxy_url: HTTP proxy server URL (http/https schemes only) - username/password: Optional proxy authentication credentials - Validates proxy compatibility with WebSocket URLs and TLS requirements Signed-off-by: Dan Bailey <danbailey1000@msn.com>
4a80d45 to
201f8f8
Compare
Saw the test TestConfigCheck flap. The reason is that the proxy validation was done when parsing the "proxy" field, but there was no guarantee that this field would be parsed after "urls", which may lead to validation being successful (because there was no URLs added yet). Move the check after all fields have been parsed for a given remote. Relates to #7242 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Saw the test TestConfigCheck flap. The reason is that the proxy validation was done when parsing the "proxy" field, but there was no guarantee that this field would be parsed after "urls", which may lead to validation being successful (because there was no URLs added yet). Move the check after all fields have been parsed for a given remote. Relates to #7242 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Saw the test TestConfigCheck flap. The reason is that the proxy validation was done when parsing the "proxy" field, but there was no guarantee that this field would be parsed after "urls", which may lead to validation being successful (because there was no URLs added yet). Move the check after all fields have been parsed for a given remote. Relates to #7242 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Changes
RemoteLeafOptsinserver/opts.goserver/leafnode.goUse Case
Enables WebSocket leaf node connections in corporate environments where direct connections are blocked by firewalls, but HTTP proxy access is available.
Configuration Example