Add support for STUN/TURN server configuration #140
I'd like to better understand the underlying issue. Generally speaking, TURN servers are used as a last resort when peer-to-peer ICE candidates are unable to confirm bidirectional flow due to restrictive firewalls on both sides. They're a way to include an offer with a real public IP that should be wide open and reachable from anywhere. However, in the end the firewall still needs to allow some kind of outbound connection. If it doesn't, you're SOL regardless of what you're doing from the server/TURN side. With WIS being on one end of the negotiation we should always be able to ensure we are at least reachable. What should happen (theoretically) is that we provide an ICE candidate with a public IP and listening port that the client can make an outbound connection to through the firewall/NAT/etc. TURN servers were born out of desperation for the peer-to-peer scenario with one or more very restrictive firewalls. They're generally frowned upon because it's not a good idea to have an additional component in a latency-, jitter-, and loss-sensitive audio path. It's also another component to manage, fail, scale, etc. I added some debugging in ts-client to show the offer from WIS:
If we look at these candidates, even though there is a WIS public IP candidate, all of the docker 172.x candidates (and SDP lines) probably shouldn't be there. It would be interesting for you to try running WIS outside of docker (where these candidates wouldn't be included) to see if that addresses the issue. If it does, we can tweak this on the WIS side to not include these candidates, doing what a lot of VoIP stuff does: add a configuration option to set the real public IP behind any NAT implementation (including docker). Alternatively, WIS could use STUN to figure this out, but that's a much heavier lift and adds the requirement that the configured STUN server is reachable, which IMO isn't great. What really sticks out here is the c= line. You should be able to confirm that when using a TURN server it's set to the public IP of the TURN server, in addition to the TURN server being included as an ICE candidate. This is one of the things we would want to tweak in the case of WIS always being reachable at a known (configured or discovered) IP.
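To make the proposed tweak concrete, here is a rough sketch of what dropping the docker candidates and fixing the c= line could look like as a post-processing step on the offer SDP. It is not how WIS or aiortc does this today; the function name `rewrite_offer_sdp` and the idea of editing the SDP text directly are assumptions for illustration, and the sample addresses are placeholders.

```python
import ipaddress

# RFC 1918 private ranges (the docker bridge 172.x network falls in 172.16.0.0/12).
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if address is an IPv4 literal inside an RFC 1918 range."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # not an IP literal (e.g. an mDNS hostname); leave it alone
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def rewrite_offer_sdp(sdp: str, public_ip: str) -> str:
    """Hypothetical helper: drop RFC 1918 candidates and advertise public_ip in c= lines."""
    kept = []
    for line in sdp.splitlines():
        if line.startswith("a=candidate:"):
            # Fields: foundation, component, transport, priority, address, port, "typ", type, ...
            fields = line.split()
            if len(fields) > 4 and is_rfc1918(fields[4]):
                continue  # a client outside the docker network can never reach this
        elif line.startswith("c=IN IP4 "):
            # Every IPv4 connection line is pointed at the configured public address.
            line = f"c=IN IP4 {public_ip}"
        kept.append(line)
    return "\r\n".join(kept) + "\r\n"
```

Server-reflexive candidates are kept even though their `raddr` field names the private address, since only the advertised connection address is checked.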
The primary goal is to support clients operating from behind a very restrictive firewall that blocks all outbound traffic except for that on TCP port 443. How might this be otherwise accomplished?
For what it's worth, WIS already uses the Google STUN server to determine its public IP. This is apparent if you run a tcpdump like: Or if you look at the aiortc source code: This is why I did not pass an empty list for ICE servers if none was configured. Additionally, if you do configure a STUN server (with this patch) and it is unreachable by the WIS server, WIS will fail to make any WebRTC connections.
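As an illustration of the "no empty list" point, a minimal sketch of turning optional settings into an ICE server list might look like the following. It is not the code from this patch: the `WIS_*` setting names and the helper `build_ice_servers` are hypothetical, and the dictionaries only mirror the usual WebRTC `urls` / `username` / `credential` fields rather than any specific aiortc type.

```python
from typing import Dict, List, Mapping, Optional


def build_ice_servers(settings: Mapping[str, str]) -> Optional[List[Dict[str, str]]]:
    """Build an ICE server list from settings, or None when nothing is configured.

    Returning None (instead of an empty list) lets the caller skip passing a
    configuration at all, so the WebRTC library's built-in default still applies.
    """
    servers: List[Dict[str, str]] = []

    stun_url = settings.get("WIS_STUN_URL", "").strip()  # hypothetical setting name
    if stun_url:
        servers.append({"urls": stun_url})

    turn_url = settings.get("WIS_TURN_URL", "").strip()  # hypothetical setting name
    if turn_url:
        entry = {"urls": turn_url}
        username = settings.get("WIS_TURN_USERNAME", "")
        credential = settings.get("WIS_TURN_CREDENTIAL", "")
        if username:
            entry["username"] = username
        if credential:
            entry["credential"] = credential
        servers.append(entry)

    return servers or None
```

A caller could pass `os.environ` as `settings`, and only construct an explicit ICE configuration when the result is not `None`.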
Well, that answers my question about how WIS knew its real public IP in the first place (which I've never really thought too much about before). The issue is that it adds it as an ICE candidate, which is kind of OK, but there are probably some other changes that could be made to reduce the number of ICE candidates. Given the docker use case, it makes little sense to include the RFC 1918 address of the docker network at all, and because that's the actual address of the network interface it's used for the c= line, which isn't ideal. Unless a client is (somehow) on the docker network this candidate will always fail anyway, and it slows down ICE because it needs to be evaluated. For most of our use cases this is pretty suboptimal even though we only do ICE once at initial load. It's another point of failure (and remote call), and for the vast majority of these WIS use cases the IP is likely static and known in advance. We could/should provide an option for a known, configured IP and, if enabled, skip STUN altogether and make all aiortc messaging act as though that's the address of the local (docker) interface. If not static/configured, we already have STUN today. Another option could be to just do STUN once at WIS startup, store the learned IP from the STUN response, and use that with the approach from the IP configuration option.
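The two options above (a configured IP that skips STUN, or a single STUN lookup at startup whose result is stored) can be sketched together. This is only an outline under assumptions: the class name `PublicAddress` is hypothetical, and the actual STUN query is left as a caller-supplied function rather than tied to any particular library.

```python
from typing import Callable, Optional


class PublicAddress:
    """Resolve the address to advertise: configured IP first, else one cached STUN lookup."""

    def __init__(self, configured_ip: Optional[str], stun_lookup: Callable[[], str]):
        self._configured_ip = configured_ip
        self._stun_lookup = stun_lookup  # hypothetical callable that performs the STUN query
        self._learned_ip: Optional[str] = None

    def get(self) -> str:
        # A configured address wins outright, so no remote call is made at all.
        if self._configured_ip:
            return self._configured_ip
        # Otherwise query STUN on first use only and reuse the stored answer afterwards.
        if self._learned_ip is None:
            self._learned_ip = self._stun_lookup()
        return self._learned_ip
```

Calling `get()` once during startup would make the STUN dependency a startup-time concern instead of a per-connection one.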
Same points as above. Generally speaking this is all good stuff but it's a very specific use case and I want to try to minimize the required changes in WIS as much as possible.
Ultimately SFUs like Janus and Mediasoup allow you to specify a broadcasted external URL for cases where you are behind a NAT and you want to inject IPs into the candidates. This seems strictly necessary if you want to cover those kinds of cases where the server is available for a direct connection but the candidates that will be generated by the usual methods won't be correct. For example, on AWS, STUN will not actually discover the right IP address. In practice, many companies simply run a TURN server on 80 and 443 for the reasons mentioned. I'm not sure if the latency cost is a real concern - the "happy path" is to just run coturn alongside the webrtc server on the same machine and then just specify the TURN server as the only iceServer. It would be interesting to know more about how much latency and overhead this incurs but in practice I haven't seen it matter, and now your system is operating in a way where the world's firewalls are all assured to be configured to let traffic pass through and so on.
I have a client that wants to use Willow for ASR in meetings but often operates behind restrictive firewalls and/or with broken IPv6 configurations. @richardklafter believes that using a TURN server might improve connection reliability and performance in this case.
Testing:
I have deployed an eturnal server, then set up firewall rules that drop all UDP traffic to my test instance of willow-inference-server. With TURN enabled, connections are able to proceed (relatively quickly). Without it, they fail.
One note: if you specify a STUN server but the STUN server does not work, connections will fail even if the TURN server works. This appears to be an aioice limitation, though maybe it has now been resolved? This warrants a bit more investigation.