High Memory usage per TCP connection #16

Closed
SkyperTHC opened this issue May 23, 2023 · 11 comments
@SkyperTHC

Wiretap seems to use around 14 MBytes of memory (RSS) for each new TCP connection. That's without kernel memory and without TCP buffers (which reside inside the kernel, not userland).

The problem is that this causes wiretap to fail (and exit, or be killed by the OOM killer).

The problem can be reproduced by establishing 88k TCP connections (sending 88k TCP SYNs) while wiretap is running on a Linux system with 2 GBytes of RAM.

Killed process 125774 (wiretap_linux_a) total-vm:2368460kB, anon-rss:1198968kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:3356kB oom_score_adj:0

It seems odd that the userland wiretap process allocates 14 MBytes of memory before the TCP connection has exchanged any data.

The desirable solution would be either of these two:

  1. Reduce the memory requirement of wiretap. Not much memory needs to be allocated until the SYN-ACK is received (i.e. until wiretap's connect(2) completes).
  2. When memory allocation fails, make wiretap fail the connection (send RST/FIN upstream) instead of dying: either RST/FIN the failed connection, or start freeing outstanding (not yet completed) connections - starting with the oldest - to make memory available for the most recent connection (see the sketch below).
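
A minimal sketch of option 2, assuming a cap on outstanding connections; the names (conn, sendRST, admit, maxPending) are illustrative only, not wiretap's actual API:

    package sketch

    import "container/list"

    type conn struct{ /* handshake state for a not-yet-completed connection */ }

    func sendRST(c *conn) { /* placeholder: send RST/FIN upstream for this connection */ }

    // admit enforces a cap on outstanding connections by resetting the oldest
    // ones first, so a flood of SYNs degrades gracefully instead of triggering
    // the OOM killer.
    func admit(c *conn, pending *list.List, maxPending int) {
        for pending.Len() >= maxPending {
            oldest := pending.Front()
            sendRST(oldest.Value.(*conn))
            pending.Remove(oldest)
        }
        pending.PushBack(c) // newest connection goes to the back of the queue
    }
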
@luker983 luker983 self-assigned this Jun 1, 2023
@luker983
Collaborator

luker983 commented Jun 2, 2023

Can you please share the command you used to make the 88k TCP connections? Was it nmap/masscan? Command and args would be helpful, thanks!

Edit: It would also be helpful to know if you're running the serve command as a privileged user.

@SkyperTHC
Author

I'm using ssh [email protected] (password is 'segfault') as the origin server and start WireGuard by executing:

curl rpc/net/up

The Exit Node is x86_64. I'm running wiretap v0.2.0 as root (change the keys etc):

export TYPE=wiretap
X='1-uEjdpgvP6JF89Bw/FDjWxwBXP4maqLL7CAEo2lqyAWg='
X+='-r8RaGtgruh+gfzQGqjuD7oQhpoP7dYzBpMpDhro2BHY='
X+='-3.126.148.157:33050'
X="$X" bash -c "$(curl -fsSL thc.org/sfwg)"

(effectively it starts wiretap with --allowed 192.168.0.0/29,fd::0/125)

On the origin server (segfault) I scan a 'black hole' of the Internet by executing:

masscan --interface wgExit --source-ip 192.168.0.3 --rate 10000 --banners -p- 30.31.32.0/24

I then check the RSS of wiretap and the number of TCP sockets on the Exit Node and compare the delta. The OOM killer kicks in after a few seconds (depending on host memory).

@luker983
Collaborator

luker983 commented Jun 8, 2023

Profiling found that this is the most likely culprit:

[screenshot: profiling output]

https://github.com/google/gvisor/blob/7a92412c08fb1e32c68ebf036831b4175fad9d16/pkg/tcpip/stack/conntrack.go#L507

The unestablished connection timeout is 120s, so memory usage grows quickly for 2 minutes before leveling off.
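
For context (paraphrasing the linked conntrack.go from memory; exact names and values should be checked against the pinned commit), the relevant timeouts look roughly like this:

    package stack // excerpt from gvisor/pkg/tcpip/stack/conntrack.go, approximate

    import "time"

    const (
        establishedTimeout   = 5 * 24 * time.Hour
        unestablishedTimeout = 120 * time.Second // half-open entries are kept for 2 minutes
    )

At masscan's --rate 10000, on the order of 1.2 million half-open entries can accumulate during those 120 seconds before any are reaped, so even a small per-entry allocation adds up to gigabytes.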

Usage on my machine is peaking here:

[screenshot: memory usage graph]

I thought we might be able to inject a fake RST into the endpoint so the connection would be removed from the connection map, but it looks like the unestablished timeout must elapse before any connections are reaped, even if the connection has closed.

@SkyperTHC
Author

Super research.

Is removing gVisor an option? gVisor is falling apart left, right, and centre and seems to be the culprit of a lot of pain (including #18).

The forwarding connection object (WT TCP to target) could record "Origin-2-WT" [syn/ack/src-ip/src-port]. The whole thing can be done in 14 bytes per forwarding TCP connection.
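
A minimal sketch of such a record in Go (field layout is my own illustration, assuming IPv4; wiretap's actual connection state is more involved):

    package sketch

    // originConn is a hypothetical per-connection record: just enough state to
    // answer or reset the origin's half-open connection later, without keeping
    // a full gVisor endpoint around.
    type originConn struct {
        SrcIP   [4]byte // origin source address
        SrcPort uint16  // origin source port
        Seq     uint32  // sequence number from the origin's SYN
        Ack     uint32  // acknowledgement number for the SYN-ACK
    }
    // 4 + 2 + 4 + 4 = 14 bytes of state per forwarded TCP connection
    // (Go pads the struct to 16 bytes in memory).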

@luker983
Collaborator

luker983 commented Jun 9, 2023

Getting away from gVisor entirely is probably not going to happen. It would be a ton of work to reimplement the features that Wiretap needs. However, we may be able to move away from gVisor's DNAT rule in favor of some other transparent proxy mechanism. Something that has been on my todo list for some time is looking at how Tailscale is handling their "Subnet Routing" feature that is very similar to what Wiretap does: https://github.com/tailscale/tailscale/blob/62130e6b68f629ecf41176330eb70dfc7c9d58e2/wgengine/netstack/netstack.go
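
For what it's worth, the mechanism Tailscale uses there is gVisor's TCP forwarder rather than an iptables-style DNAT rule. A rough sketch, paraphrasing the gVisor API (ipstack and proxy are placeholders, not wiretap code):

    package sketch

    import (
        "net"

        "gvisor.dev/gvisor/pkg/tcpip/adapters/gonet"
        "gvisor.dev/gvisor/pkg/tcpip/stack"
        "gvisor.dev/gvisor/pkg/tcpip/transport/tcp"
        "gvisor.dev/gvisor/pkg/waiter"
    )

    // registerForwarder hands every inbound TCP flow to proxy(), which dials the
    // real destination and splices the two connections together. No conntrack
    // or DNAT entries are created, so nothing lingers for the 120s timeout.
    func registerForwarder(ipstack *stack.Stack, proxy func(net.Conn)) {
        fwd := tcp.NewForwarder(ipstack, 0, 1024, func(r *tcp.ForwarderRequest) {
            var wq waiter.Queue
            ep, err := r.CreateEndpoint(&wq)
            if err != nil {
                r.Complete(true) // reply with RST
                return
            }
            r.Complete(false)
            go proxy(gonet.NewTCPConn(&wq, ep))
        })
        ipstack.SetTransportProtocolHandler(tcp.ProtocolNumber, fwd.HandlePacket)
    }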

@luker983 luker983 added the enhancement New feature or request label Jun 26, 2023
@luker983
Collaborator

luker983 commented Jul 3, 2023

When iptables isn't used, connections aren't tracked. This branch doesn't use iptables, so memory usage should be greatly reduced when running a scan: dnat-to-forwarder

@SkyperTHC
Author

Good work. The command to try is:

masscan --interface wgExit --source-ip 172.16.0.3  --rate 10000 --banners -p- 30.31.32.0/24

The OOM killer still kills it, but only after 60 seconds (>3 GB RSS) rather than 2-3 seconds.

@luker983
Collaborator

luker983 commented Jul 6, 2023

You can try playing with GOGC; values less than 100 make garbage collection run more frequently: GOGC=10 ./wiretap (https://pkg.go.dev/runtime).
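
For reference, the same tuning can be done in code via the standard runtime/debug package (a sketch; wiretap doesn't currently expose this):

    package sketch

    import "runtime/debug"

    func init() {
        // Same effect as running with GOGC=10: trigger a GC cycle once the heap
        // grows 10% past the live set, trading CPU time for a lower peak RSS.
        debug.SetGCPercent(10)
    }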

High memory usage is a known issue: tailscale/tailscale#7272 and WireGuard/wireguard-go#69.

EDIT: Forgot to mention that --conn-timeout greatly impacts memory usage for scanning. Lower this value to reduce memory usage.

@SkyperTHC
Author

SkyperTHC commented Jul 7, 2023

Testing results: 4 GB server, running masscan -e wgExit --adapter-ip 172.16.0.3-172.16.128.2 --adapter-port 1024-33791 --rate 10000 --banners -p- 30.31.32.0/24

unset GOGC
time ./wiretap serve -q --conn-timeout 1000 --allowed 172.16.0.0/16

 Killed

real    6m11.006s
user    7m0.527s
sys     3m18.973s
export GOGC=10
time ./wiretap serve -q --conn-timeout 1000 --allowed 172.16.0.0/16

 Killed

real    1m43.412s
user    2m4.709s
sys     0m59.963s

Odd. Less memory is used when not setting GOGC.

I'm unclear why it would consume 3.4 GB of memory in the first place when there are only 10,000 connections (10,000 new connections every second, and conn-timeout is set to 1 second). It feels like it is leaking somewhere. Memory consumption should not steadily increase given these parameters.

@luker983
Collaborator

I haven't been able to replicate this 🙁

With --conn-timeout 1000 the scan plateaus at around 512 MiB. I've tried changing my network conditions (increasing latency between origin/exit, etc.), but results seem to be consistent.

I think having you profile might help track this down. Can you please pull the profiling branch and run go tool pprof -http=localhost:8080 http://<server-ip>:6060/debug/pprof/heap after a couple of minutes of running the scan and report back?
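
(For context, the profiling branch presumably exposes the standard net/http/pprof endpoint on :6060, something along these lines; a sketch, not wiretap's actual code:)

    package sketch

    import (
        "net/http"
        _ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
    )

    func init() {
        // Expose heap/CPU profiles so `go tool pprof` can fetch them remotely.
        go http.ListenAndServe("0.0.0.0:6060", nil)
    }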

@luker983
Collaborator

Couldn't replicate the latest OOM, so closing. If this is still an issue, feel free to reopen.
