Bug: NAT-PMP port forwarding not working after internal VPN restart due to unhealthy health check #1749

Closed
lollenderrofler opened this issue Jul 15, 2023 · 14 comments

Comments

@lollenderrofler

Is this urgent?

None

Host OS

Synology NAS

CPU arch

x86_64

VPN service provider

ProtonVPN

What are you using to run the container

docker-compose

What is the version of Gluetun

Running version latest built on 2023-07-09T12:26:38.469Z (commit a681d38)

What's the problem 🤔

The WireGuard VPN connection to Proton restarts from time to time due to an unhealthy health check. The VPN connection itself is restored just a few seconds later, so no real issue there, but for some reason the port forwarding does not "restart". I can only get port forwarding working again with a full container restart, which is undesirable because all connected containers lose their connection and also need to be restarted.

Share your logs

2023-07-14T18:20:17Z INFO [healthcheck] unhealthy: dialing: dial tcp4: lookup cloudflare.com on 127.0.0.1:53: read udp 127.0.0.1:48102->127.0.0.1:53: read: connection refused
2023-07-14T18:20:25Z INFO [healthcheck] program has been unhealthy for 6s: restarting VPN (see https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md)
2023-07-14T18:20:25Z INFO [vpn] stopping
2023-07-14T18:20:25Z INFO [port forwarding] stopping
2023-07-14T18:20:25Z INFO [port forwarding] removing port file /tmp/gluetun/forwarded_port
2023-07-14T18:20:25Z INFO [firewall] removing allowed port 12345...
2023-07-14T18:20:25Z INFO [vpn] starting
2023-07-14T18:20:26Z INFO [dns over tls] init module 0: validator
2023-07-14T18:20:26Z INFO [dns over tls] init module 1: iterator
2023-07-14T18:20:26Z INFO [dns over tls] start of service (unbound 1.17.1).
2023-07-14T18:20:26Z INFO [firewall] allowing VPN connection...
2023-07-14T18:20:26Z INFO [wireguard] Using userspace implementation since Kernel support does not exist
2023-07-14T18:20:26Z INFO [wireguard] Connecting to 190.2.146.180:51820
2023-07-14T18:20:26Z INFO [wireguard] Wireguard is up
2023-07-14T18:20:29Z INFO [dns over tls] generate keytag query _ta-4a5c-4f66. NULL IN
2023-07-14T18:20:29Z INFO [dns over tls] generate keytag query _ta-4a5c-4f66. NULL IN
2023-07-14T18:20:29Z INFO [dns over tls] ready
2023-07-14T18:20:29Z INFO [vpn] VPN gateway IP address: 10.2.0.1
2023-07-14T18:20:30Z INFO [ip getter] Public IP address is 190.2.146.230 (Netherlands, North Holland, Amsterdam)
2023-07-14T18:20:30Z INFO [healthcheck] healthy!
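
The symptom in the log above is that `[port forwarding]` logs "stopping" before the VPN restarts, and no `[port forwarding]` line appears after `[vpn] starting`. As a rough aid for spotting this in longer logs, here is a minimal Python sketch that checks for exactly that pattern. The function name and the regular expression are illustrative assumptions, not part of Gluetun; the sketch only relies on the `TIMESTAMP LEVEL [component] message` layout visible in the lines above.

```python
import re

# Matches lines such as:
#   2023-07-14T18:20:25Z INFO [port forwarding] stopping
# and also tolerates a docker compose prefix such as "gluetun  | ".
LOG_LINE = re.compile(
    r"^(?:\S+\s+\|\s+)?(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+"
    r"\[(?P<component>[^\]]+)\]\s+(?P<msg>.*)$"
)


def port_forwarding_silent_after_restart(log_text: str) -> bool:
    """Return True if no [port forwarding] line follows the last '[vpn] starting'."""
    seen_restart = False
    pf_after_restart = False
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # not a Gluetun log line (for example a panic trace)
        component, msg = match["component"], match["msg"]
        if component == "vpn" and msg == "starting":
            # A new VPN start resets what we have seen so far.
            seen_restart = True
            pf_after_restart = False
        elif component == "port forwarding" and seen_restart:
            pf_after_restart = True
    return seen_restart and not pf_after_restart


# Excerpt taken from the log in this report.
sample = """\
2023-07-14T18:20:25Z INFO [vpn] stopping
2023-07-14T18:20:25Z INFO [port forwarding] stopping
2023-07-14T18:20:25Z INFO [vpn] starting
2023-07-14T18:20:30Z INFO [healthcheck] healthy!
"""
print(port_forwarding_silent_after_restart(sample))  # prints True
```

The check is deliberately loose: it treats any `[port forwarding]` line after the restart as a sign of life, since the exact wording of the start-up messages is not shown in this report.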

Share your configuration

version: "3"
services:
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun_protonvpn
    cap_add:
      - NET_ADMIN
    environment:
      - VPN_SERVICE_PROVIDER=custom
      - VPN_TYPE=wireguard
      - VPN_ENDPOINT_IP=xxx
      - VPN_ENDPOINT_PORT=xxx
      - WIREGUARD_PUBLIC_KEY=xxx
      - WIREGUARD_PRIVATE_KEY=xxx
      - WIREGUARD_ADDRESSES=10.2.0.2/32
      - VPN_PORT_FORWARDING=on
      - VPN_PORT_FORWARDING_PROVIDER=protonvpn
      - FIREWALL_OUTBOUND_SUBNETS=xxx
    volumes:
      - /xxx:/gluetun  
    ports:
      xxx
    restart: unless-stopped
@lollenderrofler lollenderrofler changed the title NAT port forwarding not working after internal VPN restart due to unhealthy health check Bug: NAT-PMP port forwarding not working after internal VPN restart due to unhealthy health check Jul 17, 2023
@qdm12
Owner

qdm12 commented Sep 20, 2023

Can you please try the image (only amd64) qmcgaw/gluetun:pfloop? I have reworked the entire port forwarding 'run loop' with the aim of a simpler, more stable code. It may or may not work the first time, let me know (with logs 😉)!

EDIT: corresponding PR is #1874

@synfinatic

synfinatic commented Sep 20, 2023

doesn't seem to be working for me, seems to be crashing:

gluetun  | 2023-09-20T21:10:07Z INFO [ip getter] Public IP address is 91.219.212.214 (United States, California, Los Angeles)
gluetun  | 2023-09-20T21:10:07Z INFO [vpn] There is a new release v3.35.0 (v3.35.0) created 84 days ago
gluetun  | 2023-09-20T21:10:07Z INFO [vpn] VPN gateway IP address: 10.16.0.1
gluetun  | panic: runtime error: invalid memory address or nil pointer dereference
gluetun  | [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8c6e49]
gluetun  | 
gluetun  | goroutine 98 [running]:
gluetun  | github.com/qdm12/gluetun/internal/portforward/service.(*Service).Start(0xc000d12840, {0x1092908, 0xc0004c2d70})
gluetun  |      github.com/qdm12/gluetun/internal/portforward/service/start.go:12 +0xe9
gluetun  | github.com/qdm12/gluetun/internal/portforward.(*Loop).run(0xc000b9c780, {0x1092908, 0xc0004c2d70}, 0x6ada65?, 0xc000244dc0?, 0xc0004d4660)
gluetun  |      github.com/qdm12/gluetun/internal/portforward/loop.go:85 +0x467
gluetun  | created by github.com/qdm12/gluetun/internal/portforward.(*Loop).Start in goroutine 37
gluetun  |      github.com/qdm12/gluetun/internal/portforward/loop.go:48 +0x196

@akutruff

Getting the same error with Proton VPN with Raspberry Pi 4 / Ubuntu host.

The latest tagged image does not give that error but port forwarding does not work on container restart.

@akutruff

Side note: I store my /tmp/port_forward in a volume that persists between restarts. I need this so another container can monitor the port forward value.

version: "3"
services:
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    environment:
      - VPN_SERVICE_PROVIDER=custom
      - VPN_TYPE=wireguard
      - VPN_ENDPOINT_IP=*
      - VPN_ENDPOINT_PORT=*
      - VPN_PORT_FORWARDING_PROVIDER=protonvpn
      - WIREGUARD_PUBLIC_KEY=*
      - WIREGUARD_PRIVATE_KEY=*
      - WIREGUARD_ADDRESSES=*
      - FIREWALL_OUTBOUND_SUBNETS=*
      - VPN_PORT_FORWARDING=on
    ports:
      - 80:80
      - 9091:9091
    volumes:
      - gluetun_data:/tmp/gluetun
    restart: always

  port-forward-monitor:
    image: port-forward-monitor
    container_name: port-forward-monitor
    build:
      context: ./port-forward-monitor
      dockerfile: Dockerfile
    volumes:
      - gluetun_data:/data
    restart: always
    network_mode: "service:gluetun"

volumes:
    gluetun_data: {}
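
For readers wondering what a monitor container like `port-forward-monitor` above might do, here is a minimal Python sketch that polls the shared file and reports changes. It is not the poster's actual monitor. It assumes the file is named `forwarded_port` (the name shown in the log line `removing port file /tmp/gluetun/forwarded_port`), that it contains the port number as plain text, and that the shared volume is mounted at `/data` as in the compose file above. A missing file is treated as "no port", since the log shows Gluetun removing the file when port forwarding stops.

```python
import time
from pathlib import Path
from typing import Optional


def read_forwarded_port(port_file: Path) -> Optional[int]:
    """Return the port stored in port_file, or None if it is missing or invalid."""
    try:
        text = port_file.read_text().strip()
    except FileNotFoundError:
        return None  # file is removed while port forwarding is stopped
    try:
        port = int(text)
    except ValueError:
        return None
    return port if 0 < port <= 65535 else None


def watch(port_file: Path, interval: float = 5.0) -> None:
    """Poll port_file forever and print a line whenever the value changes."""
    last = object()  # sentinel so the first reading is always reported
    while True:
        current = read_forwarded_port(port_file)
        if current != last:
            print(f"forwarded port is now: {current}")
            last = current
        time.sleep(interval)


# Example call inside the monitor container (runs until interrupted):
# watch(Path("/data/forwarded_port"))
```

Polling is used here only because it is simple and works across a shared volume; the interval is an arbitrary choice.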

@qdm12
Owner

qdm12 commented Sep 22, 2023

@akutruff I'm working on it, it's kind of complicated, so please be patient for a few days. It's my priority so I can release a v3.36.0 with protonvpn (and pia) port forwarding fixed up.

@gmillerd

With these port forwards, there is clearly some lag between the user requesting a forward and the provider establishing it. That is understood. But when the provider cannot service the request (e.g. it's a shared IP address and someone else already has 6881 bound, or the provider had it bound for your previous VPN session but has not torn it down and freed it yet), then it doesn't forward.

Do users have a misguided expectation that gluetun/qdm12 is going to "solve" that?

@duracell

duracell commented Sep 22, 2023

Do users have a misguided expectation that gluetun/qdm12 is going to "solve" that?

I don't have that expectation, and I never saw this problem in their official client. I also don't see why a reconnect shouldn't fix this if a complete restart of the container does. But once the error appears, it persists even after multiple reconnects, while a restart solves it every time.

@qdm12
Owner

qdm12 commented Sep 22, 2023

Can you try image qmcgaw/gluetun:pr-1874 please? It should be working (tested with a mocked out implementation of port forwarding with my provider mullvad), now let's see how resilient it is when the tunnel goes down internally (tested it as well, seems to work on my side).

@akutruff

port forwarding does not work on container restart.

You mean when the VPN restarts internally without a container restart right?

@gmillerd you're talking about another problem, this is about the port forwarding not re-triggering on a vpn internal restart in Gluetun (it's literally a bug in gluetun).

@akutruff

@akutruff I'm working on it, it's kind of complicated, so please be patient for a few days. It's my priority so I can release a v3.36.0 with protonvpn (and pia) port forwarding fixed up.

Oh I didn't mean to apply any pressure. Was just adding more info. I really appreciate your work on this project, and take your time!

@akutruff

@qdm12

qmcgaw/gluetun:pr-1874

Running this build for the first time on a completely fresh stack yields an error:

gluetun       | 2023-09-22T13:15:39Z INFO [routing] default route found: interface eth0, gateway 192.168.64.1, assigned IP 192.168.64.2 and family v4
gluetun       | 2023-09-22T13:15:39Z INFO [routing] adding route for 0.0.0.0/0
gluetun       | 2023-09-22T13:15:39Z INFO [firewall] setting allowed subnets...
gluetun       | 2023-09-22T13:15:39Z INFO [routing] default route found: interface eth0, gateway 192.168.64.1, assigned IP 192.168.64.2 and family v4
gluetun       | 2023-09-22T13:15:39Z INFO [routing] adding route for *****
gluetun       | 2023-09-22T13:15:39Z INFO [dns] using plaintext DNS at address 1.1.1.1
gluetun       | 2023-09-22T13:15:39Z INFO [http server] http server listening on [::]:8000
gluetun       | 2023-09-22T13:15:39Z INFO [firewall] allowing VPN connection...
gluetun       | 2023-09-22T13:15:39Z INFO [healthcheck] listening on 127.0.0.1:9999
gluetun       | 2023-09-22T13:15:39Z INFO [wireguard] Using available kernelspace implementation
gluetun       | 2023-09-22T13:15:39Z INFO [wireguard] Connecting to ******
gluetun       | 2023-09-22T13:15:39Z INFO [wireguard] Wireguard setup is complete. Note Wireguard is a silent protocol and it may or may not work, without giving any error message. Typically i/o timeout errors indicate the Wireguard connection is not working.
gluetun       | 2023-09-22T13:15:39Z INFO [dns] downloading DNS over TLS cryptographic files
gluetun       | 2023-09-22T13:15:39Z INFO [dns] downloading hostnames and IP block lists
gluetun       | 2023-09-22T13:15:41Z INFO [healthcheck] healthy!
gluetun       | 2023-09-22T13:16:40Z INFO [healthcheck] unhealthy: dialing: dial tcp4: lookup cloudflare.com: i/o timeout
gluetun       | 2023-09-22T13:16:43Z INFO [dns] init module 0: validator
gluetun       | 2023-09-22T13:16:43Z INFO [dns] init module 1: iterator
gluetun       | 2023-09-22T13:16:43Z INFO [dns] start of service (unbound 1.17.1).
gluetun       | 2023-09-22T13:16:43Z INFO [dns] generate keytag query _ta-4a5c-4f66. NULL IN
gluetun       | 2023-09-22T13:16:43Z INFO [dns] generate keytag query _ta-4a5c-4f66. NULL IN
gluetun       | 2023-09-22T13:16:43Z INFO [dns] ready
gluetun       | 2023-09-22T13:16:43Z INFO [healthcheck] healthy!
gluetun       | 2023-09-22T13:16:44Z INFO [ip getter] Public IP address is ***** (*****)
gluetun       | 2023-09-22T13:16:44Z INFO [vpn] There is a new release v3.35.0 (v3.35.0) created 86 days ago
gluetun       | 2023-09-22T13:16:44Z INFO [vpn] VPN gateway IP address: 10.2.0.1
gluetun       | panic: runtime error: invalid memory address or nil pointer dereference
gluetun       | [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x4640d4]
gluetun       |
gluetun       |
gluetun       | goroutine 93 [running]:
gluetun       | github.com/qdm12/gluetun/internal/portforward/service.(*Service).Start(0x4000afc480, {0xc308e0, 0x40001e8c80})
gluetun       |         github.com/qdm12/gluetun/internal/portforward/service/start.go:12 +0xe4
gluetun       | github.com/qdm12/gluetun/internal/portforward.(*Loop).run(0x4000126a80, {0xc308e0, 0x40001e8c80}, 0x0?, 0x0?, 0x4000078900)
gluetun       |         github.com/qdm12/gluetun/internal/portforward/loop.go:85 +0x348
gluetun       | created by github.com/qdm12/gluetun/internal/portforward.(*Loop).Start in goroutine 23
gluetun       |         github.com/qdm12/gluetun/internal/portforward/loop.go:48 +0x190
gluetun exited with code 0

You mean when the VPN restarts internally without a container restart right?

For the latest tag, it's hard to say. It seems random.

  1. On a fresh launch sometimes the healthcheck fails and then no port forwarding happens.
  2. Using docker compose restart certainly makes port forwarding break most of the time.

Side note: I switched from sharing /tmp/port_forward between containers to using the control server. It didn't help anything.

Another side note that doesn't involve this issue, but you may want to note it: it's weird that I have to use the control server's openvpn endpoint to get the value of a port forward that was set for a wireguard connection. /v1/openvpn/portforwarded
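
For reference, reading the forwarded port from the control server route mentioned above could look like the following minimal Python sketch. The base URL uses port 8000 because the log shows `http server listening on [::]:8000`. The response shape (`{"port": 12345}`) is an assumption, not something shown in this thread, so adjust `parse_forwarded_port` if your version responds differently. Both function names are illustrative.

```python
import json
import urllib.request


def parse_forwarded_port(body: str) -> int:
    """Extract the port from a response body, ASSUMED to look like {"port": 12345}."""
    data = json.loads(body)
    return int(data["port"])


def get_forwarded_port(base_url: str = "http://127.0.0.1:8000",
                       timeout: float = 5.0) -> int:
    """Query the /v1/openvpn/portforwarded route and return the forwarded port."""
    url = f"{base_url}/v1/openvpn/portforwarded"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_forwarded_port(response.read().decode("utf-8"))
```

A container using `network_mode: "service:gluetun"` shares Gluetun's network namespace, so `127.0.0.1` should reach the control server from there; from elsewhere, pass a different `base_url`.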

@qdm12
Owner

qdm12 commented Sep 22, 2023

@akutruff

Running this build for the first time on a completely fresh stack yields an error:

Are you sure? The error stack trace you have mentions github.com/qdm12/gluetun/internal/portforward/loop.go:85 +0x348, which does not correspond to the code in the PR on line 85. Maybe try re-pulling the image? It's unlikely to panic now that I have tested it quite a bit more (I didn't test it at all on the first coding try 😢 ).

For the latest tag, it's hard to say. It seems random.

It basically 'deadlocks' somewhere, especially when port forwarding needs to be restarted. That's why I just threw out most of the existing code and rewrote it to be a bit simpler/cleaner/structured, hopefully to squash out any deadlocks. In my local testing, it seems to work fine at program launch, program stop, or when the vpn goes unhealthy.
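
To illustrate the general idea of a simpler run loop (this is not Gluetun's actual implementation, which is written in Go), here is a toy Python sketch. A single worker thread owns the "is the service running" state and handles events strictly in order, so a restart is always a stop followed by a start and no lock is shared between callers. The class name, the event names, and the `start_service`/`stop_service` callbacks are all hypothetical.

```python
import queue
import threading


class PortForwardLoop:
    """Toy run loop: one worker thread owns the service state and handles events in order."""

    def __init__(self, start_service, stop_service):
        self._start_service = start_service  # hypothetical callback: request a port
        self._stop_service = stop_service    # hypothetical callback: release the port
        self._events = queue.Queue()
        self._running = False  # only ever touched by the worker thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def vpn_up(self):
        """Signal that the VPN tunnel is up (or came back up)."""
        self._events.put("up")

    def vpn_down(self):
        """Signal that the VPN tunnel went down."""
        self._events.put("down")

    def close(self):
        """Stop the service if needed and wait for the worker to exit."""
        self._events.put("close")
        self._thread.join()

    def _run(self):
        while True:
            event = self._events.get()
            if self._running:
                # Any event ends the current run first,
                # so a restart is always stop-then-start.
                self._stop_service()
                self._running = False
            if event == "up":
                self._start_service()
                self._running = True
            elif event == "close":
                return
```

With this shape, an internal VPN restart maps to `vpn_down()` followed by `vpn_up()`, and callers never block on the service itself, only on enqueueing an event.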

it's weird that I have to use the control server's openvpn endpoint to get the value of a port forward that was set for a wireguard connection.

💯 percent, but basically we would need /vpn, /vpn/openvpn and /vpn/wireguard routes, so I would prefer to remove the /openvpn route entirely when doing a v4 release (which will break compatibility). The whole control server really needs breaking changes, since its code is rather convoluted in order to keep compatibility, and it's hard for me to maintain. And Gluetun kind of needs a v4 breaking change sometime soon; there are years of retro-compatibility I can't wait to throw out 😄

@akutruff

@qdm12

I repulled. Yep, looks like something had been pushed since I tried that image. No error is showing up now. It's running, and I'll leave it running. Will report back if things get weird again.

For this build, should I try to make sure docker compose restart is stable, or is that a separate issue?

@thejaykid7

Just wanted to chime in: this tagged image does not seem to produce errors. I have been seeing a different issue as well, where it seems to port forward, but when I check if the port is opened, it doesn't appear to be. Not sure if the two are related.

@qdm12
Owner

qdm12 commented Sep 23, 2023

Ok the original issue should be solved on the latest image + future release v3.36.0 with commit 7120141

For other issues if you encounter them again on the latest image (built from 2023-09-23) such as:

  • it seems to port forward, but when I check if the port is opened, it doesn't appear to be
  • should I try to make sure docker compose restart is stable, or is that a separate issue?

Please create new issues 😉 Thanks!!

@qdm12 qdm12 closed this as completed Sep 23, 2023
7 participants