Fix kube port-forward to exit on pod removal #57051

Merged
rana merged 1 commit into master from rana/kube-fix-portforward on Aug 8, 2025
Conversation

@rana (Contributor) commented Jul 22, 2025

Fixes #54814

kubectl port-forward now exits when a pod disconnects.

This addresses kubectl users' expectation that the Teleport Kubernetes proxy behave transparently, identically to a direct-to-Kubernetes connection. Previously, kubectl port-forwarding over Teleport would remain open indefinitely even when no pods were available, which differed from the direct-to-Kubernetes experience.

In this PR:

  • Changed error stream copying to uni-directional (target -> source) in SPDY and WebSocket forwardStreamPair() functions
  • Added WaitGroup to SPDY run() enabling all port-forward streams to fully complete before returning
  • Added an integration test for Kubernetes port-forwarding with pod disconnection
  • Updated existing port-forward integration tests to allow and expect ErrLostConnectionToPod

Manual Bug Reproduction

The bug was manually reproduced with kubectl both direct-to-Kubernetes and over Teleport.

Bug Behavior (Teleport)

Observing the bug with kubectl and Teleport.

Here we see the kubectl port-forward process running after all pods are removed.

> tsh logout
Credentials expired for proxy "localhost:3080", skipping...
Logged out all users from all proxies.

> k delete all --all
service "kubernetes" deleted
deployment.apps "test-web" deleted
replicaset.apps "test-web-6875846678" deleted

> tsh login --proxy=localhost:3080
Enter password for Teleport user rana.ian:
Enter an OTP code from a device:

> Profile URL:        https://localhost:3080
  Logged in as:       rana.ian
  Cluster:            kube-test-cluster
  Roles:              access, editor, kube-admin
  Kubernetes:         enabled
  Kubernetes groups:  system:masters
  Valid until:        2025-07-22 23:47:05 -0700 PDT [valid for 12h0m0s]
  Extensions:         login-ip, permit-agent-forwarding, permit-port-forwarding, permit-pty, private-key-policy

> tsh kube login colima
Logged into Kubernetes cluster "colima". Try 'kubectl version' to test the connection.

> kubectl create deployment test-web --image=nginx
deployment.apps/test-web created

> kubectl wait --for=condition=ready pod -l app=test-web
pod/test-web-6875846678-pk8kd condition met

> kubectl port-forward deployment/test-web 8080:80 &
PF_PID=$!
[1] 18578

> Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
kubectl scale deployment test-web --replicas=0
deployment.apps/test-web scaled

> sleep 5

> curl --max-time 2 http://localhost:8080
Handling connection for 8080
curl: (52) Empty reply from server

> ps -p $PF_PID > /dev/null && echo "❌ BUG: kubectl still running"
❌ BUG: kubectl still running

> kill $PF_PID
[1]  + terminated  kubectl port-forward deployment/test-web 8080:80 

Expected Behavior (Direct k8s)

Observing kubectl behavior when directly connected to Kubernetes (no Teleport).

Here we see the kubectl port-forward process exits after all pods are removed.

> tsh logout
Logged out all users from all proxies.

> k delete all --all
service "kubernetes" deleted
deployment.apps "test-web" deleted

> kubectl config use-context colima
Switched to context "colima".

> kubectl create deployment test-web --image=nginx
deployment.apps/test-web created

> kubectl wait --for=condition=ready pod -l app=test-web
pod/test-web-6875846678-jx8cn condition met

> kubectl port-forward deployment/test-web 8080:80 &
DIRECT_PID=$!
[1] 19650

> Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
kubectl scale deployment test-web --replicas=0
deployment.apps/test-web scaled

> sleep 5

> curl --max-time 2 http://localhost:8080
Handling connection for 8080
E0722 11:59:15.648965   19650 portforward.go:424] "Unhandled Error" err="an error occurred forwarding 8080 -> 80: error forwarding port 80 to pod d085c4246eb130a7fd585cda71cf69d4a90a6156b8cf659bcc0a4adeca7d66f5, uid : network namespace for sandbox \"d085c4246eb130a7fd585cda71cf69d4a90a6156b8cf659bcc0a4adeca7d66f5\" is closed"
curl: (52) Empty reply from server
error: lost connection to pod
[1]  + exit 1     kubectl port-forward deployment/test-web 8080:80    

> ps -p $DIRECT_PID > /dev/null || echo "✓ EXPECTED: kubectl exited"
✓ EXPECTED: kubectl exited

Fix Behavior (Teleport)

Observing the fix with kubectl and Teleport.

Here we see the kubectl port-forward process exits after all pods are removed.

> tsh logout
All users logged out.

> kubectl delete deployment test-web
Error from server (NotFound): deployments.apps "test-web" not found

> tsh login --proxy=localhost:3080
Enter password for Teleport user rana.ian:
Enter an OTP code from a device:
> Profile URL:        https://localhost:3080
  Logged in as:       rana.ian
  Cluster:            teleport-laptop
  Roles:              access, editor, kube-admin
  Kubernetes:         enabled
  Kubernetes groups:  system:masters
  Valid until:        2025-07-26 00:30:19 -0700 PDT [valid for 12h0m0s]
  Extensions:         login-ip, permit-agent-forwarding, permit-port-forwarding, permit-pty, private-key-policy

> tsh kube login colima
Logged into Kubernetes cluster "colima". Try 'kubectl version' to test the connection.

> kubectl create deployment test-web --image=nginx
deployment.apps/test-web created

> kubectl wait --for=condition=ready pod -l app=test-web
pod/test-web-6875846678-ds94t condition met

> kubectl port-forward deployment/test-web 8080:80 & PF_PID=$!
[1] 34723
> Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80

> kubectl scale deployment test-web --replicas=0
deployment.apps/test-web scaled

> sleep 5

> curl --max-time 2 http://localhost:8080
Handling connection for 8080
E0725 12:32:31.608718   34723 portforward.go:424] "Unhandled Error" err="an error occurred forwarding 8080 -> 80: error forwarding port 80 to pod f3382c71de162fc90a1c696de37bc0a6473813f8e6776e356046b25b5bb8b5b8, uid : network namespace for sandbox \"f3382c71de162fc90a1c696de37bc0a6473813f8e6776e356046b25b5bb8b5b8\" is closed"
curl: (52) Empty reply from server
error: lost connection to pod
[1]  + exit 1     kubectl port-forward deployment/test-web 8080:80

> ps -p $PF_PID > /dev/null && echo "❌ BUG: kubectl still running" || echo "✓ Fixed: kubectl exited"
✓ Fixed: kubectl exited

Changelog: Kubernetes Access: kubectl port-forward now exits cleanly when backend pods are removed

@creack (Contributor) commented Jul 22, 2025

Looks good so far, but we'll need to update the websocket version as well, right?

@tigrato (Contributor) left a comment

Can you also include an integration test for this functionality? We run a kind cluster in our CI, so this should be trivial to model with a real Kubernetes cluster and both the native SPDY and SPDY-over-WebSocket protocols.

@rana (Contributor, Author) commented Jul 31, 2025

Looks good so far, but we'll need to update the websocket version as well, right?

@creack Here's a summary of the additions to lib/kube/proxy/portforward_spdy.go and lib/kube/proxy/portforward_websocket.go, with an eye toward what can be kept symmetric.

  • run() function in SPDY & WebSocket
    • <-h.targetConn.CloseChan(): only in SPDY. Originally I think you were asking whether the SPDY (h *portForwardProxy) run() case <-h.targetConn.CloseChan(): should also be added to the WebSocket version. I looked at adding a similar case to WebSocket, but it wasn't needed: the WebSocket run() function is structured differently, without a select and cases.
    • A WaitGroup was added to SPDY to allow all transfers to complete before closing a source connection. Similar WaitGroup logic already existed in WebSocket.
  • forwardStreamPair() function in SPDY & WebSocket
    • io.Copy(p.errorStream, targetErrorStream) was added to both SPDY and WebSocket.
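The SPDY-only CloseChan case described above amounts to a select loop that also exits when the target (pod-side) connection closes. A minimal self-contained sketch of that shape, using illustrative types rather than Teleport's actual ones:

```go
package main

import "fmt"

// conn mimics the small slice of the SPDY connection API referenced above:
// CloseChan returns a channel that is closed when the connection goes away.
// (These names are illustrative, not Teleport's actual types.)
type conn struct{ closed chan struct{} }

func (c *conn) CloseChan() <-chan struct{} { return c.closed }
func (c *conn) Close()                     { close(c.closed) }

// run accepts incoming streams until either the stream channel is closed or
// the target connection closes -- the extra case added on the SPDY path.
func run(streams <-chan string, target *conn) string {
	for {
		select {
		case s, ok := <-streams:
			if !ok {
				return "stream channel closed"
			}
			fmt.Println("handling stream:", s)
		case <-target.CloseChan():
			return "target connection closed"
		}
	}
}

func main() {
	streams := make(chan string) // open but idle
	target := &conn{closed: make(chan struct{})}
	target.Close() // simulate the pod-side connection going away
	fmt.Println(run(streams, target))
}
```

The WebSocket run() doesn't need an equivalent case because it isn't built around a select loop over channels in the first place.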

@creack (Contributor) left a comment

Maybe I am missing something, but I can still reproduce the issue on 35ce01b.

Forwarding from 127.0.0.1:8080 -> 80
Forwarding from [::1]:8080 -> 80
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080
Handling connection for 8080

Even after deleting the pod, it keeps handling the connections.

Deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80

Port forward:

tsh kube login kind1
kubectl apply -f deploy.yaml
kubectl port-forward deployments/nginx 8080:80

Check loop:

while true; do curl localhost:8080; sleep 1; done

Scale down:

kubectl scale deployment nginx --replicas=0

After scale down, curl returns:

curl: (52) Empty reply from server

instead of

curl: (7) Failed to connect to localhost port 8080 after 0 ms: Couldn't connect to server

and the port-forward doesn't exit.

@creack (Contributor) commented Aug 4, 2025

> Maybe I am missing something, but I still can reproduce the issue on 35ce01b. [full reproduction quoted from the review comment above]

It was user error: Helm was using a different version than the one I built. Sorry about this. I tested properly and confirmed it works as expected.

@rana rana requested review from tigrato and zmb3 August 6, 2025 02:36
@fspmarshall (Contributor) left a comment

LGTM once the pending points are resolved.

@rana rana changed the title [kube] fix: kubectl port-forward now exits cleanly when backend pods are removed Fix kube port-forward to exit on pod removal Aug 8, 2025
@rana rana force-pushed the rana/kube-fix-portforward branch from 2e54b37 to 271a4f0 Compare August 8, 2025 22:32
`kubectl port-forward` now exits when a pod disconnects. This addresses `kubectl` user expectations that the Teleport Kubernetes proxy behave transparently and identically to direct-to-Kubernetes.

- Changed error stream copying to uni-directional (target -> source) in SPDY and WebSocket `forwardStreamPair()` functions
- Added `WaitGroup` to SPDY `run()` enabling all port-forward streams to fully complete before returning
- Added an integration test for Kubernetes port-forwarding with pod disconnection
- Updated existing port-forward integration tests to allow and expect `ErrLostConnectionToPod`

Fixes #54814
@rana rana force-pushed the rana/kube-fix-portforward branch from 271a4f0 to 2cee6be Compare August 8, 2025 22:44
@rana rana added this pull request to the merge queue Aug 8, 2025
Merged via the queue into master with commit 505e1f0 Aug 8, 2025
40 checks passed
@rana rana deleted the rana/kube-fix-portforward branch August 8, 2025 23:30
@backport-bot-workflows commented

@rana See the table below for backport results.

Branch      Result
branch/v16  Failed
branch/v17  Failed
branch/v18  Create PR

Successfully merging this pull request may close these issues:

  • Kubernetes port-forward does not disconnect on loss of connection