Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os/exec: resource leak on exec failure #69284

Closed
rustyx opened this issue Sep 5, 2024 · 14 comments
Closed

os/exec: resource leak on exec failure #69284

rustyx opened this issue Sep 5, 2024 · 14 comments
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsFix The path to resolution is known, but the work has not been done. OS-Linux
Milestone

Comments

@rustyx
Copy link

rustyx commented Sep 5, 2024

Go version

go version go1.23.0 linux/amd64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/home/user/.cache/go-build'
GOENV='/home/user/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/user/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/user/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.23.0'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/user/.config/go/telemetry'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2691713470=/tmp/go-build -gno-record-gcc-switches'

What did you do?

It seems that when process.Start fails, there are some files left open, which after a while leads to a "too many open files" error.

Here's a minimal repro:

package main

import (
	"context"
	"errors"
	"os/exec"
	"syscall"
	"testing"
)

func TestExecResources(t *testing.T) {
	ctx := context.Background()
	var oldRLimit syscall.Rlimit
	err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &oldRLimit)
	if err != nil {
		t.Fatal(err)
	}
	newRLimit := oldRLimit
	newRLimit.Cur = 20
	err = syscall.Setrlimit(syscall.RLIMIT_NOFILE, &newRLimit)
	if err != nil {
		t.Fatal(err)
	}
	defer func() {
		syscall.Setrlimit(syscall.RLIMIT_NOFILE, &oldRLimit)
	}()

	for i := 0; i < 22; i++ {
		ctx, cancel := context.WithCancel(ctx)
		process := exec.CommandContext(ctx, "/bin/nonexistent")
		err = process.Start()
		cancel()
		process.Wait() // not really needed, just to demonstrate that even calling Wait doesn't help
		var se syscall.Errno
		if !errors.As(err, &se) || se != syscall.ENOENT {
			t.Fatal(err)
		}
	}
}

What did you see happen?

The test fails with:

--- FAIL: TestExecResources (0.00s)
    process_unix_test.go:35: fork/exec /bin/nonexistent: too many open files

What did you expect to see?

I expected the test to pass.

Note that the issue occurs also on other exec failures, such as permission denied, invalid ELF format, etc.

@rogpeppe
Copy link
Contributor

rogpeppe commented Sep 5, 2024

This is a regression in Go 1.23.0. I've bisected it to commit 2f64268, introduced by this CL https://go-review.googlesource.com/c/go/+/570036/

@rogpeppe
Copy link
Contributor

rogpeppe commented Sep 5, 2024

Looks like a bug in syscall.StartProcess to me. The docs for PidFD say:

	// PidFD, if not nil, is used to store the pidfd of a child, if the
	// functionality is supported by the kernel, or -1. Note *PidFD is
	// changed only if the process starts successfully.
	PidFD *int

but this program prints "PidFD set on failure", which seems wrong

package main

import (
	"fmt"
	"syscall"
)

func main() {
	var pidfd int = -1
	_, _, err := syscall.StartProcess("nonexistent", []string{"nonexistent"}, &syscall.ProcAttr{
		Sys: &syscall.SysProcAttr{
			PidFD: &pidfd,
		},
	})
	if err == nil {
		panic("unexpected StartProcess success")
	}
	if pidfd != -1 {
		fmt.Printf("PidFD set on failure to %v\n", pidfd)
	}
}

@dmitshur
Copy link
Contributor

dmitshur commented Sep 5, 2024

CC @bradfitz, @ianlancetaylor.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Sep 5, 2024
@dmitshur dmitshur added this to the Backlog milestone Sep 5, 2024
@ianlancetaylor
Copy link
Member

CC @kolyshkin

@ianlancetaylor
Copy link
Member

@rogpeppe I don't think your program is buggy. The program does get a valid pidfd that refers to the child. It's true that the child exits immediately because the exec failed. That just means that the pidfd refers to the zombie process.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/611217 mentions this issue: os: release pidfd if StartProcess fails

@rogpeppe
Copy link
Contributor

rogpeppe commented Sep 6, 2024

I don't think your program is buggy.

I'd say it's buggy because the behaviour does not conform to the documentation. The docs say "PidFD is changed only if the process starts successfully". The process did not start successfully, but PidFD was nonetheless changed.

@ianlancetaylor
Copy link
Member

@rogpeppe Thanks, I see what you mean. I wasn't quite grasping that syscall.StartProcess was returning failure and in that case it has already waited for the zombie process.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/611495 mentions this issue: syscall: on exec failure, close pidfd

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. FixPending Issues that have a fix which has not yet been reviewed or submitted. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Sep 10, 2024
@dmitshur dmitshur modified the milestones: Backlog, Go1.24 Sep 10, 2024
@mvdan
Copy link
Member

mvdan commented Sep 11, 2024

Is the plan to backport this to Go 1.23.x?

@ianlancetaylor
Copy link
Member

@gopherbot Please open a backport issue to 1.23.

This causes os/exec to leak file descriptors when used to run a non-existent file on Linux. There is no simple workaround.

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #69402 (for 1.23).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/613616 mentions this issue: [release-branch.go1.23] syscall: on exec failure, close pidfd

gopherbot pushed a commit that referenced this issue Sep 18, 2024
For #69284
Fixes #69402

Change-Id: I6350209302778ba5e44fa03d0b9e680d2b4ec192
Reviewed-on: https://go-review.googlesource.com/c/go/+/611495
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: roger peppe <[email protected]>
Reviewed-by: Tim King <[email protected]>
Auto-Submit: Ian Lance Taylor <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
(cherry picked from commit 8926ca9)
Reviewed-on: https://go-review.googlesource.com/c/go/+/613616
Reviewed-by: Ian Lance Taylor <[email protected]>
Auto-Submit: Ian Lance Taylor <[email protected]>
Reviewed-by: Tobias Klauser <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FixPending Issues that have a fix which has not yet been reviewed or submitted. NeedsFix The path to resolution is known, but the work has not been done. OS-Linux
Projects
None yet
Development

No branches or pull requests

7 participants