Panic on server shutdown #1504

Closed
mpldr opened this issue Feb 25, 2023 · 7 comments

@mpldr
Contributor

mpldr commented Feb 25, 2023

I encounter this panic when shutting down the server.

The steps are:

  • start server
  • request the index page with Firefox
  • send SIGINT (which cancels the context)
  • see panic

Server setup/shutdown:

	lis, err := net.Listen("tcp", "localhost:8080")
	if err != nil {
		glog.Errorf("failed to listen on %s: %v", "localhost:8080", err)
		return
	}
	srv := &fasthttp.Server{
		Handler:                      handler.RootHandler,
		MaxConnsPerIP:                1,
		MaxRequestBodySize:           260 * 1024 * 1024,
		DisablePreParseMultipartForm: true,
		StreamRequestBody:            true,
		TLSConfig:                    &tls.Config{},
	}
	httpErr := make(chan error)
	go func() {
		httpErr <- srv.Serve(lis)
	}()

	select {
	case err = <-httpErr:
		if err != nil {
			glog.Errorf("server encountered an error. not good: %v", err) // TODO: might want to rephrase this
		}
		errchan <- err
		fmt.Println(err)
	case <-ctx.Done():
		ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
		srv.ShutdownWithContext(ctx)
		cancel()
		return
	}

The panic I get is:

panic: BUG: negative per-ip counter=-1 for ip=2130706433

goroutine 6 [running]:
github.com/valyala/fasthttp.(*perIPConnCounter).Unregister(0xc0001660e8, 0x68cd0?)
	github.com/valyala/[email protected]/peripconn.go:35 +0x179
github.com/valyala/fasthttp.(*perIPConn).Close(0xc00030e000)
	github.com/valyala/[email protected]/peripconn.go:70 +0x4d
github.com/valyala/fasthttp.(*Server).closeIdleConns(0xc000166000)
	github.com/valyala/[email protected]/server.go:2904 +0x10f
github.com/valyala/fasthttp.(*Server).ShutdownWithContext(0xc000166000, {0x673328, 0xc00011a600})
	github.com/valyala/[email protected]/server.go:1907 +0x272
main.httpListener({0x6732b8, 0xc000090140}, 0x0?, 0xc00010e000)
	mpldr.codes/tarbash/main.go:71 +0x408
main.main.func1()
	mpldr.codes/tarbash/main.go:32 +0xcd
created by main.main
	mpldr.codes/tarbash/main.go:30 +0x111

At times I also get this:

^Cpanic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x5ae6c4]

goroutine 22 [running]:
github.com/valyala/fasthttp.(*perIPConn).Close(0xc000110080)
	github.com/valyala/[email protected]/peripconn.go:69 +0x24
github.com/valyala/fasthttp.(*workerPool).workerFunc(0xc000120140, 0xc0001100c0)
	github.com/valyala/[email protected]/workerpool.go:238 +0x374
github.com/valyala/fasthttp.(*workerPool).getCh.func1()
	github.com/valyala/[email protected]/workerpool.go:196 +0x38
created by github.com/valyala/fasthttp.(*workerPool).getCh
	github.com/valyala/[email protected]/workerpool.go:195 +0x1b0

The latter only happens occasionally, though.

@mpldr
Contributor Author

mpldr commented Feb 25, 2023

Not sure if the first even has to be a panic, to be honest. Looks like nothing but a sanity check to me.

@li-jin-gou
Contributor

Hello @mpldr, can you provide a complete, runnable example? I would like to try to reproduce this problem on my computer.

@mpldr
Contributor Author

mpldr commented Feb 26, 2023

Sure, here you go: http://0x0.st/Hs5T.tzst

It's not yet part of any repo, so a tarball is the best I can offer you right away :)

@erikdubbelboer
Collaborator

The problem is that Shutdown calls closeIdleConns, which calls Close on all connections here:

_ = c.Close()

But when the worker goroutine stops, it also calls Close on the connection here:

_ = c.Close()

We should somehow stop one of them from happening. I think closeIdleConns needs to call Close to stop the read in serveConn, so we'll need some way for workerFunc not to close the connection. I'm not sure in which case this close is needed.

I'm afraid I don't have time to investigate this further right now. I'll have another look in the future.
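
One way to make a second Close harmless, rather than preventing it, is to make the close idempotent. The following is only a minimal sketch of that idea using sync.Once. The types counter and trackedConn are hypothetical stand-ins invented for illustration; they are not fasthttp's perIPConnCounter or perIPConn, and this is not the fix that was applied upstream.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical stand-in for a per-IP connection counter.
type counter struct {
	mu sync.Mutex
	n  map[uint32]int
}

func (c *counter) register(ip uint32) {
	c.mu.Lock()
	c.n[ip]++
	c.mu.Unlock()
}

func (c *counter) unregister(ip uint32) {
	c.mu.Lock()
	c.n[ip]--
	c.mu.Unlock()
}

func (c *counter) get(ip uint32) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n[ip]
}

// trackedConn is a hypothetical connection wrapper whose Close
// unregisters from the counter at most once.
type trackedConn struct {
	ip      uint32
	counter *counter
	once    sync.Once
}

// Close may be called from several places; only the first call
// decrements the counter.
func (t *trackedConn) Close() error {
	t.once.Do(func() { t.counter.unregister(t.ip) })
	return nil
}

func main() {
	c := &counter{n: map[uint32]int{}}
	const ip = 2130706433 // 127.0.0.1, as in the panic message
	c.register(ip)
	conn := &trackedConn{ip: ip, counter: c}
	_ = conn.Close() // e.g. from the shutdown path
	_ = conn.Close() // e.g. from the worker goroutine
	fmt.Println(c.get(ip)) // prints 0, not -1
}
```

With this shape both callers can keep calling Close, and the counter is decremented exactly once per connection.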

@mpldr
Contributor Author

mpldr commented Mar 15, 2023

The more I think about it, the more I believe this shouldn't be a panic but should instead be handled gracefully™ (which in this case might mean logging the error and then discarding it).

erikdubbelboer pushed a commit that referenced this issue Mar 30, 2023
* client: simplify (*HostClient).do()

Remove an allocation in favour of deferring a call to release the
response.

* client: remove panic in dialAddr

Return an error instead of panicking if the user supplied a nonsensical
DialFunc.

* compression: remove panic on invalid compression level

If a compression level exceeding gzip's boundaries is provided, fasthttp
will panic. It would be better to handle this error for the user by
limiting the level to the minimum or maximum value, depending on the
direction in which the limits were exceeded.

Clamp the value of gzip to always be between gzip.BestSpeed and
gzip.BestCompression.

* peripconn: remove panic on negative connection count

When a negative count is reached when unregistering a connection, a
panic is caused even though data-integrity is not at risk.

Replace the panic() with a simple clamp on the value to ensure the
value does not fall below its expected lower bound.

References: #1504

* compress: remove error on failed nonblocking writes

Since there is no way of handling or even logging non-critical errors in
stateless non-blocking write calls, just drop them and hope the user
notices and tries again.

* workerPool: remove panic on redundant Start and Stop calls

Instead of panicking for invalid behaviour, it's preferable to just turn
the function into a noop.

* http: remove panic on invalid form boundary

* http: remove panic on negative reads

Since bufio already panics on negative reads, it is not necessary to do
so as well. If the length is zero and for some reason no error is
returned, readBodyIdentity and appendBodyFixedSize now return an error.

Link: https://github.com/golang/go/blob/851f6fd61425c810959c7ab51e6dc86f8a63c970/src/bufio/bufio.go#L246

* fs: remove panic on negative reader count

When a negative count is reached when unregistering a reader, a panic is
thrown even though data-integrity is not at risk.

Replace the panic() with a simple clamp on the value to ensure the
value does not fall below its expected lower bound.

* server: remove panic in favour of a segfault

Panicking with "BUG: " obscures the error. As the segfault causes a
panic anyway, just let the chaos unfold.

* server: remove panic in favour of returning an error

Writing on a timed-out response is not endangering data integrity and
just fails.

* chore: add comments to all panics

* chore: fix minor typo
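
The "peripconn: remove panic on negative connection count" entry above describes replacing a panic with a clamp. A minimal sketch of what such a clamp could look like follows. The function unregisterClamped and the plain map are hypothetical and chosen for illustration; this is not the code from the referenced commit.

```go
package main

import "fmt"

// unregisterClamped decrements the count for ip but never lets it
// drop below zero. Hypothetical helper, not fasthttp's implementation.
func unregisterClamped(counts map[uint32]int, ip uint32) {
	n := counts[ip] - 1
	if n < 0 {
		n = 0 // clamp instead of panicking on a negative count
	}
	counts[ip] = n
}

func main() {
	counts := map[uint32]int{}
	const ip = 2130706433 // 127.0.0.1, as in the panic message
	counts[ip] = 1        // one registered connection

	unregisterClamped(counts, ip) // first Close
	unregisterClamped(counts, ip) // second Close of the same connection
	fmt.Println(counts[ip])       // prints 0
}
```

Note that in this sketch the clamp only hides the extra decrement: if two connections from the same IP were registered and one of them were closed twice, the count would reach zero while the other connection was still open.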

@erikdubbelboer
Collaborator

Should be fixed in #1526

@mpldr
Contributor Author

mpldr commented Mar 30, 2023

As mentioned in #1526, it no longer panics. The underlying double close is not fixed, though, so I'm not sure whether this ticket should be renamed.
