Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

net: a smart way to check internal/poll ErrTimeout #32735

Open
detailyang opened this issue Jun 22, 2019 · 13 comments
Open

net: a smart way to check internal/poll ErrTimeout #32735

detailyang opened this issue Jun 22, 2019 · 13 comments
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Milestone

Comments

@detailyang
Copy link

Hello guys.

It looks like we have no smart way to check the internal/poll ErrTimeout when we set the deadline on the connection.

The net.Error interface is not good enough as following

An Error represents a network error.

type Error interface {
        error
        Timeout() bool   // Is the error a timeout?
        Temporary() bool // Is the error temporary?
}

The Timeout() method may mislead caller think it's deadline error if the error is actually caused by sycall.Errno (e == EAGAIN || e == EWOULDBLOCK || e == ETIMEDOUT) rather than internal/poll ErrTimeout

Maybe we need a smart way to check the deadline error?

Many thanks in advance:)

@gopherbot gopherbot added this to the Proposal milestone Jun 22, 2019
@detailyang
Copy link
Author

#31449 is a good example which the TCP keepalive probe failure cause the ETIMEDOUT

@agnivade agnivade changed the title proposal: a smart way to check internal/poll ErrTimeout net: a smart way to check internal/poll ErrTimeout Jun 23, 2019
@agnivade agnivade added FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. and removed Proposal labels Jun 23, 2019
@agnivade agnivade modified the milestones: Proposal, Unplanned Jun 23, 2019
@agnivade
Copy link
Contributor

I have removed the proposal tag and made it a feature request because you have not proposed anything concrete (which a proposal should be), but you would like something like this to happen.

/cc @mikioh

@ianlancetaylor
Copy link
Member

Can you explain further why checking for and calling the Timeout method is not sufficient? Note that network descriptors will never return EAGAIN or EWOULDBLOCK in practice.

@detailyang
Copy link
Author

detailyang commented Jun 24, 2019

@agnivade @ianlancetaylor

I wrote a gist in Linux to trigger the keepalive probe ETIMEDOUT in https://gist.github.com/detailyang/2f5ce65321c6f97ab66babedc8d92f16

The code above is that using iptables to drop TCP keepalive probe (dport is 12345) and see kernel code in https://elixir.bootlin.com/linux/v5.2-rc5/source/net/ipv4/tcp_timer.c#L642, we see when the TCP keepalive timer failed then kernel will set the error to ETIMEDOUT

/**
 *  tcp_write_err() - close socket and save error info
 *  @sk:  The socket the error has appeared on.
 *
 *  Returns: Nothing (void)
 */

static void tcp_write_err(struct sock *sk)
{
	sk->sk_err = sk->sk_err_soft ? : ETIMEDOUT;
	sk->sk_error_report(sk);

	tcp_write_queue_purge(sk);
	tcp_done(sk);
	__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONTIMEOUT);
}

@ianlancetaylor
Copy link
Member

@detailyang Thanks, but I don't understand how that is an answer to my question. After all, you can already test for ETIMEDOUT for the cases where you need to distinguish that case.

@detailyang
Copy link
Author

detailyang commented Jun 24, 2019

@ianlancetaylor Yep, because I know the error is ETIMEDOUT but for some guys who expect the net.Timeout means deadline timeout, it will be confused. See #31449

@detailyang
Copy link
Author

detailyang commented Jun 24, 2019

People always check the deadline timeout error as the following

if err := conn.SetDeadline(time.Now().Add(1 * time.Hour)); err != nil {
    panic(err)
}

if _, err := conn.Write([]byte("test")); err != nil {
    if ne, ok := err.(net.Error); ok && ne.Timeout() {
        fmt.Println("deadline timeout)
    }
}

@networkimprov
Copy link

I think this is a dup of #31449, which hasn't received any attention.

@ianlancetaylor
Copy link
Member

Well, maybe.

This should be considered in conjunction with #29934, which may suggest a better approach.

@ianlancetaylor
Copy link
Member

I agree that this seems like a dup of #31449.

@detailyang
Copy link
Author

detailyang commented Jun 24, 2019

@ianlancetaylor yep, we need better error observability

I will keep the issue open until the #29934 or any better way land in Go

@networkimprov
Copy link

We don't need a new feature, we need a bugfix. Keepalive errors are connection failures, not deadline events.

@detailyang
Copy link
Author

PS: the comment in net.Conn it looks not right in keepalive probe failure which is

Read can be made to time out and return an Error with Timeout() == true
after a fixed time limit; see SetDeadline and SetReadDeadline.

// Multiple goroutines may invoke methods on a Conn simultaneously.
type Conn interface {
	// Read reads data from the connection.
	// Read can be made to time out and return an Error with Timeout() == true
	// after a fixed time limit; see SetDeadline and SetReadDeadline.
	Read(b []byte) (n int, err error)

	// Write writes data to the connection.
	// Write can be made to time out and return an Error with Timeout() == true
	// after a fixed time limit; see SetDeadline and SetWriteDeadline.
	Write(b []byte) (n int, err error)

	// Close closes the connection.
	// Any blocked Read or Write operations will be unblocked and return errors.
	Close() error

	// LocalAddr returns the local network address.
	LocalAddr() Addr

	// RemoteAddr returns the remote network address.
	RemoteAddr() Addr

	// SetDeadline sets the read and write deadlines associated
	// with the connection. It is equivalent to calling both
	// SetReadDeadline and SetWriteDeadline.
	//
	// A deadline is an absolute time after which I/O operations
	// fail with a timeout (see type Error) instead of
	// blocking. The deadline applies to all future and pending
	// I/O, not just the immediately following call to Read or
	// Write. After a deadline has been exceeded, the connection
	// can be refreshed by setting a deadline in the future.
	//
	// An idle timeout can be implemented by repeatedly extending
	// the deadline after successful Read or Write calls.
	//
	// A zero value for t means I/O operations will not time out.
	SetDeadline(t time.Time) error

	// SetReadDeadline sets the deadline for future Read calls
	// and any currently-blocked Read call.
	// A zero value for t means Read will not time out.
	SetReadDeadline(t time.Time) error

	// SetWriteDeadline sets the deadline for future Write calls
	// and any currently-blocked Write call.
	// Even if write times out, it may return n > 0, indicating that
	// some of the data was successfully written.
	// A zero value for t means Write will not time out.
	SetWriteDeadline(t time.Time) error
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
FeatureRequest Issues asking for a new feature that does not need a proposal. NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.
Projects
None yet
Development

No branches or pull requests

5 participants