Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http client doesn't retry after error #118

Open
2opremio opened this issue Jun 13, 2024 · 3 comments
Open

http client doesn't retry after error #118

2opremio opened this issue Jun 13, 2024 · 3 comments

Comments

@2opremio
Copy link

2opremio commented Jun 13, 2024

I am using jrpc2 v1.2.0

This test:

package main

import (
	"context"
	"fmt"
	"net/http"
	"testing"

	"github.com/creachadair/jrpc2"
	"github.com/creachadair/jrpc2/handler"
	"github.com/creachadair/jrpc2/jhttp"
	"github.com/stretchr/testify/require"
)

func TestClientReconnectsIfConnectionRefused(t *testing.T) {
	ch := jhttp.NewChannel("http://localhost:8000", nil)
	client := jrpc2.NewClient(ch, nil)
	defer client.Close()

	var result string
	err := client.CallResult(context.Background(), "Hello", nil, &result)
	require.Error(t, err)

	b := jhttp.NewBridge(handler.Map{
		"Hello": handler.New(func(context.Context) (string, error) {
			return "Hello, world!", nil
		}),
	}, nil)
	defer b.Close()
	go http.ListenAndServe("localhost:8000", b)

	err = client.CallResult(context.Background(), "Hello", nil, &result)
	require.NoError(t, err)
	fmt.Println(result)
}

fails with:

=== RUN   TestClientReconnectsIfConnectionRefused
        	Error:      	Received unexpected error:
        	            	Post "http://localhost:8000": dial tcp [::1]:8000: connect: connection refused
        	Test:       	TestClientReconnectsIfConnectionRefused
--- FAIL: TestClientReconnectsIfConnectionRefused (0.00s)

FAIL

It works if I recreate the after the server, i.e.:

func TestClientReconnectsIfConnectionRefused(t *testing.T) {
	ch := jhttp.NewChannel("http://localhost:8000", nil)
	client := jrpc2.NewClient(ch, nil)
	defer client.Close()

	var result string
	err := client.CallResult(context.Background(), "Hello", nil, &result)
	require.Error(t, err)

	b := jhttp.NewBridge(handler.Map{
		"Hello": handler.New(func(context.Context) (string, error) {
			return "Hello, world!", nil
		}),
	}, nil)
	defer b.Close()
	go http.ListenAndServe("localhost:8000", b)

	ch2 := jhttp.NewChannel("http://localhost:8000", nil)
	client2 := jrpc2.NewClient(ch2, nil)
	defer client2.Close()
	err = client2.CallResult(context.Background(), "Hello", nil, &result)
	require.NoError(t, err)
	fmt.Println(result)
}

I don't know if it's expected behavior, but I was expecting the client to try again after getting a connection-refused error (Just like Go's http client).

I spent hours debugging what the problem was, see stellar/stellar-rpc#207

@2opremio 2opremio changed the title http client doesn't retry after connection refused http client doesn't retry after error Jun 13, 2024
@creachadair
Copy link
Owner

It isn't necessary to create a new channel, the problem you're seeing arises in the jrpc2.Client: It receives an error trying to read from the channel so it closes the client, and that is a persistent status.

It should also work to say, for example:

client2 := jrpc2.NewClient(ch, nil)  // N.B. reusing the same channel

The jhttp.Channel is somewhat limited in what it can do to report errors: It can't[1] block on sending the request, because that will not return until the whole request is complete. So instead it reports all errors via its Recv, with the side effect you noted for the Client.

Usually when a stream breaks, it's not recoverable—and the Client doesn't have any deeper knowledge of its Channel so it can't tell that in this one case it would be OK to keep trying. For a socket, however, that would lead to the client looping forever reading EOF or EBADF or whatever the underlying transport says when it's closed/invalid.

One way to work around this might be to wrap the client rather than the channel, e.g.,

// Warning: schematic, untested

type Client struct{
   ch *jhttp.Channel
   cli *jrpc2.Client
   opts *jrpc2.ClientOptions
}

func NewClient(ch *jhttp.Channel, opts *jrpc2.ClientOptions) *Client {
   return &Client{ch: ch, cli: jrpc2.NewClient(ch, opts), opts: opts}
}

func (c *Client) CallResult(ctx context.Context, method string, params, result any) error {
   for ctx.Err() == nil {
      err := c.cli.CallResult(ctx, method, params, result)
      if err == nil {
         return nil
      } else if !isRetryable(err) {
         return err
      }
      backoffForRetry()
      // Maybe also: limit retry count

      c.cli = jrpc2.NewClient(c.ch, c.opts)
   }
   return ctx.Err()
}

or words to that effect. (You might also need a lock if you want it to be usable concurrently, and so on).

[1]: Or at least, not using the standard library's HTTP client. And reimplementing all of HTTP is out of scope here 🙂

@2opremio
Copy link
Author

I see, thanks. I was expecting the client to work like Go's http client (which doesn't bail out if it cannot connect)

@creachadair
Copy link
Owner

I see, thanks. I was expecting the client to work like Go's http client (which doesn't bail out if it cannot connect)

Unfortunately not, the client assumes a channel that's already connected, so there's no separate dial phase the client could use to detect that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants