Improving client call options flexibility and replace retry interceptor #16216

CaojiamingAlan · 2023-07-11T01:09:38Z

What would you like to be added?

1. Expose more client call options

Most of the gRPC call options of clientV3 cannot be configured by users (while some of them can, like MaxCallSendMsgSize) , and results in issues like

We should provide users with more flexibility to make RPC calls.

2. Replace the customized retry interceptor with RetryPolicy

Old gRPC versions do not have retry options, and etcd client implemented retry through the retry interceptor. #12671 had replaced the balancer logic with the gRPC solution, and left a TODO here. Maybe we can finish this replacement now.

3. (Maybe) Set `defaultWaitForReady` to True

I'm not sure about this. This is motivated by #14631. Actually, by my experiments, not only NewSession() but common usages like Get() may hang: whenever the call is directed to the SIGSTOP-ed server (this always happens since it uses round-robin), it hangs indefinitely waiting for the server to be ready. Neither did I find anywhere configures a default timeout.

It seems the motivation to set WaitForReady to True is for

preventing retry requests from overwhelming the network grpcproxy: Disable fast fail on lease grant call to cluster #8122
minimizing client request error responses due to transient failures

(FYI WaitForReady is the opposite of FailFast, with the latter one being deprecated)

But if once we have the retry and WaitForReady configs exposed, we can change the default behavior to fail fast, and users may turn it on manually, or set a proper retry limit.

I do see caveats of clientV2 but I think we may try to solve it in V3.

Why is this needed?

To provide more flexibility and reduce code volume

The text was updated successfully, but these errors were encountered:

dusk125 · 2024-01-11T17:36:43Z

+1 for this. Something we see in OpenShift is that the API server can fail to retry on slower platforms (non-default LEADER_ELECTION_TIMEOUT & HEARTBEAT_INTERVAL) with an "etcdserver: leader changed" error. The server tries to set the retries to avoid this, but has them overwritten by the etcd retry.

Because of this, I would be happy to take on this work (specifically number 2 if the work needs to be broken up).

serathius · 2024-01-11T19:32:14Z

cc @ahrtr @serathius

ahrtr · 2024-01-11T20:02:17Z

There are 3 different feature requests, let's breakdown them into three different tickets.

For 1, I vaguely recall that someone raised a PR to resolve it about 1.5 years ago, and ptabor proposed a generic solution to let users configure all gRPC options. But I couldn't find the PR anymore. Anyway, it seems a valid feature.

For 2, Note both grpc/resolver and grpc/balancer are still experimental features/packages. We should try not to make any related change. But It's open to investigate and discuss.

There is a related issue #15145. But it should be fine because etcd's usege of the grpc-go's resolver package is really very simple and basic cc @dfawley

For 3, please add the context into #14631, and let's continue the discussion there.

CaojiamingAlan added the type/feature label Jul 11, 2023

CaojiamingAlan changed the title ~~Improving client~~ Improving client call options flexibility and replace retry interceptor Jul 11, 2023

jmhbnz added the area/clientv3 label Jul 27, 2023

chaochn47 mentioned this issue Oct 14, 2023

[3.5] Bump grpc-go to1.47 (and fix the connection-string format) #16625

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving client call options flexibility and replace retry interceptor #16216

Improving client call options flexibility and replace retry interceptor #16216

CaojiamingAlan commented Jul 11, 2023

dusk125 commented Jan 11, 2024 •

edited

Loading

serathius commented Jan 11, 2024

ahrtr commented Jan 11, 2024

Improving client call options flexibility and replace retry interceptor #16216

Improving client call options flexibility and replace retry interceptor #16216

Comments

CaojiamingAlan commented Jul 11, 2023

What would you like to be added?

1. Expose more client call options

2. Replace the customized retry interceptor with RetryPolicy

3. (Maybe) Set defaultWaitForReady to True

Why is this needed?

dusk125 commented Jan 11, 2024 • edited Loading

serathius commented Jan 11, 2024

ahrtr commented Jan 11, 2024

3. (Maybe) Set `defaultWaitForReady` to True

dusk125 commented Jan 11, 2024 •

edited

Loading