Retry with backoff on cluster connection failures #2358

walles · 2021-01-29T06:00:13Z

Before this change, if there were connection failures to the cluster, we did all our retries without any backoff.

With this change in place:

* We first do the previous no-backoff tactic for one third of our maxAttempts (see the shouldBackOff() method).
* Then we start backing off as determined by the getBackoffSleepMillis() method.
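
For readers skimming the thread, the two-phase strategy described above can be sketched roughly as follows. This is a minimal, hypothetical sketch and not the code in this PR: the class name `RetryWithBackoffSketch`, the `run` helper, the parameter lists of `shouldBackOff` and `getBackoffSleepMillis`, and the randomized sleep formula are all assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical illustration of "retry immediately first, then back off". */
final class RetryWithBackoffSketch {

  /** True once more than one third of maxAttempts has already been used. */
  static boolean shouldBackOff(int attemptsLeft, int maxAttempts) {
    int attemptsMade = maxAttempts - attemptsLeft;
    return attemptsMade > maxAttempts / 3;
  }

  /**
   * Assumed policy: sleep a random time, bounded so that the remaining
   * attempts still fit into the remaining time budget.
   */
  static long getBackoffSleepMillis(int attemptsLeft, long millisLeft) {
    if (attemptsLeft <= 0 || millisLeft <= 0) {
      return 0;
    }
    long maxSleep = millisLeft / ((long) attemptsLeft * attemptsLeft);
    return ThreadLocalRandom.current().nextLong(maxSleep + 1);
  }

  /** Runs the command with a for loop; any exception stands in for a connection failure. */
  static <T> T run(Callable<T> command, int maxAttempts, long maxTotalMillis) throws Exception {
    long deadlineNanos = System.nanoTime() + maxTotalMillis * 1_000_000L;
    Exception lastFailure = null;
    for (int attemptsLeft = maxAttempts; attemptsLeft > 0; attemptsLeft--) {
      try {
        return command.call();
      } catch (Exception e) {
        lastFailure = e;
      }
      int remaining = attemptsLeft - 1;
      if (remaining > 0 && shouldBackOff(remaining, maxAttempts)) {
        long millisLeft = (deadlineNanos - System.nanoTime()) / 1_000_000L;
        Thread.sleep(getBackoffSleepMillis(remaining, millisLeft));
      }
    }
    throw new IllegalStateException("No more attempts left", lastFailure);
  }
}
```

With `maxAttempts = 9` in this sketch, the first three failures are retried immediately and only later failures sleep before the next attempt.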

Additionally, this change adds unit tests for the retries / backoff logic.

This change is based on the changes in #2355 (approved, not yet merged, currently waiting for more reviewers).

No behavior changes, just a refactoring. Changes:

* Replace recursion with a for loop
* Extract redirection handling into its own method
* Extract connection-failed handling into its own method

Note that `tryWithRandomNode` is gone; it was never `true`, so it and its code didn't survive the refactoring.

Inspired by redis#1334 where this went real easy :). Would have made redis#2355 shorter.

Free public updates for JDK 7 ended in 2015: <https://en.wikipedia.org/wiki/Java_version_history>

For JDK 8, free public support is available from non-Oracle vendors until at least 2026 according to the same table. And JDK 8 is what Jedis is being tested on anyway: <https://github.com/redis/jedis/blob/ac0969315655180c09b8139c16bded09c068d498/.circleci/config.yml#L67-L74>

walles · 2021-02-01T12:59:45Z

✅ 👀 Ready for review!

sazzad16

This PR breaks backward compatibility, which means it won't be released until the next major release. As of this moment, the next major release for Jedis is 4.0.0 which, as you can imagine, is a long way away.

Try to find a backward compatible solution. Don't make the code too ugly for that purpose though :)

src/main/java/redis/clients/jedis/JedisClusterCommand.java

src/main/java/redis/clients/jedis/BinaryJedisCluster.java

walles · 2021-02-02T13:15:50Z

Thank you for your short turnaround time in reviewing, I really appreciate that @sazzad16!

walles · 2021-02-02T13:36:51Z

> Try to find a backward compatible solution. Don't make the code too ugly for that purpose though :)

Another constructor is needed either way (I think).

But if #2364 would get merged before this PR, that constructor could be made private and wouldn't have to clutter the public API.

walles · 2021-02-02T13:42:35Z

src/main/java/redis/clients/jedis/BinaryJedisCluster.java

+  /**
+   * Default timeout in milliseconds.
+   */
+  public static final int DEFAULT_TIMEOUT = 2000;


public makes these reachable from JedisClusterCommand.java for its default timeout.

src/main/java/redis/clients/jedis/JedisClusterCommand.java

* consider connection exceptions and disregard random nodes
* reset redirection

yangbodong22011 · 2021-03-29T03:17:49Z

> I disagree. Firstly, it doesn't suit there. Secondly, when we'd try to improve this (targeting Jedis 4.0.0), this would mess the config interface and/or could be bottlenecked by it.

@sazzad16 Okay, if we have an improvement plan, then I agree to continue, but I still think the default value of maxTotalRetriesDuration should be maxAttempts * soTimeout, not just soTimeout.

> It's just that those commands were implemented & merged after this PR is crafted and simple git merge doesn't add those. We'll always have time to add those.

This is the responsibility of this PR, and maxTotalRetriesDuration should be added to the new command before merged.

sazzad16 · 2021-03-29T04:08:26Z

@yangbodong22011

> the default value of maxTotalRetriesDuration should be: maxAttempts * soTimeout

agreed

> maxTotalRetriesDuration should be added to the new command before merged

We can do this after the PR is approved.
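
To make the agreed default concrete, here is a small sketch of the arithmetic. The class and method names are hypothetical and not part of Jedis; only the `maxAttempts * soTimeout` rule and the 2000 ms `DEFAULT_TIMEOUT` come from this thread, and the attempt count of 5 is just an example value.

```java
import java.time.Duration;

/** Hypothetical helper; illustrates the proposed default only. */
final class RetryDefaultsSketch {

  /** Proposed default: give every attempt a full socket timeout. */
  static Duration defaultMaxTotalRetriesDuration(int soTimeoutMillis, int maxAttempts) {
    // Widen to long before multiplying to avoid int overflow.
    return Duration.ofMillis((long) soTimeoutMillis * maxAttempts);
  }

  public static void main(String[] args) {
    // Example values: 2000 ms socket timeout and 5 attempts.
    System.out.println(defaultMaxTotalRetriesDuration(2000, 5));
  }
}
```

With a default equal to soTimeout alone, a single slow attempt could use up the whole retry budget, which is the concern raised above.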

src/main/java/redis/clients/jedis/JedisClusterCommand.java

sazzad16 · 2021-03-29T16:52:13Z

@gkorland @yangbodong22011 Please check #2490. Hopefully that PR addresses your concerns.

Conflicts: src/main/java/redis/clients/jedis/BinaryJedisCluster.java src/main/java/redis/clients/jedis/JedisCluster.java

walles · 2021-03-31T07:58:34Z

🥳

nitinware · 2025-04-09T16:33:13Z

@walles we are seeing a similar error with Jedis version 5.1.0. I see this PR fixed this issue; which Jedis version do we need to use to address the error?

Error:
org.springframework.dao.InvalidDataAccessApiUsageException: No more cluster attempts left.
	at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:67)
	at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:42)
	at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:40)

ggivo · 2025-04-10T07:00:36Z

Hi @nitinware

I took a brief look at the commit history, and it shows that this PR is already part of 5.1.0.

"No more cluster attempts left." is a pretty generic error, thrown when an error persists even after the retries are exhausted. Looking at the Jedis code, the actual cause is stored as a suppressed exception inside JedisClusterOperationException.

I see you are using the Spring Framework, which wraps the original Jedis exception. Somewhere down the stack there should be a JedisClusterOperationException with the actual error that caused the failure among its suppressed exceptions.
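
As a rough way to dig those out, the sketch below walks the cause chain of any exception and collects whatever is attached as suppressed. It uses only standard `Throwable` methods; the class and method names are hypothetical, and it assumes nothing about Spring or Jedis beyond what is described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical debugging helper built on plain java.lang.Throwable. */
final class SuppressedCauses {

  /** Collects the suppressed exceptions of top and of every cause beneath it. */
  static List<Throwable> findSuppressed(Throwable top) {
    List<Throwable> found = new ArrayList<>();
    int depth = 0;
    // The depth limit guards against an unexpectedly cyclic cause chain.
    for (Throwable t = top; t != null && depth < 100; t = t.getCause(), depth++) {
      found.addAll(Arrays.asList(t.getSuppressed()));
    }
    return found;
  }
}
```

Calling `SuppressedCauses.findSuppressed(e)` on the caught exception and logging each element should show the underlying connection errors, if any were recorded.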

Hope it helps

Johan Walles and others added 5 commits January 25, 2021 09:11

Drop redundant null check

5a4fdbd

Replace ConnectionGetters with lambdas

cdf56b2

Retrigger CI

d99ef7b

walles marked this pull request as draft January 29, 2021 07:51

Johan Walles added 4 commits February 1, 2021 13:45

Add backoff to Redis connections

8978ca5

Add unit tests for backoff logic

85fa21c

Add retries logging

f8d09c2

Always use the user requested timeout

9c7ef1d

walles force-pushed the j/backoff branch from 8362e90 to 9c7ef1d Compare February 1, 2021 12:52

walles marked this pull request as ready for review February 1, 2021 12:58

sazzad16 suggested changes Feb 1, 2021

src/main/java/redis/clients/jedis/JedisClusterCommand.java

src/main/java/redis/clients/jedis/BinaryJedisCluster.java

walles mentioned this pull request Feb 2, 2021

jedis-3.2.0 JedisClusterMaxAttemptsException connect to redis-5.0.7 cluster #2130

Closed

Remedy review feedback

9bce8eb

walles force-pushed the j/backoff branch from fc2349f to 9bce8eb Compare February 2, 2021 13:39

walles commented Feb 2, 2021

walles requested a review from sazzad16 February 2, 2021 13:45

walles mentioned this pull request Feb 2, 2021

Add logging for cluster retries logic #2350

Closed

sazzad16 suggested changes Feb 3, 2021

src/main/java/redis/clients/jedis/JedisClusterCommand.java

This was referenced Feb 3, 2021

Kindly document the Jedis fork buildfarm/buildfarm#664

Closed

redis.clients.jedis.exceptions.JedisConnectionException: java.net.ConnectException: Connection refused #2345

Closed

Consider connection exceptions and disregard random nodes

67a062a

* consider connection exceptions and disregard random nodes
* reset redirection

Merge branch 'master' into j/backoff

7aa0b74

sazzad16 dismissed their stale review via 7aa0b74 March 29, 2021 03:48

sazzad16 reviewed Mar 29, 2021

src/main/java/redis/clients/jedis/JedisClusterCommand.java

sazzad16 reviewed Mar 29, 2021

src/main/java/redis/clients/jedis/JedisClusterCommand.java

Use maxAttempts

0ef36d3

sazzad16 reviewed Mar 29, 2021

src/main/java/redis/clients/jedis/JedisClusterCommand.java

sazzad16 added 6 commits March 29, 2021 11:52

format import

25303b7

Re-add missing codes due to merge

9e3fbcc

avoid NPE while zero max attempts

882dd49

Remove zero attempts test

7430b9b

More cluster constructors and customizability

9eb8d58

Use maxTotalRetriesDuration everywhere

27bce50

sazzad16 force-pushed the j/backoff branch from 7430b9b to ddd4038 Compare March 29, 2021 16:46

sazzad16 mentioned this pull request Mar 29, 2021

Retry with backoff on cluster connection failures (II) #2490

Closed

sazzad16 added 2 commits March 31, 2021 07:53

Merge remote-tracking branch 'redis/master' into j/backoff

b900a87

Conflicts: src/main/java/redis/clients/jedis/BinaryJedisCluster.java src/main/java/redis/clients/jedis/JedisCluster.java

more missing maxTotalRetriesDuration after merge

4501b0d

sazzad16 added ready to merge and removed wait for more reviews labels Mar 31, 2021

sazzad16 merged commit 270bb71 into redis:master Mar 31, 2021

sazzad16 removed the ready to merge label Mar 31, 2021

walles deleted the j/backoff branch March 31, 2021 07:47

This was referenced Apr 6, 2021

[Ehancement]Try run command again after renew the redis cluster slot cache #2195

Closed

Call renew slots before final retry #1443

Closed

joshua5201 mentioned this pull request Aug 25, 2022

Add support for Jedis maxTotalRetriesDuration in config spring-projects/spring-data-redis#2389

Closed

4 tasks
