@walles
Contributor

@walles walles commented Jan 29, 2021

Before this change, if there were connection failures to the cluster, we did all our retries without any backoff.

With this change in place:

  • We first do the previous no-backoff tactic for one third of our maxAttempts (see the shouldBackOff() method)
  • Then we start backing off as determined by the getBackoffSleepMillis() method
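The two phases above can be sketched roughly as follows. This is only an illustration: the class name `BackoffSketch`, the constants, the method signatures, and the capped exponential policy with random jitter are assumptions made for the example, not the code from this PR. Only the method names `shouldBackOff` and `getBackoffSleepMillis` come from the description.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical sketch: class name, constants, signatures, and the exponential
// policy are assumptions, not the implementation from this PR.
final class BackoffSketch {
    static final long BASE_MILLIS = 10L;    // assumed ceiling for the first backoff sleep
    static final long MAX_MILLIS = 1_000L;  // assumed cap for any single sleep

    private BackoffSketch() {}

    // True once at least one third of maxAttempts has been used up.
    static boolean shouldBackOff(int attemptsLeft, int maxAttempts) {
        int attemptsUsed = maxAttempts - attemptsLeft;
        return attemptsUsed >= maxAttempts / 3;
    }

    // Upper bound for a sleep: doubles every backoff round, capped at MAX_MILLIS.
    static long maxBackoffMillis(int backoffRound) {
        int shift = Math.max(0, Math.min(backoffRound, 20)); // bounded shift, no overflow
        return Math.min(MAX_MILLIS, BASE_MILLIS << shift);
    }

    // 0 during the no-backoff phase, otherwise a random sleep up to the bound.
    static long getBackoffSleepMillis(int attemptsLeft, int maxAttempts) {
        if (!shouldBackOff(attemptsLeft, maxAttempts)) {
            return 0L;
        }
        int backoffRound = (maxAttempts - attemptsLeft) - maxAttempts / 3;
        return ThreadLocalRandom.current().nextLong(maxBackoffMillis(backoffRound) + 1);
    }
}
```

With `maxAttempts = 9` in this sketch, the first three attempts are retried immediately and later ones sleep for a random time below a growing bound. The jitter is there so that many clients failing at once do not all retry in lockstep.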

Additionally, this change adds unit tests for the retries / backoff logic.

This change is based on the changes in #2355 (approved, not yet merged, currently waiting for more reviewers).

Johan Walles and others added 5 commits January 25, 2021 09:11
No behavior changes, just a refactoring.

Changes:
* Replace recursion with a for loop
* Extract redirection handling into its own method
* Extract connection-failed handling into its own method

Note that `tryWithRandomNode` is gone: it was never `true`, so it and its
code didn't survive the refactoring.

Inspired by redis#1334 where this went real easy :).

Would have made redis#2355 shorter.

Free public updates for JDK 7 ended in 2015:
<https://en.wikipedia.org/wiki/Java_version_history>

For JDK 8, free public support is available from non-Oracle vendors until
at least 2026 according to the same table.

And JDK 8 is what Jedis is being tested on anyway:
<https://github.com/redis/jedis/blob/ac0969315655180c09b8139c16bded09c068d498/.circleci/config.yml#L67-L74>
@walles walles marked this pull request as draft January 29, 2021 07:51
@walles walles marked this pull request as ready for review February 1, 2021 12:58
@walles
Contributor Author

walles commented Feb 1, 2021

✅ 👀 Ready for review!

Contributor

@sazzad16 sazzad16 left a comment

This PR breaks backward compatibility, which means it won't be released until the next major release. As of this moment, the next major release for Jedis is 4.0.0 which, as you can imagine, is a long way away.

Try to find a backward compatible solution. Don't make the code too ugly for that purpose though :)

@walles
Contributor Author

walles commented Feb 2, 2021

Thank you for your short turnaround time in reviewing, I really appreciate that @sazzad16!

@walles
Contributor Author

walles commented Feb 2, 2021

> Try to find a backward compatible solution. Don't make the code too ugly for that purpose though :)

Another constructor is needed either way (I think).

But if #2364 were merged before this PR, that constructor could be made private and wouldn't have to clutter the public API.
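A minimal sketch of what "another constructor" could look like while staying backward compatible: the existing signature stays and delegates to a new overload. The class `ClusterCommandSketch`, its fields, and the parameter lists are hypothetical; only the `DEFAULT_TIMEOUT` value of 2000 ms is taken from this PR.

```java
import java.time.Duration;

// Hypothetical stand-in for the real command class; names and fields are assumptions.
final class ClusterCommandSketch {
    // Same value as the DEFAULT_TIMEOUT constant discussed in this PR.
    static final int DEFAULT_TIMEOUT = 2000;

    private final int maxAttempts;
    private final Duration maxTotalRetriesDuration;

    // Existing signature: old callers keep compiling and get a default duration.
    ClusterCommandSketch(int maxAttempts) {
        this(maxAttempts, Duration.ofMillis(DEFAULT_TIMEOUT));
    }

    // New overload: callers that care can bound the total time spent retrying.
    ClusterCommandSketch(int maxAttempts, Duration maxTotalRetriesDuration) {
        this.maxAttempts = maxAttempts;
        this.maxTotalRetriesDuration = maxTotalRetriesDuration;
    }

    int getMaxAttempts() {
        return maxAttempts;
    }

    Duration getMaxTotalRetriesDuration() {
        return maxTotalRetriesDuration;
    }
}
```

Because the old constructor only delegates, existing call sites keep compiling unchanged.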

/**
* Default timeout in milliseconds.
*/
public static final int DEFAULT_TIMEOUT = 2000;
Contributor Author

Making these public makes them reachable from JedisClusterCommand.java, which uses them for its default timeout.

* consider connection exceptions and disregard random nodes

* reset redirection
@yangbodong22011
Contributor

> I disagree. Firstly, it doesn't suit there. Secondly, when we'd try to improve this (targeting Jedis 4.0.0), this would mess the config interface and/or could be bottlenecked by it.

@sazzad16 Okay, if we have an improvement plan, then I agree to continue. But I still think the default value of maxTotalRetriesDuration should be maxAttempts * soTimeout, not soTimeout.
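As a rough illustration of the default proposed above, assuming soTimeout is given in milliseconds (the class and method names here are hypothetical, not Jedis API):

```java
import java.time.Duration;

// Hypothetical helper; not part of Jedis.
final class RetryDefaultsSketch {
    private RetryDefaultsSketch() {}

    // Proposed default: allow one full socket timeout per attempt.
    static Duration defaultMaxTotalRetriesDuration(int soTimeoutMillis, int maxAttempts) {
        // Widen to long before multiplying so large inputs cannot overflow int.
        return Duration.ofMillis((long) soTimeoutMillis * maxAttempts);
    }
}
```

With a 2000 ms socket timeout and 5 attempts this gives 10 seconds in total. With a default of just soTimeout, a single attempt that times out would already use up the whole retry budget.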

> It's just that those commands were implemented & merged after this PR is crafted and simple git merge doesn't add those. We'll always have time to add those.

This is the responsibility of this PR, and maxTotalRetriesDuration should be added to the new commands before it is merged.

@sazzad16
Contributor

@yangbodong22011

> the default value of maxTotalRetriesDuration should be: maxAttempts * soTimeout

agreed

> maxTotalRetriesDuration should be added to the new command before merged

We can do this after the PR is approved.

@sazzad16
Contributor

@gkorland @yangbodong22011 Please check #2490. Hopefully that PR addresses your concerns.

 Conflicts:
	src/main/java/redis/clients/jedis/BinaryJedisCluster.java
	src/main/java/redis/clients/jedis/JedisCluster.java
@sazzad16 sazzad16 merged commit 270bb71 into redis:master Mar 31, 2021
@walles walles deleted the j/backoff branch March 31, 2021 07:47
@walles
Contributor Author

walles commented Mar 31, 2021

🥳

@nitinware

@walles we are seeing a similar error with Jedis version 5.1.0. I see this PR fixed this issue; which Jedis version do we need to use to address the error?

Error:
org.springframework.dao.InvalidDataAccessApiUsageException: No more cluster attempts left.
	at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:67)
	at org.springframework.data.redis.connection.jedis.JedisExceptionConverter.convert(JedisExceptionConverter.java:42)
	at org.springframework.data.redis.PassThroughExceptionTranslationStrategy.translate(PassThroughExceptionTranslationStrategy.java:40)

@ggivo
Collaborator

ggivo commented Apr 10, 2025

Hi @nitinware

I took a brief look at the commit history, and it shows that this PR is already part of 5.1.0.

"No more cluster attempts left." is a pretty generic error, thrown when an error persists even after retries are exhausted. Looking at the Jedis code, the actual cause is stored as a suppressed exception inside JedisClusterOperationException.

I see you are using Spring Framework, which wraps the original Jedis exception. Somewhere down the stack there should probably be a JedisClusterOperationException with the actual error causing the failure among its suppressed exceptions.

Hope it helps
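
One way to follow the advice above is to walk the cause chain of the exception you caught and collect every suppressed exception along the way. The sketch below uses only `Throwable.getCause()` and `Throwable.getSuppressed()` from the JDK. The class name is hypothetical, and the exceptions in `main` are plain stand-ins for the Spring and Jedis exception types so that the example runs without either library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; works on any Throwable, no Spring or Jedis types needed.
final class SuppressedCauseSketch {
    private SuppressedCauseSketch() {}

    // Walks top -> cause -> cause ... and gathers the suppressed exceptions of each.
    static List<Throwable> collectSuppressed(Throwable top) {
        List<Throwable> found = new ArrayList<>();
        int depth = 0;
        // The depth limit guards against pathological cyclic cause chains.
        for (Throwable t = top; t != null && depth < 50; t = t.getCause(), depth++) {
            for (Throwable suppressed : t.getSuppressed()) {
                found.add(suppressed);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for the Jedis exception carrying the real failure as suppressed.
        RuntimeException jedisLike = new RuntimeException("No more cluster attempts left.");
        jedisLike.addSuppressed(new java.io.IOException("Connection refused"));
        // Stand-in for the wrapping exception thrown by the framework.
        RuntimeException wrapper = new RuntimeException("wrapped", jedisLike);

        for (Throwable suppressed : collectSuppressed(wrapper)) {
            System.out.println("Underlying failure: " + suppressed);
        }
    }
}
```

In application code, `collectSuppressed` would be called on the caught exception, and the results logged next to the generic message.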
