-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Description
This is sort of related to #1236 that I filed earlier today, and might be related to #1120.
We had a machine that is running one of our redis cluster masters go down hard today, and while it was down we saw some interesting behavior from the jedis pool. We got a lot of errors like this:
redis.clients.jedis.exceptions.JedisClusterMaxRedirectionsException: Too many Cluster redirections?
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:97)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:152)
at redis.clients.jedis.JedisClusterCommand.runWithRetries(JedisClusterCommand.java:131)
at redis.clients.jedis.JedisClusterCommand.runBinary(JedisClusterCommand.java:59)
at redis.clients.jedis.BinaryJedisCluster.get(BinaryJedisCluster.java:98)
It looks like redis.clients.jedis.Connection.connect() is wrapping the java.net.ConnectException from the failed socket up in a JedisConnectionException, which gets wrapped again by redis.clients.util.Pool.getResource(), and because it's a JedisConnectionException, it triggers a retry.
I think this issue is maybe a little less obviously a bug than #1236, because I can see some uses where someone might want to retry immediately if a socket threw a ConnectException, but I think one could also make an argument that if a ConnectException is being thrown something is seriously wrong, and failing immediately seems reasonable. (If users were experiencing the ConnectException for nodes that weren't down, they could alway increase the connect timeout.) As it is, when a node goes down, users could potentially end up waiting a total of (maxRedirects * connectTimeout) ms, and still get an error, which is kind of unfortunate, especially when a regular cache hit is over in just a few ms. :)
Thoughts?
Redis / Jedis Configuration
Jedis version:
2.8.0
Redis version:
3.0.6
Java version:
JDK8