
Snowball effect with reconnecting to poor performing node #1252

@Spikhalskiy

Description


We have a problem with JedisPool that aggravates performance issues of Redis nodes just as those issues start to appear.

What we have:

  • Short timeouts, e.g. 3 ms.
  • A significantly loaded Redis that sometimes starts to respond slowly, with on the order of 40,000 connections per node.

If Redis starts to get stuck for 4 ms, then instead of a plain read, every operation becomes:

  • a read that times out
  • after the timeout marks the Jedis instance as broken, JedisFactory gently sends QUIT to Redis in destroyObject
  • we establish a new connection
  • a PING-PONG validation round trip

and only after all that do we get a new Jedis instance for the next read. But... actually nothing has changed; we could just have kept using the old instance.
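The per-timeout cost of the cycle above can be sketched roughly as follows. All timings here are hypothetical illustrations of the shape of the problem, not measured Jedis figures:

```java
// Illustrative cost model of the destroy/reconnect cycle described above.
// Every number is a made-up round-trip estimate, not a Jedis measurement.
public class RecycleCost {
    public static void main(String[] args) {
        double readTimeoutMs = 3.0;  // the configured socket timeout
        double quitMs        = 4.0;  // QUIT sent to an already-slow node
        double tcpConnectMs  = 1.0;  // new TCP handshake
        double pingPongMs    = 4.0;  // validation round trip, also slow

        // Current flow: every stall costs the timeout plus a full
        // destroy / reconnect / validate cycle before the next read.
        double recycle = readTimeoutMs + quitMs + tcpConnectMs + pingPongMs;

        // "Give it a chance" flow: timeout, then a single PING on the
        // same socket, then retry the read on the still-open connection.
        double reuse = readTimeoutMs + pingPongMs;

        System.out.println("recycle=" + recycle + "ms reuse=" + reuse + "ms");
    }
}
```

Even with these toy numbers, the recycle path pays for a QUIT and a reconnect on top of the validation that the reuse path already does, and it pays them against a node that is slow precisely when the extra traffic hurts most.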

So, when our Redis Cluster starts to experience performance issues, we finish it off by invalidating Jedis instances.

Any thoughts?
The only one from me: maybe we could add the ability to pass some kind of "InvalidationStrategy" to Jedis? The default strategy would mark the connection as broken and do everything as it does now, while third parties could implement their own strategy, for example one that sends a PING-PONG before QUIT. "Read with timeout, then PING-PONG to give it a chance, then read again" looks better than the current mandatory invalidation flow.
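Such a hook could look something like the sketch below. Every name here (`InvalidationStrategy`, `PooledConnection`, `shouldInvalidate`) is hypothetical and chosen for illustration; none of it is existing Jedis API:

```java
// Hypothetical sketch of a pluggable invalidation strategy.
// All interfaces and names are illustrative, not actual Jedis API.
public class InvalidationDemo {

    /** Minimal stand-in for a pooled Jedis connection. */
    public interface PooledConnection {
        boolean ping(); // true if the node answers PONG within the timeout
    }

    /** Decides whether a connection that just timed out should be discarded. */
    public interface InvalidationStrategy {
        boolean shouldInvalidate(PooledConnection conn);
    }

    /** Default behavior: any timeout marks the connection broken. */
    public static final InvalidationStrategy ALWAYS = conn -> true;

    /** "Give it a chance": PING first, discard only if the PING also fails. */
    public static final InvalidationStrategy PING_FIRST = conn -> !conn.ping();

    public static void main(String[] args) {
        PooledConnection slowButAlive = () -> true;  // node is slow, not dead
        PooledConnection dead         = () -> false;

        // Default strategy throws away even a live connection.
        System.out.println(ALWAYS.shouldInvalidate(slowButAlive));     // true
        // PING-first keeps the live connection, drops the dead one.
        System.out.println(PING_FIRST.shouldInvalidate(slowButAlive)); // false
        System.out.println(PING_FIRST.shouldInvalidate(dead));         // true
    }
}
```

The point of the interface is that the pool consults the strategy in the spot where it currently marks the instance broken unconditionally, so the default keeps today's behavior and nobody's code changes unless they opt in.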

I could implement and provide a PR for any solution that fixes this, or that at least makes it possible to improve the current standard flow.

What do you think?
