This repository has been archived by the owner on Jun 21, 2023. It is now read-only.

In worker, retry forever with an exponential back off when Redis interactions time out #22

Open
wants to merge 13 commits into
base: github

nathansobo commented Nov 7, 2019

Currently, when we time out talking to Redis, we reconnect and retry the operation. This works for a fail-over scenario where the Redis server has moved to a new host. But when Redis is still available and merely overwhelmed with load, repeatedly reconnecting and retrying operations can make the situation worse.

In this PR, I introduce Worker#with_exponential_backoff and use it in the Worker instead of with_retries.

  • When retrying, back off exponentially by powers of 2, up to a maximum of 60 seconds, with 5 seconds of random jitter.
  • Continue retrying until the worker is explicitly shut down. This avoids a scenario where the worker process dies after N attempts only to be restarted by Resqued; a restarted process would begin retrying again at the faster initial rate. Retrying forever ensures that we keep retrying at a reduced frequency until Redis recovers.
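
To make the two points above concrete, here is a minimal sketch of the idea, not the code in this PR. Only the name `with_exponential_backoff` comes from the description; the class `BackoffSketch`, its `TimeoutError`, the injectable `sleeper` and `rng`, and the choice to add the jitter on top of the 60-second cap are all assumptions made for illustration.

```ruby
# Hypothetical sketch of retry-forever with exponential backoff.
# Only the method name with_exponential_backoff comes from the PR text.
class BackoffSketch
  MAX_DELAY  = 60 # seconds; cap on the exponential part
  MAX_JITTER = 5  # seconds; upper bound of the random jitter

  # Stand-in for whatever timeout error the Redis client raises (assumption).
  class TimeoutError < StandardError; end

  # sleeper and rng are injectable so the delays can be inspected in tests.
  def initialize(sleeper: ->(seconds) { sleep(seconds) }, rng: Random.new)
    @sleeper  = sleeper
    @rng      = rng
    @shutdown = false
  end

  def shutdown!
    @shutdown = true
  end

  # Delay before retry number `attempt` (1-based): 2, 4, 8, 16, 32, then 60,
  # plus a random amount in [0, MAX_JITTER). The exponent is clamped so the
  # power never grows without bound.
  def delay_for(attempt)
    base = [2**[attempt, 6].min, MAX_DELAY].min
    base + @rng.rand * MAX_JITTER
  end

  # Runs the block, retrying on TimeoutError until it succeeds or the
  # worker has been asked to shut down.
  def with_exponential_backoff
    attempt = 0
    begin
      yield
    rescue TimeoutError
      return nil if @shutdown
      attempt += 1
      @sleeper.call(delay_for(attempt))
      retry
    end
  end
end
```

Because the attempt counter lives inside the call rather than in process state, a process restart would reset it to zero, which is the reason given above for retrying forever instead of exiting after N attempts.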

I limit these changes to the worker because backing off and retrying forever in Unicorn processes when enqueuing jobs could cause request timeouts.

I also change the behavior of with_retries slightly so that each attempt to reconnect also counts as a retry attempt. The existing logic can end up trying to reconnect up to 9 times in certain scenarios.
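
As a rough illustration of what "reconnects count as attempts" could look like, the sketch below shares one attempt budget between the operation and the reconnect. It is not the actual `with_retries` implementation: the module `RetrySketch`, its `TimeoutError`, and the `max_attempts` and `reconnect` parameters are hypothetical.

```ruby
# Hypothetical sketch: reconnecting and the operation share one attempt budget.
module RetrySketch
  # Stand-in for the Redis client's timeout error (assumption).
  class TimeoutError < StandardError; end

  # Runs the block at most max_attempts times in total. After a timeout,
  # the next attempt starts with a reconnect; if the reconnect itself
  # times out, that attempt is used up as well.
  def self.with_retries(reconnect:, max_attempts: 3)
    attempts = 0
    needs_reconnect = false
    begin
      attempts += 1
      reconnect.call if needs_reconnect
      needs_reconnect = false
      yield
    rescue TimeoutError
      raise if attempts >= max_attempts
      needs_reconnect = true
      retry
    end
  end
end
```

With a single counter, a call makes at most `max_attempts - 1` reconnects, instead of nesting a reconnect loop inside each retry.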

dbussink commented Nov 7, 2019

Sorry, I missed this PR when opening #23, and after @nronas approved it I merged it before seeing this change.

Feel free to incorporate some of the further changes here though, #23 was aiming at the most minimal fix I could come up with.

nathansobo changed the title from "Avoid infinite loop in retry logic when exceptions occur talking to Redis" to "In worker, retry forever with an exponential back off when Redis interactions time out" Nov 7, 2019
@nathansobo nathansobo marked this pull request as ready for review November 7, 2019 16:56
3 participants