Description
Version: What redis-py and what redis version is the issue happening on?
redis-py 4.5.0
Platform: What platform / version? (For example Python 3.5.1 on Windows 7 / Ubuntu 15.10 / Azure)
Python 3.10
Description: Description of your issue, stack traces from errors and code that reproduces the issue
The `scan_iter` family of commands (`scan_iter`, `sscan_iter`, `hscan_iter`, `zscan_iter`) might give inconsistent results when the client is created using a connection pool and there are multiple concurrent requests.
Assume we have this setup:
- 2 replicas, host A and host B
- use `SentinelConnectionPool` to manage connections to the different servers
- 2 concurrent `scan_iter` commands, each of which will issue multiple `scan` commands. The `scan` commands issued by these `scan_iter` commands are labelled `scan (1)` and `scan (2)` below.
What might happen is:
1. `scan (1)` is issued
2. `scan (1)` gets a connection from the pool
   - The pool is empty so it creates a new connection
   - For the sentinel connection pool, creating a new connection means getting the next replica in the `connection_pool.rotate_slaves` rotation.
   - Since this can return any replica in the rotation, let's say it arbitrarily connects to host A
3. `scan (1)` is executed on host A
4. `scan (2)` is issued in the meantime
5. `scan (2)` gets a connection from the pool
   - The pool is empty (there was 1 connection created but it's still in use) so it creates a new connection
   - Get the next replica in the `connection_pool.rotate_slaves` rotation.
   - Since this can return any replica in the rotation, let's say it arbitrarily connects to host B
6. `scan (2)` is executed on host B
7. `scan (1)` is finished. The connection to host A is put back in the pool
8. `scan (2)` is finished. The connection to host B is put back in the pool
9. `scan (1)` gets a connection from the connection pool; it gets the connection to host B (since the connection pool will just `pop()` the last element from the available connections)
10. `scan (1)` is executed on host B
Step 9 is the bug. All `scan` commands coming from the same `scan_iter` command need to go to the same replica. This is because the 'state' of the `scan_iter` command is stored in the cursor, and different replicas may store keys in a different order.
Hence, if we use the cursor from host A to do a scan on host B, we'll get an inconsistent result.
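To illustrate why a cursor is only meaningful on the server that produced it, here is a deliberately simplified sketch. It assumes the cursor is a plain list offset, which is not how Redis implements cursors; `toy_scan` and `toy_scan_iter` are hypothetical helpers, and the key names are made up.

```python
def toy_scan(keys, cursor, count=2):
    """Return (next_cursor, batch); a next_cursor of 0 means the iteration is done."""
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, batch


def toy_scan_iter(hosts_per_call):
    """Run one iteration, sending the i-th scan call to hosts_per_call[i]."""
    cursor, seen = 0, []
    for keys in hosts_per_call:
        cursor, batch = toy_scan(keys, cursor)
        seen.extend(batch)
        if cursor == 0:
            break
    return seen


# Both toy replicas hold the same keys, but in a different order.
HOST_A = ["k1", "k2", "k3", "k4"]
HOST_B = ["k3", "k4", "k1", "k2"]

consistent = toy_scan_iter([HOST_A, HOST_A])  # every call hits host A
mixed = toy_scan_iter([HOST_A, HOST_B])       # second call hits host B

print(consistent)  # ['k1', 'k2', 'k3', 'k4']
print(mixed)       # ['k1', 'k2', 'k1', 'k2']
```

In the mixed run, `k3` and `k4` are never returned and `k1` and `k2` are returned twice, which is the kind of inconsistency described above.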
There are 3 different base implementations of a connection pool: `ConnectionPool`, `SentinelConnectionPool` and `BlockingConnectionPool`. All of them do something similar when getting a new connection from the pool: they create a 'dummy' connection object and call `connection.connect()`, which will actually connect to the intended replica.
There are 4 different implementations of a connection: `Connection`, `SSLConnection`, `SentinelManagedConnection`, and `SentinelManagedSSLConnection`.
- For `SentinelManagedConnection` and `SentinelManagedSSLConnection`, this is fixable by making `SentinelConnectionPool` maintain a mapping from an id of the scan iter command to the host it has previously issued commands to.
- For `Connection` and `SSLConnection`, `connection.connect()` will depend on the implementation of the connection class's `.connect`, but by default it will connect to the `self.host` and `self.port` of the connection.