scan_iter family commands give inconsistent results when using Sentinel connection pool

**Version**: What redis-py and what redis version is the issue happening on?
redis-py 4.5.0

**Platform**: What platform / version? (For example Python 3.5.1 on Windows 7 / Ubuntu 15.10 / Azure)
Python 3.10

**Description**: Description of your issue, stack traces from errors and code that reproduces the issue

The `scan_iter` family of commands (`scan_iter`, `sscan_iter`, `hscan_iter`, `zscan_iter`) may return inconsistent results when the client is created with a connection pool and there are multiple concurrent requests.

Assume we have this setup:
- 2 replicas, host A and host B
- `SentinelConnectionPool` manages the connections to the different servers
- 2 concurrent `scan_iter` calls, each of which issues multiple `scan` commands. The `scan` commands issued by these two `scan_iter` calls are labelled `scan (1)` and `scan (2)` below.

What might happen is: 
1. `scan (1)` is issued
2. `scan (1)` gets connection from the pool
     - The pool is empty so it creates a new connection
     - For sentinel connection pool, creating a new connection means getting the next replica in the `connection_pool.rotate_slaves` rotation. 
     - Since this can return any replica in the rotation, let's say it arbitrarily connects to host A
3. `scan (1)` is executed on host A
4. `scan (2)` is issued in the meantime
5. `scan (2)` gets connection from the pool
     - The pool is empty (1 connection was created, but it is still in use), so it creates a new connection
     - It gets the next replica in the `connection_pool.rotate_slaves` rotation.
     - Since this can return any replica in the rotation, let's say it arbitrarily connects to host B
6. `scan (2)` is executed on host B
7. `scan (1)` is finished. Connection to host A is put back to the pool
8. `scan (2)` is finished. Connection to host B is put back to the pool
9. `scan (1)` gets connection from connection pool, it gets the connection to host B (since connection pool will just `pop()` the last element from the available connections) 
10. `scan (1)` is executed on host B
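
The interleaving above can be replayed with a toy model. This is only a sketch: `ToyPool` and `replay_interleaving` are hypothetical names, not redis-py code, and a "connection" is represented by nothing more than the host name it points to. The only behaviors it assumes are the two described in the steps: an empty pool connects to the next replica in the rotation, and a non-empty pool hands back the most recently released connection.

```python
from itertools import cycle


class ToyPool:
    """Toy stand-in for a Sentinel-style pool (hypothetical, not redis-py)."""

    def __init__(self, hosts):
        self._rotation = cycle(hosts)  # mimics the replica rotation
        self._available = []           # idle connections, used as a LIFO stack

    def get_connection(self):
        if self._available:
            # Reuse the most recently released connection.
            return self._available.pop()
        # Pool is empty: "connect" to the next replica in the rotation.
        return next(self._rotation)

    def release(self, conn):
        self._available.append(conn)


def replay_interleaving():
    pool = ToyPool(["host A", "host B"])
    trace = []

    conn1 = pool.get_connection()      # steps 2-3: scan (1) runs on host A
    trace.append(("scan (1)", conn1))
    conn2 = pool.get_connection()      # steps 5-6: scan (2) runs on host B
    trace.append(("scan (2)", conn2))

    pool.release(conn1)                # step 7
    pool.release(conn2)                # step 8

    conn1 = pool.get_connection()      # step 9: pop() returns host B
    trace.append(("scan (1)", conn1))  # step 10
    return trace


if __name__ == "__main__":
    for label, host in replay_interleaving():
        print(label, "->", host)
```

In this model the second call of `scan (1)` ends up on host B even though its cursor was produced by host A.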

Step 9 is the bug. All `scan` commands coming from the same `scan_iter` call need to go to the same replica. This is because the 'state' of the `scan_iter` call is stored in the cursor, and different replicas will store keys in a different order.
Hence, **if we use the cursor from host A to do a scan on host B, we'll get an inconsistent result.**
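
To illustrate why a cursor is only meaningful on the host that produced it, here is a deliberately simplified sketch. It assumes a cursor is just an offset into a per-host key list, which is not how Redis encodes cursors; `toy_scan`, `scan_all`, and the key names are hypothetical. It only shows the general effect of applying one host's position to another host's ordering.

```python
def toy_scan(keys, cursor, count=2):
    """Return (next_cursor, batch); a next_cursor of 0 means the scan is done."""
    batch = keys[cursor:cursor + count]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, batch


# Same data on both replicas, stored in a different order (an assumption
# taken from the description above).
host_a = ["k1", "k2", "k3", "k4"]
host_b = ["k3", "k4", "k1", "k2"]


def scan_all(hosts_per_call):
    """Run one iteration, sending each successive scan call to the given host."""
    cursor, seen = 0, []
    for keys in hosts_per_call:
        cursor, batch = toy_scan(keys, cursor)
        seen.extend(batch)
        if cursor == 0:
            break
    return seen


consistent = scan_all([host_a, host_a])  # every call goes to host A
mixed = scan_all([host_a, host_b])       # second call lands on host B

if __name__ == "__main__":
    print("same host:  ", consistent)
    print("mixed hosts:", mixed)
```

In this toy model the mixed run returns `k1` and `k2` twice and never returns `k3` or `k4`, while the single-host run returns every key exactly once.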

There are 3 different base implementations of a connection pool: `ConnectionPool`, `SentinelConnectionPool` and `BlockingConnectionPool`. All of them do something similar when getting a new connection from the pool: the pool creates a 'dummy' connection object and calls `connection.connect()`, which actually connects to the intended replica.

There are 4 different implementations of a connection, `Connection`, `SSLConnection`, `SentinelManagedConnection`, and `SentinelManagedSSLConnection`. 
- For `SentinelManagedConnection` and `SentinelManagedSSLConnection`, this is fixable by having `SentinelConnectionPool` maintain a mapping from the id of each scan iter command to the host it has previously issued commands to.
- For `Connection` and `SSLConnection`, the behavior of `connection.connect()` depends on the connection class's `.connect` implementation, but by default it connects to the connection's `self.host` and `self.port`.
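
A rough sketch of the mapping idea for the Sentinel case, again as a toy model rather than a patch: `StickyToyPool`, the `iter_id` argument, and `forget` are hypothetical names and do not exist in redis-py, and a "connection" is still just a host name. The pool remembers which host each scan iteration first used and keeps routing that iteration to the same host.

```python
from itertools import cycle


class StickyToyPool:
    """Toy pool that pins each scan iteration to one host (hypothetical)."""

    def __init__(self, hosts):
        self._rotation = cycle(hosts)
        self._available = []   # idle connections, represented by host name
        self._iter_host = {}   # iter_id -> host used by that iteration

    def get_connection(self, iter_id=None):
        pinned = self._iter_host.get(iter_id)
        if pinned is not None:
            # Prefer an idle connection to the pinned host; otherwise
            # "open" a new connection to that same host.
            if pinned in self._available:
                self._available.remove(pinned)
            return pinned
        # No pin yet: behave like the plain pool.
        conn = self._available.pop() if self._available else next(self._rotation)
        if iter_id is not None:
            self._iter_host[iter_id] = conn
        return conn

    def release(self, conn):
        self._available.append(conn)

    def forget(self, iter_id):
        # Call once the iteration is finished (cursor returned to 0).
        self._iter_host.pop(iter_id, None)


def replay_with_pinning():
    pool = StickyToyPool(["host A", "host B"])
    first = pool.get_connection("iter-1")    # scan (1) -> host A
    second = pool.get_connection("iter-2")   # scan (2) -> host B
    pool.release(first)
    pool.release(second)
    again = pool.get_connection("iter-1")    # scan (1) stays on host A
    return first, second, again
```

With the same interleaving as in the steps above, the second call of `scan (1)` is routed back to host A. A real implementation would also need to decide when to drop a mapping (for example when the cursor returns to 0 or the iterator is abandoned) and how to handle a pinned replica that is no longer reachable.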
  
