drbd: allow application IO concurrently with resync requests sent to same peer
Since the previous commit ("drbd: respect peer writes held due to
conflicts when barrier received"), the receiver also waits for requests
that are held due to conflicts when a barrier is received. If these
conflicts are due to resync requests towards the same peer, then they
will only be resolved once the resync reply has been received. This
causes a deadlock. Avoid this by allowing application IO to be submitted
while there are conflicting resync requests pending, but only when they
relate to the same peer.
Implement this by adding "sent" and "received" flags for the intervals.
Resync requests that we initiate pass through two conflict resolution
phases. The first one is for waiting until the request can be sent. This
involves waiting until there are no conflicting application writes from
other peers. The second one occurs when we receive the reply and has the
purpose of ensuring we do not submit conflicting writes.
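The two phases and the same-peer exception could be sketched roughly as
follows. This is a simplified model, not the DRBD code: the struct layout
and the names overlaps(), resync_may_send(), app_write_must_wait() and
resync_may_submit() are hypothetical, and the phase 2 check assumes that
any overlapping application write blocks submission.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the DRBD structures. */
struct interval {
	unsigned long start;	/* first sector covered */
	unsigned long size;	/* number of sectors covered */
	int peer;		/* peer the request was sent to or came from */
	bool is_resync;		/* resync request we initiated, else app write */
	bool sent;		/* resync request has been sent to the peer */
	bool received;		/* resync reply has been received */
};

static inline bool overlaps(const struct interval *a, const struct interval *b)
{
	return a->start < b->start + b->size && b->start < a->start + a->size;
}

/* Phase 1: the resync request may be sent once no application write
 * from another peer overlaps it. */
static inline bool resync_may_send(const struct interval *rs,
				   const struct interval *writes, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!writes[i].is_resync && writes[i].peer != rs->peer &&
		    overlaps(rs, &writes[i]))
			return false;
	return true;
}

/* An application write waits for a conflicting resync request unless the
 * request is in flight (sent, reply not yet received) to the same peer. */
static inline bool app_write_must_wait(const struct interval *w,
				       const struct interval *rs)
{
	if (!overlaps(w, rs))
		return false;
	if (rs->sent && !rs->received && rs->peer == w->peer)
		return false;
	return true;
}

/* Phase 2 (assumption): once the reply has arrived, the resync write is
 * submitted only when no application write overlaps it. */
static inline bool resync_may_submit(const struct interval *rs,
				     const struct interval *writes, size_t n)
{
	if (!rs->received)
		return false;
	for (size_t i = 0; i < n; i++)
		if (!writes[i].is_resync && overlaps(rs, &writes[i]))
			return false;
	return true;
}
```

In this model, a peer write from node 1 that overlaps a resync request
already sent to node 1 is not held back, so the barrier wait described
above can complete, while an overlapping write from node 2 still waits.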
This change has similarities to
    4dc38cd drbd: Break resync deadlock
which was reverted because a different solution was required to avoid a
distributed deadlock with 3 nodes:
    c0cd45a drbd: Break distributed deadlock in request processing
    ddc742b drbd: drop special-casing peer_device in al_begin_io_for_peer
However, this change is distinctly different, and this time the approach
is valid. The key difference is that resync requests are no longer
blocked by an active activity log extent. This means that they only wait
for local disk writes to complete, and not for a peer ack, which is a
distributed event.
It is no longer necessary to include a special case for compatibility in
receive_Data(). This approach removes the exclusivity of peer writes
with resync requests that we have sent and for which we are still
awaiting the reply. In fact, the previous solution was incomplete,
because it still held peer writes back when they conflicted with resync
requests that had been sent and were pending a reply. This approach
handles that scenario as well.