@kjnilsson kjnilsson commented Sep 17, 2021

Fixes #3445

If the queue is empty when a consumer is cancelled, the consumer id is
left inside the service queue. If an application subscribes/unsubscribes
in a loop from an empty queue, the service queue is never cleared up.

NB: whenever we make a change to how the quorum queue state machine is
calculated we need to consider how this affects determinism, as during an
upgrade different members may calculate a different service queue state.
In this case it should be ok as they will eventually converge on the same
state once all "dead" consumer ids have been removed from the queue.
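
To make the service queue problem concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not the quorum queue state machine: the `ToyQueue` class, its methods, the `remove_on_cancel` flag and the `ctag-N` consumer ids are all hypothetical names invented for this illustration. It only shows how "dead" ids pile up when nothing is ever checked out, and how two members running the old and new logic end up with the same service queue once a message is finally assigned.

```python
from collections import deque


class ToyQueue:
    """Toy model of a queue with a service queue of consumer ids.

    Hypothetical illustration only; not the quorum queue state machine.
    """

    def __init__(self, remove_on_cancel):
        self.remove_on_cancel = remove_on_cancel
        self.messages = deque()
        self.consumers = set()        # currently active consumer ids
        self.service_queue = deque()  # consumer ids waiting to be served
        self.delivered = []           # (consumer_id, message) pairs

    def subscribe(self, consumer_id):
        self.consumers.add(consumer_id)
        self.service_queue.append(consumer_id)
        self._checkout()

    def cancel(self, consumer_id):
        self.consumers.discard(consumer_id)
        if self.remove_on_cancel:
            # New behaviour: drop the id right away, even if the queue is empty.
            self.service_queue = deque(
                c for c in self.service_queue if c != consumer_id
            )

    def publish(self, message):
        self.messages.append(message)
        self._checkout()

    def _checkout(self):
        # Dead ids are only skipped lazily, when there is a message to assign.
        while self.messages and self.service_queue:
            consumer_id = self.service_queue.popleft()
            if consumer_id not in self.consumers:
                continue
            self.delivered.append((consumer_id, self.messages.popleft()))
            self.service_queue.append(consumer_id)


def subscribe_cancel_loop(queue, iterations):
    for i in range(iterations):
        consumer_id = f"ctag-{i}"
        queue.subscribe(consumer_id)
        queue.cancel(consumer_id)


old = ToyQueue(remove_on_cancel=False)  # member running the old logic
new = ToyQueue(remove_on_cancel=True)   # member running the new logic
for q in (old, new):
    subscribe_cancel_loop(q, 1000)
print(len(old.service_queue), len(new.service_queue))  # 1000 0

# Once a live consumer receives a message, the dead ids are drained
# and both members hold the same service queue again.
for q in (old, new):
    q.subscribe("live")
    q.publish("m1")
print(list(old.service_queue) == list(new.service_queue))  # True
```

In this toy the deliveries recorded by both members are identical, which matches the expectation that message assignment is unaffected even while the service queues temporarily differ.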

QQ: emit release cursors after consumer cancel, purge_nodes, garbage_collection and
update_config commands to ensure that repeated use of these commands
against an empty queue will not grow the log excessively.

In any case the service queue change should not affect how messages are
assigned to consumers.

If release cursors are not emitted after these commands, apps that
consume/cancel from empty queues in a loop will grow the Raft log in an
unbounded manner. This could also be the case for the garbage_collect
command. It should be rare that repeated use of these commands would grow
the Raft log excessively, but just in case we evaluate the release cursors
here so that if the queue is empty we may trigger a snapshot.
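
The effect of evaluating release cursors after these commands can be sketched with another toy model. Everything here is an assumption made for illustration: the `ToyLog` class, the `checkout` and `settle` command names, and the idea that only `settle` evaluated a cursor before the change are not taken from the source, and real snapshotting is considerably more involved. The sketch only shows why a log that never evaluates a cursor on cancel keeps every entry, while one that does can truncate whenever the queue is empty.

```python
# Commands after which the toy evaluates a release cursor.
# "settle" as the only pre-existing one is an assumption of this sketch.
CURSOR_COMMANDS_BEFORE = {"settle"}
CURSOR_COMMANDS_AFTER = CURSOR_COMMANDS_BEFORE | {
    "cancel", "purge_nodes", "garbage_collection", "update_config",
}


class ToyLog:
    """Toy replicated log; hypothetical, not the actual Raft log."""

    def __init__(self, cursor_commands):
        self.cursor_commands = cursor_commands
        self.entries = {}       # index -> command still held in the log
        self.next_index = 1
        self.message_count = 0  # messages currently in the toy queue

    def apply(self, command):
        index = self.next_index
        self.next_index += 1
        self.entries[index] = command
        if command == "enqueue":
            self.message_count += 1
        elif command == "settle" and self.message_count > 0:
            self.message_count -= 1
        # With an empty queue no earlier entry is needed to rebuild state,
        # so everything up to this index may be released.
        if command in self.cursor_commands and self.message_count == 0:
            self._release_up_to(index)

    def _release_up_to(self, index):
        self.entries = {i: c for i, c in self.entries.items() if i > index}


def consume_cancel_loop(log, iterations):
    for _ in range(iterations):
        log.apply("checkout")
        log.apply("cancel")


before = ToyLog(CURSOR_COMMANDS_BEFORE)
after = ToyLog(CURSOR_COMMANDS_AFTER)
for log in (before, after):
    consume_cancel_loop(log, 1000)
print(len(before.entries), len(after.entries))  # 2000 0
```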
@kjnilsson kjnilsson marked this pull request as ready for review September 20, 2021 11:30
@kjnilsson kjnilsson changed the title Qq consumer cancellation fixes Quorum Queue consumer cancellation fixes Sep 20, 2021
@kjnilsson
Contributor Author

Acceptance steps:

  1. configure a system with a very small WAL size (optional but speeds stuff up)
  2. subscribe / unsubscribe from an empty QQ in a loop and check that segments in the mnesia/quorum/NODE/queue directory are deleted periodically

@acogoluegnes acogoluegnes merged commit 9ea1a82 into master Sep 20, 2021
@acogoluegnes acogoluegnes deleted the qq-consumer-cancellation-fixes branch September 20, 2021 15:32
acogoluegnes added a commit that referenced this pull request Sep 21, 2021
Quorum Queue consumer cancellation fixes (backport #3448)
acogoluegnes added a commit that referenced this pull request Sep 21, 2021
Quorum Queue consumer cancellation fixes (backport #3448) (backport #3460)
@acogoluegnes acogoluegnes added this to the 3.9.7 milestone Sep 21, 2021
@acogoluegnes
Contributor

Backported to v3.9.x and v3.8.x.

Development

Successfully merging this pull request may close these issues.

Subscribing/unsubscribing from an empty Quorum Queue causes unbounded log growth.
