Consumer dispatcher improvements #997
10f7aec to d63ba22
d63ba22 to 14f7bf5
Rebased to master. This is ready for review.
1d67ae0 to 83a8e4a
What's left here to be merged? (#1009 should be merged first, I'll update this one afterwards)
lock (_consumers)
{
#if NETSTANDARD
    var consumer = _consumers[tag];
This line may cause the same exception as #1013 if the tag isn't in the dictionary.
Yes, but the question is whether this can ever happen. If the server sends a tag twice or sends a wrong one, then throwing an exception seems fine to me. (GetConsumerOrDefault, on the other hand, does need to care, as the protocol allows scenarios where this could happen.)
In #1012, the system tried to get a consumer that wasn't in the dictionary, and it caused chaos in the client. It may have been related to a network glitch with timeouts when unsubscribing a consumer, which caused the message to not be there or to be sent by the server multiple times. The dictionary exception kept recurring consistently and frequently. I haven't tracked down the root cause or spent much time trying to reproduce it, but I have hit that exception enough times in production that it got on my radar to fix.
I don't think we should throw an exception here since it isn't handled well further up the stack. Logging the error seems fairly reasonable though, and may help track down any bugs. Then we can ignore subsequent actions, or resort to some default predictable behaviour (like defaultConsumer).
This function looks like it gets called on shutdown, so handling the error gracefully is probably not a big problem. I'm not familiar enough with the full code base though to comment on this aspect authoritatively.
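A minimal sketch of the "log and ignore" option described above, assuming a plain dictionary guarded by a lock. `ConsumerRegistry`, `TryGet`, and the logging callback are hypothetical names for illustration, not the RabbitMQ.Client API; the point is only that `TryGetValue` avoids the `KeyNotFoundException` thrown by the indexer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry for illustration; not the actual RabbitMQ.Client types.
public sealed class ConsumerRegistry
{
    private readonly Dictionary<string, object> _consumers = new Dictionary<string, object>();
    private readonly Action<string> _log;

    public ConsumerRegistry(Action<string> log)
    {
        _log = log;
    }

    public void Add(string tag, object consumer)
    {
        lock (_consumers)
        {
            _consumers[tag] = consumer;
        }
    }

    // Returns false and logs a warning when the tag is unknown,
    // instead of throwing as _consumers[tag] would.
    public bool TryGet(string tag, out object consumer)
    {
        lock (_consumers)
        {
            if (_consumers.TryGetValue(tag, out consumer))
            {
                return true;
            }
        }

        _log("No consumer registered for tag '" + tag + "'; ignoring.");
        return false;
    }
}
```

The caller can then skip the operation when `TryGet` returns false, which keeps an unexpected tag from surfacing as an unhandled exception further up the stack.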
Thanks for the information. I overlooked that it happened on CancelOk.
Yes, for CancelOk some overlap is possible in certain unusual scenarios. Look at.
If for whatever reason the server sends some other reply instead of CancelOk, the consumer will be removed here and then again later when the actual CancelOk arrives. This is why my PR removes this double removal.
But what interests me more is the failure to handle the exception more gracefully, I'll take a look at it sometime.
AFAIK a protocol error (e.g. getting a consumer tag that doesn't exist) should result in an exception. @michaelklishin, is that right? Or how should this be handled?
The active consumer state is present in channel state on both sides, so there is a race condition between what the client and server do. E.g. you can call a `basic.cancel` and immediately remove the tag, only to get a `basic.cancel-ok` with it later. Both sides in this example do something that makes sense at first glance, but the relative timing is problematic.
In other clients we remove the consumer after receiving a `basic.cancel-ok`. If we don't have a consumer but received a consumer operation frame for it, some clients log a warning. Gracefully doing nothing would also be acceptable to me.
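As a rough illustration of the approach used by other clients, the sketch below removes a consumer only when `basic.cancel-ok` arrives and records a warning when the tag is already gone. `ConsumerTable`, `HandleCancelOk`, and the `Warnings` list are assumed names for this example, not part of RabbitMQ.Client.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical channel-side bookkeeping; names are illustrative only.
public sealed class ConsumerTable
{
    private readonly Dictionary<string, object> _consumers = new Dictionary<string, object>();

    public List<string> Warnings { get; } = new List<string>();

    public void Register(string tag, object consumer)
    {
        lock (_consumers)
        {
            _consumers[tag] = consumer;
        }
    }

    // Called when basic.cancel-ok arrives. This is the only place a tag is removed,
    // so sending basic.cancel does not race with a later cancel-ok.
    public bool HandleCancelOk(string tag)
    {
        bool removed;
        lock (_consumers)
        {
            // Dictionary.Remove returns false for an unknown key rather than throwing.
            removed = _consumers.Remove(tag);
        }

        if (!removed)
        {
            Warnings.Add("basic.cancel-ok for unknown consumer tag '" + tag + "'");
        }

        return removed;
    }
}
```

Because removal is idempotent here, a duplicated or late `basic.cancel-ok` produces a warning and nothing else.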
I've now implemented it in an ultra-safe manner, in which the consumer is:
- found via the tag; if not,
- DefaultConsumer is returned; if that is null,
- FallbackConsumer is returned, whose implementation just logs the call and continues.
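The lookup order listed above could be sketched roughly as follows. Only the names `DefaultConsumer`, `FallbackConsumer`, and `GetConsumerOrDefault` come from this thread; the `IConsumer` interface, the `Dispatcher` class, and the in-memory log are assumptions made to keep the example self-contained, and the PR's actual implementation may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal consumer contract for this sketch.
public interface IConsumer
{
    void HandleDelivery(string tag);
}

// Last-resort consumer: records the call and otherwise does nothing.
public sealed class FallbackConsumer : IConsumer
{
    public List<string> Log { get; } = new List<string>();

    public void HandleDelivery(string tag)
    {
        Log.Add("Unhandled delivery for consumer tag '" + tag + "'");
    }
}

public sealed class Dispatcher
{
    private readonly Dictionary<string, IConsumer> _consumers = new Dictionary<string, IConsumer>();

    public FallbackConsumer Fallback { get; } = new FallbackConsumer();

    // Optional; may be left null.
    public IConsumer DefaultConsumer { get; set; }

    public void Add(string tag, IConsumer consumer)
    {
        lock (_consumers)
        {
            _consumers[tag] = consumer;
        }
    }

    // 1. consumer registered under the tag, 2. DefaultConsumer, 3. FallbackConsumer.
    public IConsumer GetConsumerOrDefault(string tag)
    {
        lock (_consumers)
        {
            if (_consumers.TryGetValue(tag, out var consumer))
            {
                return consumer;
            }
        }

        return DefaultConsumer ?? Fallback;
    }
}
```

With this shape the lookup never returns null and never throws for an unknown tag, so callers can dispatch unconditionally.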
@LamarLugli Would you agree that #1012 / #1013 is still fixed with the current proposed implementation?
dda2352 to f9e049f
This now has some conflicts. @bollhals please rebase 🙏
f9e049f to e63f7f5
Done.
Any final comments on this or can we merge it?
projects/RabbitMQ.Client/client/impl/ConsumerDispatching/ConsumerDispatcher.cs
@michaelklishin Can we finish this? I have another change coming that depends on this, and there seem to be no open points left.
e63f7f5 to a12da06
@bollhals sorry, I thought there was more to do. I can begin QA'ing this tomorrow.
Thanks. 👍
Proposed Changes
some simplifications / improvements around consumer dispatcher
Further Comments
Before: [image]
After: [image]