-
Notifications
You must be signed in to change notification settings - Fork 637
Inconsistent behavior when using replication #778
Conversation
mdprotacio
commented
Jun 3, 2021
•
edited
Loading
edited
- channel is removed from set when any of the user unsubscribed without considering that other users may still be connected to the channel
- channel may be non existent on server A when client is served through server B, but message is received through pusher trigger on server A
- redundant replicate on call to broadcast on channel, when prior calling broadcast, there is already a call to broadcastAcrossServers
- if continuously we receive events/message for 2 minutes, ping is never called from pusherjs which renders the connections stale as the lastPongedAt never gets updated. This renders the connection being marked as obsoleted and removed.
…so it can propagate message across instances
…ross server prior the call
This reverts commit d255a4e.
…id cleaning up when no ping was called due to continuous stream of events
…r active connections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please test and ensures that when you disconnect your web socket connections, connections are actually removed.
and also ensure that connections are not being piled up on every time you disconnect connection and resubscribe to presence channel.
Tested the branch and some sockets connections just piling up.
Maybe its already an existing issue,
im not sure but it will be better you try to reproduce it.
It does seem that the users key has multiple same user values as the key used is the socketId. It could be possible that you have multiple tabs. Say for user_id 4004, you have initially 3 tabs opened and on the next screen you have 4 tabs opened? The problem that I've been having trouble with when doing and testing the code changes is on the issue of triggered pusher event from backend when I had to mark the connections as ponged when they receive messages, otherwise the pusher javascript will not issue a ping when there is a continuous stream of events getting in. Initially, the removeObsoleteConnections will clean the connections if not ponged after 120 seconds and when continuous stream of events are being received, the client will not issue a ping and will get disconnected. Since I've marked them now as ponged when I received a message on a connection to keep the connection, when that happens, how can I know if the client is actually still connected and may eventually issue a ping? This is where we may actually be having stale connections that are already disconnected and is not being removed. I'd rather have that for the time being instead of having to aggressively remove connections that are not being ponged due to continuous event stream which results to clients being not notified of the events when they are actually still connected. |
Nice!
As you said connections will be removed after 120 secs.
If i may understand ur quotes. you meant you solved a Or maybe i didn't get you correctly. |
Yes. There appears to be still some issues on stale connections specifically on presence channels. We have deployed these changes on production and there seems to be some inconsistencies on unsubscribe. Sometimes, we have say 50 connected users shown but when we refresh the page it was actually just around 30. It appears that there are times that member_removed is not being broadcasted across all instances. I’ll look onto it further and hopefully can get it fixed. Perhaps on a different PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested this branch for some minutes or more, so far so good.
All seems to be working fine.
- pusher client ping/pong working fine thou i have just two sockets connected
- noticed global stale connections are being removed after 120 secs.
Thou i couldn't try to reproduce the original issue this request was created to solve but i can confirm no issues produced by this pull request.
There are still problems existing with presence channels which its stale connections doesn't get removed but this pull request wasn't created to solve it i believe.
as @mdprotacio mentioned he would try to fix on a new push request.
all good at testing with redis
connection from my end.
@mdprotacio dude, you are legend, last few days for many hours I was struggling to debug lost WS message issues in my project |
// if there are no connections. | ||
// If the total connections count gets to 0 after unsubscribe, | ||
// try again to check & unsubscribe from the PubSub topic if needed. | ||
if ($count < 1) { | ||
$this->unsubscribeFromTopic($connection->app->id, $channelName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mdprotacio hey! I am still fighting with one issue, this is not exactly related with your PR, as problem also persists on latest 2.0.0-beta.36 but I am hoping you can help me
what currently I am able to debug, is that after execution unsubscribeFromTopic()
on at least one of ws server nodes first message/-s on client side is not received, commenting out this line - problem solved, kinda, but what is consequences? and maybe you will better understand why is that happening
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I perfectly understand your issue. But it appears that maybe when you have unsubscribed from one node, the same is not propagated across other nodes. This may happen if the client is subscribed or connected using the other node but the unsubscribe happened on the other node. This, from what I can recall should have been cleaned up by the removeConnectionFromSet that is the next then of this block. It may be possible that unsubscribeFromTopic may have produced a failure and will stop processing the promise chain and may not remove the connection from the set.
Hi There, Great work have been debugging to find an answer for
all morning... Editing Any ETA on when this will be merged with master? |
@myckhel @IlmLV @markeilander guys, would you be able to check #881? I've created another branch and will likely close this PR instead. Hopefully that PR addresses some more other issues. Thanks. |
Closed in favor of #881 |