Dropped WebSocket messages due to race condition in WebSocket frame handling #11081
Comments
Signed-off-by: Lachlan Roberts <[email protected]>
Opened PR #11084.
Issue #11081 - fix race condition in WebSocket FrameHandlers
@darnap we merged a PR which should fix this.
@darnap from your analysis, I can see you have a custom CometD transport. Can you detail whether it is based on the Jetty APIs or the standard Jakarta APIs? Also, what are the reasons for using a custom transport?
@lachlan-roberts Thanks for the prompt fix. We will re-run tests with this change and see what happens. Would it be enough to cherry-pick just this change onto the 11.0.18 tag of websocket-jetty-common? This would simplify our test deployment.
Just to understand how the fix is supposed to work: what prevents the same race from occurring, since onTextFrame() simply resets the activeMessageSink to the same textSink instance if it's null?
@sbordet The custom transport is based on the Jetty APIs. We needed to override the transport and Endpoint in order to:
If there's any better way to accomplish these goals we'd gladly avoid using a custom transport.
Yes, it would be enough.
Good point. @lachlan-roberts I think the problem is the Thread T1 in I'm surprised there are no NPEs!
The demand should be after the nulling of the |
…bSocket frame handling. Now the reset of the MessageSink internal accumulators happens before the demand. This avoids the race, since as soon as there is demand another thread could enter the MessageSink, but the accumulator has already been reset. Signed-off-by: Simone Bordet <[email protected]>
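The ordering described in this commit message can be illustrated with a simplified, single-threaded model. This is only a sketch and not Jetty's StringMessageSink: the class DemandOrderSketch and its methods are hypothetical, and the "other thread" is modelled by having demand() process the next frame immediately, which is what may happen as soon as demand is signalled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified model only: hypothetical names, not Jetty's actual classes.
final class DemandOrderSketch {
    private final StringBuilder accumulator = new StringBuilder();
    private final List<String> delivered = new ArrayList<>();
    private final Deque<String> pending = new ArrayDeque<>();
    private final boolean resetBeforeDemand;

    DemandOrderSketch(boolean resetBeforeDemand, List<String> frames) {
        this.resetBeforeDemand = resetBeforeDemand;
        this.pending.addAll(frames);
    }

    // Stand-in for signalling demand: the next frame is handled right away,
    // modelling another thread entering the sink once demand exists.
    private void demand() {
        String next = pending.poll();
        if (next != null) {
            onFinTextFrame(next);
        }
    }

    // Every frame in this model is a complete (FIN) text frame.
    private void onFinTextFrame(String payload) {
        accumulator.append(payload);
        String message = accumulator.toString();
        if (resetBeforeDemand) {
            accumulator.setLength(0); // reset first, so the next frame starts clean
            delivered.add(message);
            demand();
        } else {
            delivered.add(message);
            demand();                 // the next frame sees the stale accumulator
            accumulator.setLength(0);
        }
    }

    static List<String> run(boolean resetBeforeDemand, List<String> frames) {
        DemandOrderSketch sink = new DemandOrderSketch(resetBeforeDemand, frames);
        sink.demand(); // kick off processing of the first frame
        return sink.delivered;
    }

    public static void main(String[] args) {
        List<String> frames = List.of("[{\"id\":1}]", "[{\"id\":2}]");
        System.out.println("reset after demand:  " + run(false, frames));
        System.out.println("reset before demand: " + run(true, frames));
    }
}
```

In this model, resetting after the demand makes the second delivery contain both payloads back to back, which resembles the symptom reported in the description; resetting before the demand delivers each payload separately.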
@sbordet I will, thanks.
@sbordet The issue did not occur again in the last run of tests after building with this change in place. Hopefully it's solved, thanks.
Issue #11081 - fix race condition in WebSocket FrameHandlers (jetty-12)
…bSocket frame handling. (#11090) Now the reset of the MessageSink internal accumulators happens before the demand. This avoids the race, since as soon as there is demand another thread could enter the MessageSink, but the accumulator has already been reset. Signed-off-by: Simone Bordet <[email protected]>
Jetty version(s)
11.0.18
Java version/vendor
11.0.17
OS type/version
Windows 11/Linux CentOS 8
Description
After migrating a CometD-based application from Jetty 9 to Jetty 11, some of our automated tests started failing randomly when using WebSockets. We traced the issue to some CometD messages being lost and never delivered to the server.
Further analysis showed that the content of multiple WebSocket frames ends up packed into a single onMessage event on the application side, which is unexpected. Indeed, CometD expects multiple messages to be delivered as an array, not as back-to-back objects in a single message, so it only parses the first one found, while all subsequent ones are discarded.
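To make the symptom concrete, here is a minimal sketch of how a receiver that reads only the first top-level JSON value ends up discarding the rest of a concatenated payload. It is not CometD's parser: the class FirstJsonValueSketch and its methods are hypothetical, and the scanner only tracks brackets and string literals.

```java
// Hypothetical helper, not part of CometD or Jetty.
final class FirstJsonValueSketch {
    // Returns the index just past the first complete top-level JSON object
    // or array in text, or -1 if no complete value is found.
    static int endOfFirstValue(String text) {
        int depth = 0;
        boolean inString = false;
        boolean escaped = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inString) {
                // Inside a string literal, brackets are plain characters.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '{' || c == '[') {
                depth++;
            } else if (c == '}' || c == ']') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
        }
        return -1;
    }

    // Text that a "first value only" reader would silently ignore.
    static String discardedTail(String text) {
        int end = endOfFirstValue(text);
        return end < 0 ? "" : text.substring(end).trim();
    }

    public static void main(String[] args) {
        String packed = "[{\"id\":1}][{\"id\":2}]";
        System.out.println("discarded: " + discardedTail(packed));
    }
}
```

A check like discardedTail(...) being non-empty could be used in application-side logging to detect when two frames were packed into one onMessage event.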
Annotated application logs are attached to show that the same Utf8StringBuilder instance in StringMessageSink is used by two separate threads even though both are delivering a FIN frame, which should cause immediate delivery to the application and a builder reset.
The logs were obtained using the distributed Jetty 11.0.18 with only the following modification to the org.eclipse.jetty.websocket.core.internal.messages.StringMessageSink class to add debugging traces:
How to reproduce?
Not reproducible systematically.
analysis on possibile race condition on StringMessageSink.txt