Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronization issue with AbstractDBusConnection.getHandledSignals #97

Closed
lbeuster opened this issue Mar 18, 2020 · 7 comments
Closed

Comments

@lbeuster
Copy link

With the current master if have indeterministic synchronization issues resulting in 'org.freedesktop.dbus.errors.NoReply: No reply within specified time'.
This issue came with commit cfa0e0f.

The problem is the synchronization of getHandledSignals in DBusConnection.addSigHandler. While holding the lock on the handlers a remote call (dbus.AddMatch) is executed (in the previous commit this call was outside of the sync-block). At the same time a NameAcquired messages/signal NameAcquired in that also needs the lock on the handlers - but the signal has to wait because it is hold by addSigHandler.
Seems like the server NameAcquired-signal and the clients AddMatch request block each other.

Here are some log statements showing the problem

Create Connection
1713 [DBusConnection] : -> handleMessage, acquiring sync, DBusSignal(1,2) { Path=>/org/freedesktop/DBus, Interface=>org.freedesktop.DBus, Member=>NameAcquired, Destination=>:1.1591, Sender=>org.freedesktop.DBus, Signature=>s } [:1.1591]
1714 [main]           : -> addSignalHandler [:1.1591], acquiring sync
1714 [main]           :    addSignalHandler: got sync
... MethodCall.REPLY_WAIT_TIMEOUT=2 sec (but it's the same issue with a higher value)
3721 [main]           : <- ERROR addSigHandler  1584485323721 main [:1.1591] -> released sync
                           Caused by: org.freedesktop.dbus.errors.NoReply: No reply within specified time
                               at org.freedesktop.dbus.RemoteInvocationHandler.executeRemoteMethod(RemoteInvocationHandler.java:160)
                               at org.freedesktop.dbus.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:228)
                               at com.sun.proxy.$Proxy29.AddMatch(Unknown Source)
                               at org.freedesktop.dbus.connections.impl.DBusConnection.addSigHandler(DBusConnection.java:783)
3722 [DBusConnection] :    handleMessage, got sync [:1.1591]
3722 [DBusConnection] : <- handleMessage, released sync [:1.1591]

I have tested inside a Ubuntu 18.04 VM with 4 processors.

@hypfvieh
Copy link
Owner

I'm pretty sure that the sychronized blocks are not needed because the underlying map is already a ConcurrentHashMap.

I added a new branch which removes the sychronized blocks and will also change the value of the signal lists from List<> to Queue<>, so the value should also be threadsafe like the map.

Please try the changes here: https://github.com/hypfvieh/dbus-java/tree/no-sychronized
I'll merge it upstream if we are sure that this will not cause other issues.

@lbeuster
Copy link
Author

The map may be thread safe for single method calls, but you still need sync across several calls, e.g. addSigHandler

sync {
    get()
    doSomething
    put/add
}

And beside this the access to the dbusSignalList (ArrayList) is also not synchronized.

@lbeuster
Copy link
Author

Maybe Map.computeXXX/putIfAbsent will help.

@hypfvieh
Copy link
Owner

That is what the old code was doing, synchronize while some work is performed.

I don't think the get() needs to be in the synchronized block, because get() on ConcurrentHashMap will be atomic.

I was already thinking of the computeXX methods but this does not work when throwing exceptions. Every time e.g. addMatch is called a DBusExceptionException maybe thrown.
Catching, wrapping and re-throwing is something I dont want to do.

@hypfvieh
Copy link
Owner

I've pushed some changes to the new branch which will use computeIfAbsent to create a new collection. Calling addMatch is performed outside of the lambda so the exception does not have to be caught and rethrown.

@lbeuster
Copy link
Author

That was the point, the addMatch must be outside of any synchronization code ;)

@lbeuster
Copy link
Author

Thanks for the quick fix - works for me.

@hypfvieh hypfvieh closed this as completed May 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants