Support custom threads #415

Open
wants to merge 152 commits into
base: develop

Conversation

eupp
Collaborator

@eupp eupp commented Oct 11, 2024

No description provided.

@eupp eupp marked this pull request as ready for review October 12, 2024 01:30
@eupp eupp requested a review from ndkoval October 12, 2024 01:30
@eupp
Collaborator Author

eupp commented Oct 12, 2024

The only failing CI configuration is "Integration Test with kotlinx.coroutines".
I believe it is because of #412, and once we merge #413 the problem should be fixed.
I've created a separate test branch where I merged #413 into my branch, and all kotlinx.coroutines tests seem to pass there.

@eupp eupp force-pushed the dynamic-threads branch 2 times, most recently from 0ad50bd to 25c0c45 Compare October 31, 2024 17:15
@eupp eupp requested a review from ndkoval October 31, 2024 17:20
@eupp
Collaborator Author

eupp commented Oct 31, 2024

@ndkoval while working on this I discovered a few problems, which should probably be addressed in separate PRs.

One of the problems is related to local object tracking: the current implementation does not work correctly with custom threads (see the example below and the comment in the DataStructuresTests::incorrectHashMap test).
The problem can be illustrated by the following example:

import kotlin.concurrent.thread

class Box(var x: Int)

fun test(): Int {
    val box = Box(0) // <- this object is incorrectly classified as a local object
    thread {
        box.x = 42 // the local object tracker does not detect here that the `box` object,
                   // stored in the local variable, escapes into another thread;
                   // thus it will not insert a switch point before accesses to this object's fields
    }
    return box.x
}

I propose that we address this problem separately in another PR after we merge this one.
The reason is that supporting this case would require a significant refactoring of the local object tracking algorithm,
and this PR is already big.

Alternatively, we can first perform the necessary refactoring of the local object tracking algorithm in a separate PR,
and then rebase and merge the custom threads PR.
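
To make the consequence of the missing switch point concrete, here is a minimal Java sketch (not Lincheck code) that forces the two possible interleavings of the `test()` example above with latches. The class `EscapingBoxDemo`, its `Box` type, and the `run` method are hypothetical names introduced only for this illustration. A model checker that treats `box` as thread-local would explore only one of these schedules.

```java
import java.util.concurrent.CountDownLatch;

final class EscapingBoxDemo {
    static final class Box {
        int x;
        Box(int x) { this.x = x; }
    }

    // Runs the "escaping box" scenario under one of two forced schedules:
    // writerFirst == true  -> the spawned thread writes before the caller reads;
    // writerFirst == false -> the caller reads before the spawned thread writes.
    static int run(boolean writerFirst) {
        Box box = new Box(0);
        CountDownLatch writerMayRun = new CountDownLatch(1);
        CountDownLatch writerDone = new CountDownLatch(1);
        Thread writer = new Thread(() -> {
            try {
                writerMayRun.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            box.x = 42; // `box` has escaped into this thread
            writerDone.countDown();
        });
        writer.start();
        try {
            int result;
            if (writerFirst) {
                writerMayRun.countDown();
                writerDone.await(); // the latch orders the write before the read
                result = box.x;
            } else {
                result = box.x;     // read happens before the writer is released
                writerMayRun.countDown();
            }
            writer.join();
            return result;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running the demo", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reader first: " + run(false));
        System.out.println("writer first: " + run(true));
    }
}
```

Both outcomes are legal results of the original function, which is why a switch point before the access to `box.x` is needed once the object escapes.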

@eupp
Collaborator Author

eupp commented Nov 26, 2024

Another small bug fix that this PR relies on: #426

eupp added 16 commits December 11, 2024 16:22
Signed-off-by: Evgeniy Moiseenko <[email protected]>
* in preparation of implementing ignored sections for custom threads

Signed-off-by: Evgeniy Moiseenko <[email protected]>
Signed-off-by: Evgeniy Moiseenko <[email protected]>
…in `cancelByLincheck`

Signed-off-by: Evgeniy Moiseenko <[email protected]>
Signed-off-by: Evgeniy Moiseenko <[email protected]>
gradle.properties Outdated
bootstrap/src/sun/nio/ch/lincheck/Injections.java Outdated
bootstrap/src/sun/nio/ch/lincheck/Injections.java Outdated
return ((TestThread) thread).descriptor;
}
int hashCode = System.identityHashCode(thread);
ArrayList<ThreadDescriptor> threadDescriptors = threadDescriptorsMap.get(hashCode);
Collaborator

Can we use a weak hash map instead?

Collaborator Author

I am afraid not, because we need a concurrent identity hash map with weak keys, and there is no such data structure out of the box in the Java standard library.

Please also have a look at the comment before the threadDescriptorsMap field declaration: it explains the reason and the proposed solution.

Collaborator

Would using Collections.synchronizedMap(...) solve the issue?

Collaborator Author

@eupp eupp Dec 13, 2024

Collections.synchronizedMap(...) would not fully solve the problem. It would make the collection thread-safe, but there is still no WeakIdentityHashMap in the Java standard library (remember that we need a thread-safe weak identity hash map).

Also, there is a method to look up the descriptor of a given thread:

public static ThreadDescriptor getThreadDescriptor(Thread thread)

which uses threadDescriptorsMap.get.

Forcing the descriptor lookup function to acquire a lock on each lookup (instead of using a concurrent hash map) may, in theory, incur a performance penalty.
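
For readers following the discussion, here is a minimal sketch of one way to get a concurrent, identity-based map with weak keys from standard library parts. It buckets entries by `System.identityHashCode` in a `ConcurrentHashMap`, as the diff fragment above suggests, and compares keys by reference. The class name `WeakIdentityRegistry`, its `Entry` type, and the `put`/`get` methods are assumptions made for this illustration; the actual `threadDescriptorsMap` code in the PR may differ.

```java
import java.lang.ref.WeakReference;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: lock-free reads, identity comparison, weakly held keys.
final class WeakIdentityRegistry<K, V> {
    private static final class Entry<K, V> {
        final WeakReference<K> ref; // does not keep the key alive
        final V value;
        Entry(K key, V value) {
            this.ref = new WeakReference<>(key);
            this.value = value;
        }
    }

    // Identity hash codes may collide, so each bucket holds a small list.
    private final ConcurrentHashMap<Integer, List<Entry<K, V>>> buckets =
            new ConcurrentHashMap<>();

    void put(K key, V value) {
        buckets.compute(System.identityHashCode(key), (hash, old) -> {
            List<Entry<K, V>> bucket = old;
            if (bucket == null) {
                bucket = new CopyOnWriteArrayList<>();
            }
            // Drop entries whose key was collected, and any previous entry for this key.
            bucket.removeIf(e -> {
                K k = e.ref.get();
                return k == null || k == key;
            });
            bucket.add(new Entry<>(key, value));
            return bucket;
        });
    }

    V get(K key) {
        List<Entry<K, V>> bucket = buckets.get(System.identityHashCode(key));
        if (bucket == null) {
            return null;
        }
        for (Entry<K, V> e : bucket) {
            if (e.ref.get() == key) { // reference equality, not equals()
                return e.value;
            }
        }
        return null;
    }
}
```

In this sketch, lookups never take a lock, which addresses the performance concern above. Stale entries are purged only when a later `put` lands in the same bucket, so a production version would probably also need a `ReferenceQueue` or periodic cleanup.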

@@ -79,7 +79,8 @@ fun shouldReplayInterleaving(): Boolean {
*/
@Suppress("UNUSED_PARAMETER")
fun beforeEvent(eventId: Int, type: String) {
- val strategy = (Thread.currentThread() as? TestThread)?.eventTracker ?: return
+ val strategy = Injections.getCurrentThreadDescriptor()?.eventTracker
Collaborator

Have you checked how custom threads work with the plugin?

Collaborator Author

No, it does not currently work.

I found one problem in the plugin code that I managed to fix myself, but it still does not work.
There are some other bugs, and I need help from someone on the plugin side to debug the remaining problems.

* with this file, You can obtain one at http://mozilla.org/MPL/2.0/.
*/

package org.jetbrains.kotlinx.lincheck.util
Collaborator

Do we need to have two Utils.kt files?

Collaborator Author

The idea is to eventually leave only one file, inside the util subpackage, and move the contents of the top-level Utils.kt file either into the util subpackage or to other, more suitable places.

However, I didn't want to clutter this PR with unnecessary changes, so I haven't modified the top-level Utils.kt file.

@@ -204,7 +204,7 @@ internal class LoopDetector(
// Has the thread changed? Reset the counters in this case.
check(lastExecutedThread == iThread) { "reset expected!" }
// Ignore coroutine suspension code locations.
if (codeLocation == COROUTINE_SUSPENSION_CODE_LOCATION) return Decision.Idle
Collaborator

Why UNKNOWN? The documentation here and of the UNKNOWN_CODE_LOCATION field says it is the coroutine suspension code location.

Collaborator Author

It was COROUTINE_SUSPENSION_CODE_LOCATION previously. I renamed it to UNKNOWN_CODE_LOCATION because I reused it in a different context, but with a similar purpose (that is, to represent an unknown code location).

I fixed the comments in the code so they now mention UNKNOWN_CODE_LOCATION.

Collaborator

Then, I need help understanding the logic. Why does codeLocation == UNKNOWN_CODE_LOCATION indicate coroutine suspension?

Collaborator Author

It doesn't. The code now looks like this:

        // Ignore unknown code locations.
        if (codeLocation == UNKNOWN_CODE_LOCATION) return Decision.Idle

@eupp
Collaborator Author

eupp commented Dec 12, 2024

From what I can see, there are no significant changes in CI build times between this branch and develop.

The reason is that in this PR I strive to preserve the old behavior whenever possible by adding special treatment of the TestThread class; see an example here:

if (thread instanceof TestThread) {

Thus, all the code interacting with TestThread instances (i.e. threads created by Lincheck) should work almost the same as before, and there should be no new performance penalties for it.
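
The fast-path idea can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `ManagedThread` stands in for a framework-created thread that stores its descriptor in a field, `Descriptor`, `register`, and `lookup` are hypothetical names, and the fallback is a plain `ConcurrentHashMap` with strong keys purely for brevity.

```java
import java.util.concurrent.ConcurrentHashMap;

final class DescriptorLookupSketch {
    static final class Descriptor {
        final String name;
        Descriptor(String name) { this.name = name; }
    }

    // Stand-in for a framework-created thread: the descriptor lives in a field,
    // so finding it costs a type check and a field read.
    static final class ManagedThread extends Thread {
        final Descriptor descriptor;
        ManagedThread(String name) {
            super(name);
            this.descriptor = new Descriptor(name);
        }
    }

    // Fallback registry for threads the framework did not create.
    // Simplification: strong keys; Thread does not override equals/hashCode,
    // so keys are effectively compared by identity.
    private static final ConcurrentHashMap<Thread, Descriptor> customThreads =
            new ConcurrentHashMap<>();

    static void register(Thread thread, Descriptor descriptor) {
        customThreads.put(thread, descriptor);
    }

    static Descriptor lookup(Thread thread) {
        if (thread instanceof ManagedThread) {
            // Fast path: same cost as before custom threads were supported.
            return ((ManagedThread) thread).descriptor;
        }
        // Slow path: only custom threads pay for the map lookup.
        return customThreads.get(thread);
    }
}
```

With this shape, existing tests that use only framework-created threads never touch the map, which is consistent with the unchanged CI timings reported above.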

eupp added 11 commits December 12, 2024 18:17
Signed-off-by: Evgeniy Moiseenko <[email protected]>
Signed-off-by: Evgeniy Moiseenko <[email protected]>
* do not switch to a thread awaiting thread join until the awaited thread finishes
* print the awaited thread id in the thread switch trace point

Signed-off-by: Evgeniy Moiseenko <[email protected]>
Signed-off-by: Evgeniy Moiseenko <[email protected]>
@eupp eupp requested a review from ndkoval December 13, 2024 20:43