Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ConflatedBroadcastChannelLincheckTest in kotlinx.coroutines #273

Merged
merged 3 commits into from
Feb 19, 2024

Conversation

ndkoval
Copy link
Collaborator

@ndkoval ndkoval commented Feb 13, 2024

Fixes failing ConflatedBroadcastChannelLincheckTest; see the build below
https://teamcity.jetbrains.com/buildConfiguration/KotlinTools_KotlinxLincheck_IntegrationTestWithKotlinxCoroutinesDevelopLinuxOnJava17/4485333?hideTestsFromDependencies=false&hideProblemsFromDependencies=false&expandBuildDeploymentsSection=false&expandBuildTestsSection=true&expandBuildProblemsSection=true

Investigation

There were two issues.

  1. Some classes had to be excluded from transformation and loaded by the system class loader.

Solution: ignore the corresponding classes in doNotTransform(..).

  1. When a spin lock was detected, suddenInvocationResult in ManagedStrategy was set to a non-null value. After that, all further inIgnoredSection(..) calls returned true (it checked for suddenInvocationResult being non-null, I don't know why), making it impossible for other threads to notice the spin-lock detection and also finish. This resulted in an infinite spin-lock and deadlock detection. However, the latter was ignored, as suddenInvocationResult was non-null. Notably, the "hanging" thread was still calling newSwitchPoint(..) periodically (there was a veeery long spin-lock). Then, after starting the next interleaving, the suddenInvocationResult field was reset to null, and the hanging threads, that were calling newSwitchPoint(..) periodically, published events related to the previous interleaving. This caused non-deterministic behavior in the model checker.

Solution: check whether suddenInvocationResult is non-null in the beginning of the analysis, removing the corresponding check from inIgnoredSection(..).

@ndkoval ndkoval marked this pull request as ready for review February 13, 2024 14:58
@ndkoval ndkoval changed the title Fix non-determinism in ConflatedBroadcastChannelLincheckTest Fix ConflatedBroadcastChannelLincheckTest in kotlinx.coroutines Feb 13, 2024
@ndkoval ndkoval marked this pull request as draft February 13, 2024 15:08
@ndkoval ndkoval requested a review from eupp February 13, 2024 16:51
@ndkoval ndkoval force-pushed the fix-coroutines-build branch 3 times, most recently from 7ec930a to b17c09b Compare February 19, 2024 16:12
@ndkoval ndkoval force-pushed the fix-coroutines-build branch from 71eb2f1 to 08f9877 Compare February 19, 2024 19:23
@ndkoval ndkoval marked this pull request as ready for review February 19, 2024 19:25
setResult(iThread, Done)
}
}

fun waitUntilAllThreadsFinishTheCurrentTasks() {
Copy link
Collaborator

@eupp eupp Feb 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just awaitAllThreads or awaitAllTasks ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we do not need this function at all -- the code always waits for all tasks to be completed. I've checked that we never call Thread.yield() in the implementation.

@@ -283,8 +284,11 @@ abstract class ManagedStrategy(
* @param codeLocation the byte-code location identifier of the point in code.
*/
private fun newSwitchPoint(iThread: Int, codeLocation: Int, tracePoint: TracePoint?) {
if (!isTestThread(iThread)) return // can switch only test threads
if (!isTestThread(iThread)) return
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this check is duplicated by the inIgnoredSection(iThread) check on the next line?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's fix that under #276 -- there are many such cases.

@ndkoval ndkoval requested a review from eupp February 19, 2024 20:31
@ndkoval
Copy link
Collaborator Author

ndkoval commented Feb 19, 2024

This issue was quite a journey. Thanks to @eupp for the help with the investigation and the review!

@ndkoval ndkoval merged commit 71cbd95 into develop Feb 19, 2024
13 checks passed
@ndkoval ndkoval deleted the fix-coroutines-build branch February 19, 2024 21:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants