Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix race condition in MonitorConnectionClient #160

Merged

Conversation

lvfxx
Copy link
Member

@lvfxx lvfxx commented Mar 17, 2020

Problem:
Undefined order of execution between monitorConnectionClient.eventLoop() goroutine that passes events to fanoutEventCh and goroutine that's waiting <--context.Done() and closes fanoutEventCh leads to different number of testOnHeal.Request() calls in /pkg/networkservice/common/heal/client_test.go
TestNewClient_MissingConnectionsOnInit test that leads to different number (2 or 3) testOnHeal.Request() invocations. Explanation diagram

Solutions i see:

  1. time.Sleep(100 * time.Millisecond) after sending INITIAL_STATE_TRANSFER event to ensure that this event will be handled earlier than context cancellation.

  2. Guarantee that chain context cancellation will be handled after connection event, if it was cancelled after connection event sending. One way to do that i see is to emulate the server falling down by closing eventCh, not chain context cancelling. Chain context cancelling is rather about client's intention to shut down and logically may be async, but server's falling down is not.

Tried to implement the second one in this PR :)

@lvfxx lvfxx changed the title Heal client event ch refactor Fix race condition in MonitorConnectionClient Mar 17, 2020
@lvfxx lvfxx force-pushed the heal-client-eventCh-refactor branch 2 times, most recently from 2e6b4e3 to b1dddf0 Compare March 17, 2020 05:45
Signed-off-by: Sergey Semenov <[email protected]>
@edwarnicke edwarnicke merged commit 285b6a4 into networkservicemesh:master Mar 18, 2020
@lvfxx lvfxx deleted the heal-client-eventCh-refactor branch March 18, 2020 05:28
nsmbot pushed a commit that referenced this pull request Jun 27, 2023
…i@main

PR link: networkservicemesh/api#160

Commit: 9785eac
Author: Danil Uzlov
Date: 2023-06-27 21:36:11 +0700
Message:
  - Add RESELECT_REQUESTED connection state (#160)
* add reselect connection state

Signed-off-by: Danil Uzlov <[email protected]>

* fix ci

Signed-off-by: Danil Uzlov <[email protected]>

---------

Signed-off-by: NSMBot <[email protected]>
denis-tingaikin pushed a commit that referenced this pull request Jun 27, 2023
…i@main (#1477)

PR link: networkservicemesh/api#160

Commit: 9785eac
Author: Danil Uzlov
Date: 2023-06-27 21:36:11 +0700
Message:
  - Add RESELECT_REQUESTED connection state (#160)
* add reselect connection state



* fix ci



---------

Signed-off-by: NSMBot <[email protected]>
Co-authored-by: NSMBot <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants