Make the test stable #5346
@@ -1575,6 +1575,9 @@ class FunctionPullingContainerProxyTests

machine ! Initialize(invocationNamespace.asString, fqn, action, schedulerHost, rpcPort, messageTransId)
probe.expectMsg(Transition(machine, Uninitialized, CreatingClient))
awaitAssert {
  machine.stateData shouldBe a[ContainerCreatedData]
When ClientCreationCompleted is sent to the proxy before the data is updated, it causes a kind of cycle and the test keeps running.
https://github.com/apache/openwhisk/blob/master/core/invoker/src/main/scala/org/apache/openwhisk/core/containerpool/v2/FunctionPullingContainerProxy.scala#L342
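For readers unfamiliar with the `awaitAssert` call added to the test: Akka TestKit's `awaitAssert` re-runs an assertion until it passes or a timeout expires, which is how the test can wait for the state data to be updated before sending the next message. The following is a minimal plain-Scala sketch of that retry idea, not the TestKit implementation. The names `AwaitAssertSketch`, `awaitAssertSketch`, `maxMillis`, and `intervalMillis` are hypothetical.

```scala
import scala.util.control.NonFatal

object AwaitAssertSketch {
  // Hypothetical helper: re-evaluates `assertion` until it stops throwing
  // or `maxMillis` has elapsed, sleeping `intervalMillis` between attempts.
  def awaitAssertSketch[A](assertion: => A, maxMillis: Long = 3000, intervalMillis: Long = 10): A = {
    val deadline = System.nanoTime() + maxMillis * 1000000L

    def loop(): A = {
      val attempt: Either[Throwable, A] =
        try Right(assertion)
        catch { case NonFatal(e) => Left(e) }

      attempt match {
        case Right(value)                              => value
        case Left(e) if System.nanoTime() >= deadline => throw e // give up, surface the last failure
        case Left(_) =>
          Thread.sleep(intervalMillis)
          loop()
      }
    }

    loop()
  }
}
```

In the test, the assertion being retried is that `machine.stateData` is a `ContainerCreatedData`, so nothing is sent to the proxy until that holds.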
Was this test instability introduced recently? I know this is just a test, but I'm wondering whether there could be an issue with #5333, since I'm seeing weird orphaned, non-existent containers in etcd.
Let me explain the details.
case Event(ClientCreationCompleted(proxy), _: NonexistentData) =>
  self ! ClientCreationCompleted(proxy.orElse(Some(sender())))
  stay()
According to this logic, if the proxy receives ClientCreationCompleted before its data becomes the ContainerCreatedData that comes from here, it repeatedly sends the message to itself, which creates an Akka message cycle.
Generally, it takes some time to create and initialize the activation client proxy, and that takes longer than it takes for a container proxy to receive the ContainerCreatedData.
So I believe there was no issue with this.
But in the test code there is no client proxy initialization, and ClientCreationCompleted is sent as soon as the proxy status changes. Depending on the timing, that creates the cycle.
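The cycle described above can be illustrated with a toy, single-threaded mailbox simulation. This is only a sketch under simplifying assumptions: it does not use Akka, the message and data types are stand-ins that merely reuse the names from the discussion, and `createdAfter` and `maxSteps` are hypothetical knobs that control when the container data shows up and when the simulation gives up.

```scala
import scala.collection.mutable

object ResendCycleSketch {
  sealed trait Data
  case object NonexistentData extends Data
  case object ContainerCreatedData extends Data

  sealed trait Msg
  case object ClientCreationCompleted extends Msg
  case object ContainerCreated extends Msg

  // Processes one message at a time, as a single actor would.
  // Returns the final data and how often ClientCreationCompleted was re-sent to self.
  def run(createdAfter: Int, maxSteps: Int): (Data, Int) = {
    val mailbox = mutable.Queue[Msg](ClientCreationCompleted)
    var data: Data = NonexistentData
    var resends = 0
    var steps = 0

    while (mailbox.nonEmpty && steps < maxSteps) {
      // The container data arrives only after `createdAfter` messages were processed.
      if (steps == createdAfter) mailbox.enqueue(ContainerCreated)

      mailbox.dequeue() match {
        case ClientCreationCompleted if data == NonexistentData =>
          mailbox.enqueue(ClientCreationCompleted) // stands in for `self ! ClientCreationCompleted(...)`
          resends += 1
        case ClientCreationCompleted => () // data is present, the proxy can move on
        case ContainerCreated        => data = ContainerCreatedData
      }
      steps += 1
    }
    (data, resends)
  }
}
```

With `run(createdAfter = 5, maxSteps = 100)` the message is re-sent 6 times before the data arrives. If the data never arrives, the loop only stops because of `maxSteps`, which mirrors a test that keeps running.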
LGTM, thanks for fixing this. Seems like this is why I've been having trouble with the scheduler tests.
1a3259c to c6e4532
@@ -124,7 +124,9 @@ class ActivationClientProxy(
stay()

case Event(e: RescheduleActivation, client: Client) =>
logging.info(this, s"[${containerId.asString}] got a reschedule message ${e.msg.activationId} for action: ${e.msg.action}")
logging.info(
I have no idea how this passed the scalafmt check in the previous PR.
@@ -339,7 +339,11 @@ class FunctionPullingContainerProxy(

// wait for container creation when cold start
case Event(ClientCreationCompleted(proxy), _: NonexistentData) =>
self ! ClientCreationCompleted(proxy.orElse(Some(sender())))
akka.pattern.after(3.milliseconds, actorSystem.scheduler) {
Even if it generally does not happen in production, it would be great to add a small delay to avoid the cycle as a last resort.
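The diff line above is truncated, so the body passed to `akka.pattern.after` is not shown here. As a rough illustration of the idea of delaying the re-send instead of re-sending immediately, here is a sketch that uses a JDK `ScheduledExecutorService` in place of the Akka scheduler. It is not the PR's code: `DelayedResendSketch`, `simulate`, `readyAfterMs`, and `delayMs` are hypothetical names.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}

object DelayedResendSketch {
  // Returns (completed, attempts): whether the handler eventually saw the data,
  // and how many times the handler ran.
  def simulate(readyAfterMs: Long, delayMs: Long): (Boolean, Int) = {
    val scheduler = Executors.newSingleThreadScheduledExecutor()
    val ready     = new AtomicBoolean(false) // stands in for "data is ContainerCreatedData"
    val attempts  = new AtomicInteger(0)
    val done      = new CountDownLatch(1)

    def handle(): Unit = {
      attempts.incrementAndGet()
      if (ready.get()) done.countDown()
      else
        // Re-deliver after a short delay rather than immediately, so the
        // handler does not spin while waiting for the data to arrive.
        scheduler.schedule(new Runnable { def run(): Unit = handle() }, delayMs, TimeUnit.MILLISECONDS)
    }

    // The container data becomes available some time later.
    scheduler.schedule(new Runnable { def run(): Unit = ready.set(true) }, readyAfterMs, TimeUnit.MILLISECONDS)
    handle()

    try {
      val completed = done.await(5, TimeUnit.SECONDS)
      (completed, attempts.get())
    } finally scheduler.shutdownNow()
  }
}
```

With a delay between attempts, the number of re-deliveries is bounded by roughly the waiting time divided by the delay, instead of being limited only by how fast the mailbox can be drained.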
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##           master    #5346       +/-   ##
===========================================
+ Coverage   38.96%   76.45%   +37.48%
===========================================
  Files         240      240
  Lines       14378    14383        +5
  Branches      614      614
===========================================
+ Hits         5602    10996     +5394
+ Misses       8776     3387     -5389
c6e4532 to 88887d5
Description
This is to make the test stable.
Related issue and scope
My changes affect the following components
Types of changes
Checklist: