Connectivity test concurrent run #2496
@viktor-kurchenko could you take a look at kind test failures?
Let's add a matrix for the kind job and run with both `--test-concurrency=1` and `--test-concurrency=5`: https://github.com/cilium/cilium-cli/blob/main/.github/workflows/kind.yaml#L24
Awesome change overall, very happy to see it! I think the only blocking comment from me is showing the user which test called `Fatal`.
This commit changes the behavior of --test-namespace flag. Instead of only deleting the namespace that matches --test-namespace flag exactly, delete all the namespaces with the prefix specified in --test-namespace flag. This is in preparation to support running connectivity tests in parallel using multiple namespaces (#2496). Signed-off-by: Michi Mutsuzaki <[email protected]>
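The prefix matching described in this commit can be sketched roughly as follows. This is a minimal illustration, not the cilium-cli code: `namespacesToDelete` is a hypothetical helper, and the `cilium-test-N` namespace names are assumed for the example only.

```go
package main

import (
	"fmt"
	"strings"
)

// namespacesToDelete returns every namespace whose name starts with prefix.
// Hypothetical helper for illustration; the real cleanup code may differ.
func namespacesToDelete(all []string, prefix string) []string {
	var matched []string
	for _, ns := range all {
		if strings.HasPrefix(ns, prefix) {
			matched = append(matched, ns)
		}
	}
	return matched
}

func main() {
	// Assumed namespace names; only the shared prefix matters here.
	all := []string{"cilium-test", "cilium-test-1", "cilium-test-2", "kube-system"}
	fmt.Println(namespacesToDelete(all, "cilium-test"))
}
```

With an exact match only `cilium-test` would be selected; with a prefix match the numbered namespaces created for concurrent runs are cleaned up as well.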
LGTM for my codeowners. I think the new policy lock retains the same assumptions we have today, where only one piece of software should be manipulating policies at the time that you do the connectivity test (i.e., only the CLI does it). And that should be good enough for now. It would be cool to explore better mechanisms for ensuring that datapath updates have occurred for a specific policy, but for now we don't really have a solution to that problem and it's not worth holding back this effort until we do that.
.github/workflows/kind.yaml (outdated)
      fail-fast: false
      matrix:
        include:
          # run connectivity tests explicitly without concurrency
          - test-concurrency: 1
Would it make sense to retain fail-fast for test-concurrency: 1 for quicker iteration and keep it off for higher concurrency?
Good catch! I had disabled `fail-fast` for debugging purposes. I think it's better to remove it from the workflow altogether.
New `test-concurrency` input param added for the connectivity test command to provide namespace count for concurrent tests execution. The hidden flag will be removed after internal testing. Signed-off-by: viktor-kurchenko <[email protected]>
This PR introduces connectivity test concurrent run:
- Parameters struct extended with ExternalDeploymentPort.
- ConnectivityTest.Cleanup method implemented for instance reusability.
- ConnectivityTest instantiated according to the test-concurrency input param.
- GetTestSuites function implemented to return test suite functions.
- Run function refactored to properly set up and run ConnectivityTest instances.
Signed-off-by: viktor-kurchenko <[email protected]>
Package global policy mutex added to guarantee proper policy provisioning and deletion in case of concurrent run. Signed-off-by: viktor-kurchenko <[email protected]>
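A package-level mutex of this kind might look like the sketch below. The names `policyMutex` and `runConcurrently` are assumptions made for illustration, and the string slice merely stands in for policy operations against a cluster; the actual locking scope in the PR may be different.

```go
package main

import (
	"fmt"
	"sync"
)

// policyMutex is a package-level lock shared by all concurrent test runs.
// The name is hypothetical.
var policyMutex sync.Mutex

// runConcurrently simulates one goroutine per namespace applying and then
// deleting a policy. Every mutation of the shared record happens while
// holding policyMutex, so operations never interleave mid-update.
func runConcurrently(namespaces []string) []string {
	var ops []string
	record := func(entry string) {
		policyMutex.Lock()
		defer policyMutex.Unlock()
		ops = append(ops, entry)
	}

	var wg sync.WaitGroup
	for _, ns := range namespaces {
		wg.Add(1)
		go func(ns string) {
			defer wg.Done()
			record("apply " + ns)
			record("delete " + ns)
		}(ns)
	}
	wg.Wait()
	return ops
}

func main() {
	ops := runConcurrently([]string{"ns-1", "ns-2", "ns-3"})
	fmt.Println(len(ops), "policy operations recorded")
}
```

A single global lock is the simplest choice here: it serializes all policy changes made by the CLI, at the cost of some parallelism during policy setup and teardown.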
Signed-off-by: viktor-kurchenko <[email protected]>
The MultiError struct implemented and used instead of the `hashicorp/go-multierror` library. The MultiError allows running many goroutines concurrently and returns a joined error if at least one goroutine returns an error. Signed-off-by: viktor-kurchenko <[email protected]>
The kind workflow improved with matrix to test `test-concurrency` parameter. Signed-off-by: viktor-kurchenko <[email protected]>
Signed-off-by: viktor-kurchenko <[email protected]>
The check.EchoServerHostPort constant moved into check.Parameters to make it unique for each test namespace in case of concurrent run. Signed-off-by: viktor-kurchenko <[email protected]>
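Moving the port from a constant into per-instance parameters could look roughly like this. The struct below is a trimmed stand-in for `check.Parameters`; the `paramsFor` helper, the base port value, and the index-based offset are all assumptions chosen for the example.

```go
package main

import "fmt"

// Parameters is a reduced stand-in for check.Parameters; only the two
// fields relevant to this sketch are shown.
type Parameters struct {
	TestNamespace      string
	EchoServerHostPort int
}

// paramsFor derives parameters for the index-th concurrent instance so that
// each test namespace gets its own name and host port (hypothetical scheme).
func paramsFor(base Parameters, index int) Parameters {
	p := base
	p.TestNamespace = fmt.Sprintf("%s-%d", base.TestNamespace, index)
	p.EchoServerHostPort = base.EchoServerHostPort + index
	return p
}

func main() {
	// The base port is an arbitrary example value.
	base := Parameters{TestNamespace: "cilium-test", EchoServerHostPort: 4000}
	for i := 1; i <= 3; i++ {
		p := paramsFor(base, i)
		fmt.Println(p.TestNamespace, p.EchoServerHostPort)
	}
}
```

A host port is a node-wide resource, so two namespaces sharing one value would conflict when their echo server pods are scheduled on the same node.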
ClusterMesh connectivity tests setup fixed to support test-concurrency param. Kind workflow (clustermesh part) updated with test-concurrency. Signed-off-by: viktor-kurchenko <[email protected]>
let's do it
This PR introduces connectivity test concurrent run.
The new `test-concurrency` input param can be used to specify a count of K8S test namespaces, so most connectivity tests will run concurrently across test namespaces.
The new parameter is marked `hidden` because of the following reasons: [...] `debug` and `verbose` modes). I want to address this in a separate PR.
Command run example:
cilium connectivity test --test-concurrency=5
Local test in a 4 node kind cluster result:
--test-concurrency=1
--test-concurrency=5