Stabilize e2e dual-stack test #936
Conversation
Hi @unsortedhashsets. Thanks for your PR. I'm waiting for an istio-ecosystem or istio member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test
@unsortedhashsets did you see the error with the Kind setup or with OCP deployments?
With OCP; in general, the problem was that the check for existing pods ran too early.
Force-pushed 33c5319 to 161adf7
tests/e2e/util/common/e2e_utils.go (outdated)
@@ -294,16 +294,15 @@ func GetVersionFromIstiod() (*semver.Version, error) {
func CheckPodsReady(ctx SpecContext, cl client.Client, namespace string) (*corev1.PodList, error) {
What if we remove all the Eventually calls inside the function and call Eventually from outside instead? Something like this:
- Currently, we are using the function:
_, err = common.CheckPodsReady(ctx, cl, DualStackNamespace)
Expect(err).ToNot(HaveOccurred(), fmt.Sprintf("Error checking status of dual-stack pods: %v", err))
- Change the use for this:
Eventually(common.CheckPodsReady).WithArguments(ctx, cl, DualStackNamespace).Should(Succeed(), "Error checking status of dual-stack pods")
With this, we use only one Eventually and simplify the code in the test. The only downside is that CheckPodsReady currently returns *corev1.PodList and error, so my suggestion will not work with those return values. But in my opinion, the function should only check that the pods in the namespace are ready; once we know they are ready, we can fetch the pods of the namespace in a following step to be used in later validations. Wdyt @sridhargaddam?
It is also used as:
sleepPod, err = common.CheckPodsReady(ctx, cl, SleepNamespace)
here:
Yeah, in that case I suggest not using a function named CheckPodsReady to return a list of pods. We can check that the pods in the namespace are ready with a function that only checks their state, and then get the list of pods from the namespace to use in the next steps. We already fetch only the pods like that using the client:
For example:
samplePodsCluster2 := &corev1.PodList{}
It("updates the pods status to Ready", func(ctx SpecContext) {
Expect(clRemote.List(ctx, samplePodsCluster2, client.InNamespace(sampleNamespace))).To(Succeed())
Expect(samplePodsCluster2.Items).ToNot(BeEmpty(), "No pods found in sample namespace")
@unsortedhashsets I mean, this is not strictly related to your PR; I'm just seeing that this can also be improved at the same time. Using two Eventually calls inside the same function seems a little redundant to me; we can simplify it easily by moving the Eventually out.
Thanks for the clarification, I will try that approach.
I have just replaced the function call that used two internal Eventually loops with a more idiomatic Go version wrapped in a single external Eventually.
It works fine on dual-stack, and I also found a few places to use it in the control plane tests; let's see the test run.
Force-pushed f9f4a86 to 7fd60b7
fjglira left a comment
lgtm, only a small comment
samplePods := &corev1.PodList{}
Eventually(common.CheckPodsReady).WithArguments(ctx, cl, sampleNamespace).Should(Succeed(), "Error checking status of sample pods")
Expect(cl.List(ctx, samplePods, client.InNamespace(sampleNamespace))).To(Succeed(), "Error deploying sample pod")
I suggest changing the error message from "Error deploying sample pod" to "Error getting the pods in sample namespace". Note: in all the places where you use the same message :)
Thanks, agreed, it describes the operation better.
Signed-off-by: Mikhail Abramov <mabramov@redhat.com>
Force-pushed 7fd60b7 to 26f572e
@unsortedhashsets please, if possible, backport to
* upstream/main: Stabilize e2e dual-stack test (istio-ecosystem#936)
* Stabilize e2e dual-stack test (istio-ecosystem#936) (istio-ecosystem#939)
* Fix IstioRevisionTag uninstall when revision namespace changes (istio-ecosystem#945). Fixes istio-ecosystem#917. This is quite a complex edge case; the integration test written as part of this commit follows along. When a tag was created that referenced a revision in ns1 and was then patched to reference a revision in ns2, the old Helm install in ns1 was not cleaned up. That almost never causes problems, except when all tags are later removed and a new one with the same name is created but never reconciled. If that tag is deleted, the operator attempts to uninstall the lingering Helm release, which fails, deadlocking the operator and preventing the user from deleting the tag. The only way to get the operator back into a working state was to manually delete the lingering Helm release. That is a pretty significant bug.
* Expose "Profile" parameter in UI for Istio and IstioCNI (istio-ecosystem#924) (istio-ecosystem#947). When deploying Istio in ambient mode, the user should be able to set "profile: ambient" for Istio and IstioCNI resources; this parameter was hidden, so its visibility in the UI is turned on.
* [Manual cherry-pick] Pick e2e improvements and solve Helm issues (istio-ecosystem#948), including:
  * Consolidate multicluster operator deploy/cleanup (istio-ecosystem#902). Now that `Cleaner` cleans up after each test, this logic runs once in the suite rather than after each test.
  * Fix typo in the ReportAfterSuite text (istio-ecosystem#911)
  * Fix helm issue during e2e test run operator installation (istio-ecosystem#914)
  * Consolidate e2e resource creation (istio-ecosystem#931). IstioCNI and Istio creation in the e2e tests was very repetitive; it now lives in common functions.
  * Use `Eventually` in remote secret generation (istio-ecosystem#921). Secret generation was sometimes attempted before everything was ready, leading to flaky, hard-to-debug failures; wrapping it in `Eventually` avoids them. Fixes istio-ecosystem#866.
* Bump to 1.26.2, add Istio 1.26.2 (istio-ecosystem#950)
What type of PR is this?
What this PR does / why we need it:
Allows waiting inside the CheckPodsReady func until the pods are ready (avoids errors during dual-stack tests like
Same principle as used here: https://github.com/istio-ecosystem/sail-operator/blob/main/tests/e2e/controlplane/control_plane_test.go#L221
Which issue(s) this PR fixes:
Fixes #
Related Issue/PR #
Additional information: