Skip to content

Comments

Bug 1945091: Refactor openshift-tests a bit, add more network-test-autodetecting support#25818

Merged
openshift-merge-robot merged 7 commits intoopenshift:masterfrom
danwinship:network-config-test-skips
Apr 25, 2021
Merged

Bug 1945091: Refactor openshift-tests a bit, add more network-test-autodetecting support#25818
openshift-merge-robot merged 7 commits intoopenshift:masterfrom
danwinship:network-config-test-skips

Conversation

@danwinship
Copy link
Contributor

Currently we are unconditionally skipping all single-stack-IPv6, dual-stack, and SCTP tests, regardless of cluster configuration. This is no good.

This PR extends the existing network-plugin-detection code to also detect network IP configuration and automatically skip IPv4-only, IPv6-only, and dual-stack-only tests as appropriate.

It also kind of adds support for detecting and skipping SCTP tests, except we can't really easily detect whether SCTP is available in the cluster at this point, so in reality it only works if you manually override the SCTP detection via --provider/TEST_PROVIDER. But we might be able to improve on that in the future. (And being able to unskip the SCTP tests at all is an improvement over the current situation.)

This will require a followup PR to openshift/kubernetes to stop skipping these tags. Thus, for the moment, this PR should have no effect on the tests that actually get run.

Also, until https://issues.redhat.com/browse/KNIDEPLOY-4090 is fixed, the e2e-metal-ipi-ovn-ipv6 and e2e-metal-ipi-ovn-dualstack suites won't actually run any of the "fun" test cases anyway...

@danwinship danwinship force-pushed the network-config-test-skips branch 2 times, most recently from f675869 to 1575c88 Compare January 25, 2021 17:40
@openshift-ci-robot openshift-ci-robot added the vendor-update Touching vendor dir or related files label Jan 25, 2021
@danwinship
Copy link
Contributor Author

/cc @smarterclayton @marun

@marun
Copy link
Contributor

marun commented Jan 27, 2021

/retest

@marun
Copy link
Contributor

marun commented Jan 29, 2021

/lgtm

This looks fine to me, and I appreciate the testing you've added.

Any reason not to unskip in o/k first and vendor that change as part of this PR to get some test signal? Any change to o/k will have to be vendored to origin to take effect.

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jan 29, 2021
@danwinship danwinship force-pushed the network-config-test-skips branch from 1575c88 to 47477cd Compare February 10, 2021 17:53
@openshift-ci-robot openshift-ci-robot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Feb 10, 2021
@danwinship
Copy link
Contributor Author

danwinship commented Feb 10, 2021

Any reason not to unskip in o/k first and vendor that change as part of this PR to get some test signal? Any change to o/k will have to be vendored to origin to take effect.

Filed that PR now, openshift/kubernetes#565. But that shouldn't merge until after this does since if it got vendored before this PR merged it would cause tests to be unskipped that shouldn't be unskipped (eg running dual-stack tests in single-stack clusters).

But also, this PR is mostly just a prereq; we still have to fix KNIDEPLOY-4090 to get the IPv6/dual-stack tests to actually run, and we still have to create a new job (or modify some existing one) to get the SCTP tests to run.

@danwinship
Copy link
Contributor Author

/hold
testing with updated annotations; should not actually affect what runs, in the end

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 10, 2021
@danwinship danwinship force-pushed the network-config-test-skips branch from 22aa7b6 to 47477cd Compare February 11, 2021 14:19
@danwinship
Copy link
Contributor Author

e2e-gcp runs with and without the vendored kube change are identical (other than having different flakes). (The different test count is just because they had different numbers of flakes and so had different numbers of retries. Sorting and diffing the junit output shows no other differences.)

@marun
Copy link
Contributor

marun commented Feb 12, 2021

/retest

@danwinship
Copy link
Contributor Author

/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 15, 2021
Comment on lines +208 to +214
skips = append(skips, "[Feature:Networking-IPv4]")
}
if !c.HasIPv6 {
skips = append(skips, "[Feature:Networking-IPv6]")
Copy link

@aojea aojea Feb 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just FYI these tags are not longer upstream, tests must work on both IP families (depend on the cluster configuration), there is only one test remaining that I couldn't migrate

grep -r Networking  test | grep Fea
test/e2e/network/networking_perf.go:var _ = SIGDescribe("Networking IPerf2 [Feature:Networking-Performance]", func() {
test/e2e/network/networking.go: ginkgo.It("should provide Internet connection for containers [Feature:Networkin-IPv4]", func() {
test/e2e/network/networking.go: ginkgo.It("should provide Internet connection for containers [Feature:Networkin-IPv6][Experimental][LinuxOnly]", func() {

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh, I just realise the tag is wrong Networkin/Networking

if len(providerInfo.Type) == 0 {
return nil, fmt.Errorf("provider must be a JSON object with the 'type' key")
}
var cloudConfig e2e.CloudConfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing this would mean we have to be backwards compatible forever in upstream test context (the config for a step is ignorant of the version of openshift it’s testing). If upstream changes the struct in a non backwards compatible way the rebase will require all the test steps/config in CI to be forked).

I’m not opposed to the change in theory but I’d want some evidence we have a way that is not “openshift-tests api surface to CI changes” before moving forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's the other way around; currently we expose the upstream API in our API. After this PR we do not.

That is, before this commit we do:

after this PR we do:

  • unmarshal provider into an exutilcluster.ClusterConfig
  • later, copy the exutilcluster.ClusterConfig into an e2e.CloudConfig, element-by-element

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're backward compatible on the json format all the way back to 4.1 still, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The JSON-parsing unit test is added in the first commit, and none of the input values of those tests is changed after that. (The output values change slightly to reflect two bugfixes: we now correctly set the default vales of NumNodes and MultiMaster even on bare metal; and we no longer ignore the MultiMaster and MultiZone fields in the input JSON if it's trying to override them.)

If you want to add additional examples from openshift/release to the unit tests I can do that

@smarterclayton
Copy link
Contributor

A few concerns on individual commits.

@danwinship danwinship force-pushed the network-config-test-skips branch from 47477cd to 6910026 Compare March 2, 2021 18:49
It already contains non-cloud-specific network info, and will soon
contain more.
We shouldn't avoid detecting NumNodes and MultiMaster just because
Platform is None.
When the user passes "--provider <JSON-VALUE>", rather than reading it
into an e2e.CloudConfig, autodetecting the
exutilcluster.ClusterConfig, and then merging the two structs together
field by field, just do the autodetection first, and then unmarshal
the JSON directly into the ClusterConfig (possibly overwriting some of
the autodetected values).

In addition to being simpler, this allows for overriding the
networking-related values that are in ClusterConfig but not
CloudConfig.
Redo "NetworkPluginIDs" as "NetworkPlugin" and "NetworkPluginMode",
which make more sense as part of a visible API.
(For now we can't actually autodetect whether SCTP is available, but a
test job could set HasSCTP via JSON --provider.)
@danwinship danwinship force-pushed the network-config-test-skips branch from 8f4c7ae to 6d7e8dc Compare April 12, 2021 17:27
@danwinship
Copy link
Contributor Author

/retest

4 similar comments
@danwinship
Copy link
Contributor Author

/retest

@danwinship
Copy link
Contributor Author

/retest

@dcbw
Copy link
Contributor

dcbw commented Apr 19, 2021

/retest

@smarterclayton
Copy link
Contributor

/retest

@smarterclayton
Copy link
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 22, 2021
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, marun, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 22, 2021
@danwinship
Copy link
Contributor Author

/retest

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@danwinship
Copy link
Contributor Author

/retest
e2e-gcp-upgrade appears to be mostly failing for everyone

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

4 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 581a8a0 into openshift:master Apr 25, 2021
@openshift-ci-robot
Copy link

@danwinship: All pull requests linked via external trackers have merged:

Bugzilla bug 1945091 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1945091: Refactor openshift-tests a bit, add more network-test-autodetecting support

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged. vendor-update Touching vendor dir or related files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants