Remove the resource and machine tickers from e2e tests #1471
Conversation
✅ Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: mdbooth. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Should we make some error on purpose to prove those files are correctly dumped, and get a sample output?
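One cheap way to exercise that path, as a rough sketch only: a throwaway spec that fails on purpose, run once and then deleted. This assumes a Ginkgo v2 suite; the spec text and failure message below are made up and not part of this PR.

```go
package e2e_test

import (
	"testing"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

func TestDumpOnFailure(t *testing.T) {
	RegisterFailHandler(Fail)
	RunSpecs(t, "dump-on-failure sanity check")
}

// A deliberately failing spec: after it runs, inspect the artifact directory
// to confirm that failure-time state was dumped as expected.
var _ = Describe("artifact dumping", func() {
	It("fails on purpose so the dump path runs", func() {
		Expect(false).To(BeTrue(), "intentional failure to verify dumped files")
	})
})
```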
lentzi90 left a comment:
Nice! The logs are much cleaner with this!
One thing I have noticed (and I don't think it is anything new) is that we don't seem to be able to dump the openstack server instances. Maybe we could fix that at the same time?
Logs look like this:
[2023-02-09T18:25:48Z] Dumping all OpenStack server instances in the "e2e-pe41vo" namespace
error getting internal ip for server cluster-e2e-dajzgq-control-plane-7kf5p: internal ip doesn't exist (yet)
error getting internal ip for server cluster-e2e-dajzgq-bastion: internal ip doesn't exist (yet)
Perhaps there is something wrong with how we try to get the IP here:
cluster-api-provider-openstack/test/e2e/shared/openstack.go
Lines 468 to 470 in 6645eb9
```go
ip := instanceNS.IP(openStackCluster.Status.Network.Name)
if ip == "" {
	_, _ = fmt.Fprintf(GinkgoWriter, "error getting internal ip for server %s: internal ip doesn't exist (yet)\n", srv.Name)
```
Any clues?
Yes, definitely. It's on my TODO list to prove that errors during delete are captured. I haven't even tested it manually yet.
I've also wondered about this. It's on my TODO list to investigate properly.
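For context, and only as a sketch: instead of looking the address up by Status.Network.Name, the fixed IP could be read straight off the Nova addresses map that gophercloud returns for the server. firstFixedIP and the example network name below are hypothetical, not code from this repository.

```go
package main

import (
	"fmt"

	"github.com/gophercloud/gophercloud/openstack/compute/v2/servers"
)

// firstFixedIP walks every network attached to the server rather than looking
// one up by name, which may sidestep a mismatch between Status.Network.Name
// and the network the server is actually on.
func firstFixedIP(srv servers.Server) string {
	for _, raw := range srv.Addresses {
		addrs, ok := raw.([]interface{})
		if !ok {
			continue
		}
		for _, a := range addrs {
			addr, ok := a.(map[string]interface{})
			if !ok {
				continue
			}
			// Prefer fixed (internal) addresses over floating ones.
			if addr["OS-EXT-IPS:type"] == "fixed" {
				if ip, ok := addr["addr"].(string); ok {
					return ip
				}
			}
		}
	}
	return ""
}

func main() {
	// Hypothetical server shaped like a Nova "addresses" response.
	srv := servers.Server{
		Name: "cluster-e2e-example-control-plane",
		Addresses: map[string]interface{}{
			"k8s-example-network": []interface{}{
				map[string]interface{}{"addr": "10.6.0.12", "OS-EXT-IPS:type": "fixed", "version": float64(4)},
			},
		},
	}
	fmt.Println(firstFixedIP(srv)) // 10.6.0.12
}
```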
/test pull-cluster-api-provider-openstack-e2e-test
We correctly dumped the error on deletion in:
Note that there is an additional directory created for the deletion failure:
Incidentally, I deliberately added the failure after cleanup had succeeded, so the contents of that directory are 'leaked' objects.
/hold cancel
/retest-required
lentzi90 left a comment:
/lgtm
The resource and machine tickers fetch logs every 5 and 10 seconds respectively during a test run, and each fetch overwrites the previous one. This is problematic: if a test ends in failure, we keep capturing these logs regularly during cleanup, so the state captured at the moment of failure is overwritten by the state during cleanup.
We already capture these resources after each test regardless of success or failure, so regular capture during the test is redundant. This change simply removes the tickers. We still get the same logs when we execute DumpSpecResourcesAndCleanup() after each test.
Additionally, if there is an error during cleanup we optionally capture that state too, but to a separate directory.
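As a rough sketch of that flow (the helpers and directory names here are made up; the real logic lives in DumpSpecResourcesAndCleanup() and the cleanup hooks): dump once after the spec finishes, then dump again into a separate directory only if cleanup fails.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dumpResources stands in for the real dump helpers; it only writes a marker
// file so the resulting directory layout is visible.
func dumpResources(dir string) error {
	if err := os.MkdirAll(dir, 0o750); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(dir, "resources.yaml"), []byte("# dumped state\n"), 0o640)
}

// afterEachDump mirrors the flow described above: capture state once after the
// test (pass or fail), then, if cleanup fails, capture again into a separate
// directory so the failure-time state is never overwritten.
func afterEachDump(artifactDir string, cleanup func() error) error {
	if err := dumpResources(filepath.Join(artifactDir, "after-test")); err != nil {
		return err
	}
	if err := cleanup(); err != nil {
		if dumpErr := dumpResources(filepath.Join(artifactDir, "cleanup-failure")); dumpErr != nil {
			return fmt.Errorf("cleanup failed (%w) and dump failed: %v", err, dumpErr)
		}
		return err
	}
	return nil
}

func main() {
	dir, _ := os.MkdirTemp("", "e2e-artifacts-")
	// Simulate a cleanup failure to show the extra directory being created.
	_ = afterEachDump(dir, func() error { return fmt.Errorf("simulated cleanup failure") })
	fmt.Println("artifacts under:", dir)
}
```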
/hold