tests/e2e: more debug info on dumps #357

Merged 5 commits on Aug 23, 2022


willfindlay
Contributor

This series adds a bunch more debug info to e2e test dumps. See commits.

@willfindlay willfindlay requested a review from kkourt August 23, 2022 13:31
@willfindlay willfindlay requested a review from a team as a code owner August 23, 2022 13:31
@willfindlay willfindlay force-pushed the pr/willfindlay/even-better-dumps branch 2 times, most recently from 6240623 to b04dd26 Compare August 23, 2022 13:48
Contributor

@kkourt kkourt left a comment

LGTM, just some minor comments

tests/e2e/helpers/dumpinfo.go (resolved)
tests/e2e/helpers/dumpinfo.go (outdated, resolved)
@willfindlay willfindlay marked this pull request as draft August 23, 2022 14:31
The e2e framework leaked a few goroutines because the context was never canceled at the
end of the tests, which broke some of our assumptions. Refactor the runner to wrap the
testenv and cancel the context once the tests finish.

Signed-off-by: William Findlay <[email protected]>
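
For illustration only, here is a minimal sketch of the pattern this commit describes: the runner owns the context, cancels it when the suite returns, and waits for background goroutines to exit. `runSuite` and `demoRun` are hypothetical names, and the real runner wraps the testenv rather than a plain function.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// runSuite is a hypothetical stand-in for the test runner. It owns the
// context, hands it to the suite, and cancels it once the suite returns so
// that background goroutines can exit.
func runSuite(suite func(ctx context.Context, wg *sync.WaitGroup) int) int {
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup

	code := suite(ctx, &wg)

	cancel()  // signal every goroutine started by the suite to stop
	wg.Wait() // wait for them, so nothing outlives the run
	return code
}

// demoRun starts one background worker and reports whether it stopped.
func demoRun() (code int, workerStopped bool) {
	code = runSuite(func(ctx context.Context, wg *sync.WaitGroup) int {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-ctx.Done() // without cancel() this would block forever
			workerStopped = true
		}()
		return 0
	})
	return code, workerStopped
}

func main() {
	code, stopped := demoRun()
	fmt.Println("exit code:", code, "worker stopped:", stopped)
}
```

If `cancel()` were removed, `wg.Wait()` would never return here, which is the same kind of leak the commit fixes.
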
When the agent crashes during a test, we cannot retrieve metrics at the end because the
metrics server is offline, so we lose valuable debugging information about what went
wrong. To fix this, dump Tetragon metrics at a regular interval during the test. We then
always have a recent snapshot of the metrics, whether or not the metrics server can be
contacted at the end of the test.

Signed-off-by: William Findlay <[email protected]>
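
A rough sketch of the periodic dump idea, assuming a ticker-driven loop that keeps the last successful snapshot. All names here (`snapshotStore`, `dumpPeriodically`, `demoSnapshot`) are hypothetical, and the fetch function is a stub standing in for a request to the metrics server.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// snapshotStore holds the most recent successful metrics dump.
type snapshotStore struct {
	mu   sync.Mutex
	last []byte
}

func (s *snapshotStore) set(b []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.last = b
}

func (s *snapshotStore) get() []byte {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.last
}

// dumpPeriodically calls fetch on every tick until ctx is canceled. A failed
// fetch is skipped, so the previous snapshot survives a crashed server.
func dumpPeriodically(ctx context.Context, interval time.Duration,
	fetch func() ([]byte, error), store *snapshotStore) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			data, err := fetch()
			if err != nil {
				continue // keep the last good snapshot
			}
			store.set(data)
		}
	}
}

// demoSnapshot simulates a server that answers twice and then goes offline.
func demoSnapshot() string {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	store := &snapshotStore{}
	crashed := make(chan struct{})
	calls := 0 // only touched by the dumper goroutine
	fetch := func() ([]byte, error) {
		calls++
		if calls > 2 {
			if calls == 3 {
				close(crashed)
			}
			return nil, errors.New("metrics server offline")
		}
		return []byte(fmt.Sprintf("snapshot %d", calls)), nil
	}

	done := make(chan struct{})
	go func() {
		defer close(done)
		dumpPeriodically(ctx, time.Millisecond, fetch, store)
	}()

	<-crashed // the simulated server has gone away
	cancel()
	<-done
	return string(store.get())
}

func main() {
	fmt.Println(demoSnapshot())
}
```

Skipping failed fetches rather than overwriting the store is the design choice that matters: the snapshot taken before the crash is the one worth keeping.
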
Create a new RunCommand helper and factor it out of the main body of the bpftool dump
subroutine. This will enable us to reuse the same basic logic to run commands elsewhere
with a nice dump of stdout and stderr.

Signed-off-by: William Findlay <[email protected]>
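
As a hedged illustration of what such a helper might look like, the sketch below runs a command and writes its stdout and stderr to two files. `runAndDump` is a hypothetical name with a hypothetical signature, not the project's `RunCommand`, and the demo assumes an `echo` binary is available on `PATH`.

```go
package main

import (
	"context"
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
)

// runAndDump is a hypothetical helper, not the project's RunCommand. It runs
// a command and writes its output to <dir>/<base>.stdout and
// <dir>/<base>.stderr.
func runAndDump(ctx context.Context, dir, base, name string, args ...string) error {
	stdout, err := os.Create(filepath.Join(dir, base+".stdout"))
	if err != nil {
		return err
	}
	defer stdout.Close()

	stderr, err := os.Create(filepath.Join(dir, base+".stderr"))
	if err != nil {
		return err
	}
	defer stderr.Close()

	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Stdout = stdout
	cmd.Stderr = stderr
	if err := cmd.Run(); err != nil {
		return fmt.Errorf("running %s: %w", name, err)
	}
	return nil
}

// demoDump runs `echo hello` and returns what landed in the stdout dump.
func demoDump() (string, error) {
	dir, err := os.MkdirTemp("", "dump-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	if err := runAndDump(context.Background(), dir, "echo", "echo", "hello"); err != nil {
		return "", err
	}
	out, err := os.ReadFile(filepath.Join(dir, "echo.stdout"))
	return string(out), err
}

func main() {
	out, err := demoDump()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Because the command name and arguments are parameters, the same function could serve a bpftool dump or any other diagnostic command.
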
In case the Tetragon pod crashes during a test, it's useful to grab the output of kubectl
describe to help understand why. So let's add this to the test dump on failure.

Signed-off-by: William Findlay <[email protected]>
It's useful to get a summary of pods in our cluster when a test fails so we can figure out
if the reason for the failure might be incidental -- for example in cases where our
workload fails to run correctly.

Signed-off-by: William Findlay <[email protected]>
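
The two failure-time dumps from the last two commits could be expressed as plain argument lists, as sketched below. This is an assumption-laden illustration: the dump file names, the `kube-system` namespace, and the label selector are hypothetical, and the exact kubectl invocations used by the test framework may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// dumpCommand pairs a (hypothetical) dump file name with the command to run.
type dumpCommand struct {
	File string
	Argv []string
}

// failureDumpCommands lists the kubectl commands to run when a test fails.
// The namespace and selector are supplied by the caller; the values used in
// main are assumptions, not taken from the project.
func failureDumpCommands(namespace, selector string) []dumpCommand {
	return []dumpCommand{
		{
			// Detailed state and recent events of the agent pods.
			File: "tetragon.describe",
			Argv: []string{"kubectl", "describe", "pods", "-n", namespace, "-l", selector},
		},
		{
			// One-line summary of every pod in the cluster.
			File: "pods.summary",
			Argv: []string{"kubectl", "get", "pods", "--all-namespaces", "-o", "wide"},
		},
	}
}

func main() {
	for _, c := range failureDumpCommands("kube-system", "app.kubernetes.io/name=tetragon") {
		fmt.Printf("%s: %s\n", c.File, strings.Join(c.Argv, " "))
	}
}
```

Keeping the commands as data makes it easy to feed each one to a shared command-running helper and to add further dumps later.
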
@willfindlay willfindlay force-pushed the pr/willfindlay/even-better-dumps branch from b04dd26 to 6db73d0 Compare August 23, 2022 14:39
Contributor Author

willfindlay commented Aug 23, 2022

@kkourt I addressed your comments. Also realized at the same time that we were never actually cancelling our context in the tests, so I had to refactor the runner a bit (it's in the new first commit).

@willfindlay willfindlay marked this pull request as ready for review August 23, 2022 14:49
@kkourt kkourt merged commit d8bb3c5 into main Aug 23, 2022
@kkourt kkourt deleted the pr/willfindlay/even-better-dumps branch August 23, 2022 17:24