Invalidate project test results when test-infra bad #14890

Closed
tpepper opened this issue Oct 21, 2019 · 8 comments
Labels
area/testgrid kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@tpepper
Member

tpepper commented Oct 21, 2019

What would you like to be added:
Invalidation of test results

Why is this needed:
In testgrid we see the results of the project under test, for example k/k. But the reality is that we're also testing test-infra in parallel, yet only the results relative to the project directly under test are reported, not test-infra. Nevertheless, test-infra does have monitoring and does report issues to #testing-ops, and perhaps the k-dev mailing list, while test-grid itself is the place we all look for test results.

It would be very useful to the project's disparate developers across its many SIGs if, for any test-grid result columns in any board, the commitid range of test-infra for which problems are identified in test-infra got a colorized red box, like other test results. Eg: instead of:

[Image: Screen Shot 2019-10-21 at 1 19 00 PM]

if we've triaged that test-infra 844e8e0 (and greater, up to some point?) is bad, then we should see this in testgrid:

[Image: Screen Shot 2019-10-21 at 1 14 11 PM]

It would be useful if that box could also be annotated with a link to more information, similar to how the test cases for the project are linked. The link could be, for example, to the issue raised in k/test-infra, the Slack notification in #testing-ops from the Prow Monitoring App, or the #testing-ops or #sig-testing discussion of the issue.

And then it would be useful to be able to clear these. A crude approach might be to re-run any tests that failed in the 24hrs around a flagged test-infra issue. Or if the issue is identified specifically to a .. range, to re-issue all failed tests that ran across the project with test-infra code in the corresponding commit range.
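
The crude 24-hour approach could look roughly like the sketch below. It is only an illustration: the `JobRun` record, its fields, and the `runs_to_rerun` helper are hypothetical names, not part of testgrid or Prow, and "around" is assumed to mean within 24 hours on either side of the time the issue was flagged.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Sequence


@dataclass(frozen=True)
class JobRun:
    # Hypothetical record of one test job run; field names are illustrative.
    job: str
    finished: datetime
    passed: bool


def runs_to_rerun(
    runs: Sequence[JobRun],
    flagged_at: datetime,
    window: timedelta = timedelta(hours=24),
) -> List[JobRun]:
    """Return failed runs that finished within `window` of the flagged time.

    Assumption: the window is symmetric (before and after `flagged_at`).
    """
    return [
        run
        for run in runs
        if not run.passed and abs(run.finished - flagged_at) <= window
    ]


if __name__ == "__main__":
    # Made-up example data, not real job names or results.
    flagged = datetime(2019, 10, 21, 13, 0)
    example_runs = [
        JobRun("job-a", datetime(2019, 10, 21, 9, 0), passed=False),
        JobRun("job-b", datetime(2019, 10, 21, 10, 0), passed=True),
        JobRun("job-c", datetime(2019, 10, 18, 9, 0), passed=False),
    ]
    for run in runs_to_rerun(example_runs, flagged):
        print(run.job)
```

Only failed runs inside the window are selected; passing runs and failures well outside the window are left alone, so the re-run load stays bounded by how wide the window is.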

This could more actively highlight if/when test-infra issues are impacting end users without being clearly observable as such to them. It could help counter the sense of "maybe something's wrong, I'll just wait for a few more instances of my code to go through test and see whether, for a constant project commitid and varying test-infra commitid, or vice versa, or worse both varying, I can tell if my code is at issue or if I should loop in others for broader help." Additionally, if tests are invalidated and re-run, some notable redundant debugging effort might be avoided by amending more valid test results into the grid for the project under test before humans spend time debugging a test-infra issue.

@tpepper tpepper added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 21, 2019
@alejandrox1
Contributor

/area testgrid

@BenTheElder
Member

> But the reality is that we're also testing test-infra in parallel, yet only the results relative to the project directly under test are reported, not test-infra.

not for all results, FWIW. eg unit testing, kind. we may still surface another commit in the dashboard though.

> the commitid range of test-infra for which problems are identified in test-infra got a colorized red box, like other test results

how does testgrid know what commits of test-infra are bad?

@tpepper
Member Author

tpepper commented Oct 21, 2019

> how does testgrid know what commits of test-infra are bad?

That's the point of the enhancement request...can criteria result in a decision, and can that good/bad decision be passed somehow to the same place we track other results for visibility? bad_infra_commit_ranges.yaml? I'm not expecting agreement that the idea is worthwhile, but if it were agreed, is it possible?
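
To make the bad_infra_commit_ranges.yaml idea a bit more concrete, here is a minimal sketch of how such a list might be consulted. Everything in it is an assumption for illustration: the `BadRange` fields, the `find_bad_range` helper, and the idea of passing in an ordered test-infra commit history (commit hashes are not comparable on their own, so some ordering has to come from git). The entries are modelled as plain Python objects rather than parsed YAML to keep the example self-contained, and apart from 844e8e0 the commit IDs are made up.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BadRange:
    # Hypothetical entry of a bad-commit-range list.
    first_bad: str  # first test-infra commit triaged as bad
    last_bad: str   # last test-infra commit triaged as bad (inclusive)
    info: str       # pointer to more information, e.g. an issue or Slack thread


def find_bad_range(
    commit: str,
    history: Sequence[str],
    bad_ranges: Sequence[BadRange],
) -> Optional[BadRange]:
    """Return the bad range containing `commit`, or None if there is none.

    `history` is assumed to list test-infra commits oldest first.
    Ranges that mention commits missing from `history` are skipped.
    """
    position = {c: i for i, c in enumerate(history)}
    if commit not in position:
        return None
    for bad in bad_ranges:
        if bad.first_bad not in position or bad.last_bad not in position:
            continue
        if position[bad.first_bad] <= position[commit] <= position[bad.last_bad]:
            return bad
    return None


if __name__ == "__main__":
    history = ["aaa1111", "844e8e0", "bbb2222", "ccc3333"]  # oldest first
    bad_ranges = [BadRange("844e8e0", "bbb2222", "link to the tracking issue")]
    for commit in history:
        hit = find_bad_range(commit, history, bad_ranges)
        print(commit, "bad infra" if hit else "ok")
```

A result column whose test-infra commit falls in a listed range could then be drawn differently and linked to `info`; how that decision is made and who maintains the list is exactly the open question.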

@BenTheElder
Member

cc @michelle192837

@michelle192837
Contributor

I suspect it's possible, but I think having passing results once any environmental/repo-specific/whatnot issues are over is a clearer indicator that doesn't need extra logic to implement. Marking these as visually different in TestGrid seems more plausible though (for example, a pink box with an issue number), and is in line with some of the existing enhancements TestGrid has.

I don't think we'll get to this soon, but will keep in mind.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 19, 2020
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 19, 2020
@BenTheElder
Member

this seems relatively infeasible at the moment and not staffed.
I'm going to go ahead and close this for now, unless someone has staffing and an actionable plan.


6 participants