Invalidate project test results when test-infra bad #14890

tpepper · 2019-10-21T20:38:05Z

What would you like to be added:
Invalidation of test results

Why is this needed:
In testgrid we see the results of the project under test, for example k/k. But the reality is that we're also testing test-infra in parallel, yet only the results relative to the project directly under test are reported, not test-infra. Admittedly, test-infra does have monitoring and does report issues to #testing-ops, and perhaps to the k-dev mailing list, but testgrid itself is the place we all look for test results.

It would be very useful to the project's disparate developers across its many SIGs if, for any testgrid result columns in any board, the commitid range of test-infra for which problems are identified in test-infra got a colorized red box, like other test results. E.g., instead of:

if we've triaged that test-infra 844e8e0 (and greater, up to some point?) is bad, then we should see this in testgrid:

It would be useful if that box could also then be annotated with a link to more information, similar to how the test cases for the project are linked. The link could point, for example, to the issue raised in k/test-infra, the Slack notification in #testing-ops from the Prow Monitoring App, or the #testing-ops or #sig-testing discussion of the issue.

And then it would be useful to be able to clear these. A crude approach might be to re-run any tests that failed in the 24 hours around a flagged test-infra issue. Or, if the issue is pinned specifically to a .. range, to re-run all failed tests that ran across the project with test-infra code in the corresponding commit range.

This could more actively highlight when test-infra issues are impacting end users without being clearly observable as such to them. It could help counter the sense of "maybe something's wrong, I'll just wait for a few more instances of my code to go through test and see whether, for constant project commitid and varying test-infra commitid, or vice versa, or worse both varying, I can tell if my code is at issue or if I should loop in others for broader help." Additionally, if tests are invalidated and re-run, some notable redundant debugging effort might be avoided by amending more valid test results into the grid for the project under test before humans spend time debugging a test-infra issue.
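For illustration only, the crude "re-run what failed around a flagged issue" approach could be sketched as below. Nothing here is an existing testgrid or Prow API: `RunResult`, `runs_to_rerun`, and the job names are hypothetical, and reading "the 24 hours around" as a window centered on the flag time is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class RunResult:
    """Hypothetical record of one finished test run."""
    job: str
    finished: datetime
    passed: bool


def runs_to_rerun(
    runs: Iterable[RunResult],
    flagged_at: datetime,
    window: timedelta = timedelta(hours=24),
) -> List[RunResult]:
    """Return the failed runs that finished inside a window centered on flagged_at.

    Assumption: a 24-hour window means 12 hours on either side of the flag time.
    """
    half = window / 2
    return [
        r for r in runs
        if not r.passed and abs(r.finished - flagged_at) <= half
    ]


# Example with made-up job names and times.
flagged = datetime(2019, 10, 21, 20, 0)
runs = [
    RunResult("job-a", datetime(2019, 10, 21, 10, 0), passed=False),  # 10h before flag
    RunResult("job-b", datetime(2019, 10, 20, 7, 0), passed=False),   # 37h before flag
    RunResult("job-c", datetime(2019, 10, 21, 21, 0), passed=True),   # passed, ignored
]
candidates = runs_to_rerun(runs, flagged)
```

A time window like this over-selects (it also picks up genuine failures), which is why the commit-range variant is the more precise option.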


alejandrox1 · 2019-10-21T20:52:57Z

/area testgrid

BenTheElder · 2019-10-21T21:00:16Z

> But the reality is that we're also testing test-infra in parallel, yet only the results relative to the project directly under test are reported, not test-infra.

not for all results, FWIW. eg unit testing, kind. we may still surface another commit in the dashboard though.

> the commitid range of test-infra for which problems are identified in test-infra got a colorized red box, like other test results

how does testgrid know what commits of test-infra are bad?

tpepper · 2019-10-21T23:16:09Z

> how does testgrid know what commits of test-infra are bad?

That's the point of the enhancement request... can criteria result in a decision, and can that good/bad decision be passed somehow to the same place we track other results for visibility? bad_infra_commit_ranges.yaml? I'm not expecting agreement that the idea is worthwhile, but if it were agreed, is it possible?
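To make the question concrete, here is a minimal sketch of what a lookup against such a list of bad ranges might look like. It is purely hypothetical: no `bad_infra_commit_ranges.yaml` format exists, the `BadRange` fields and `find_bad_range` are invented for illustration, the commit history is assumed to be available as an oldest-to-newest list, and every commit id other than 844e8e0 is a placeholder.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class BadRange:
    """Hypothetical entry describing a flagged span of test-infra commits."""
    first_bad: str             # first commit known to be bad
    first_good: Optional[str]  # first commit with the fix; None if still open
    link: str                  # where to read more (issue, Slack thread, ...)


def find_bad_range(
    commit: str,
    history: Sequence[str],
    ranges: Sequence[BadRange],
) -> Optional[BadRange]:
    """Return the flagged range containing commit, or None.

    history must list test-infra commits oldest to newest (an assumption).
    """
    index = {c: i for i, c in enumerate(history)}
    if commit not in index:
        return None
    pos = index[commit]
    for r in ranges:
        if r.first_bad not in index:
            continue
        start = index[r.first_bad]
        # An open-ended range runs to the newest known commit.
        end = index[r.first_good] if r.first_good in index else len(history)
        if start <= pos < end:
            return r
    return None


# Placeholder history; only 844e8e0 comes from the issue text.
history: List[str] = ["1111111", "844e8e0", "2222222", "3333333"]
ranges = [BadRange("844e8e0", "3333333", "https://example.invalid/issue")]
hit = find_bad_range("2222222", history, ranges)
```

A result column whose test-infra commit returns a non-`None` range would be the one to mark and annotate with `link`. Who decides which commits are bad, and how that decision gets recorded, is still the open question.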

BenTheElder · 2019-10-21T23:17:54Z

cc @michelle192837

michelle192837 · 2019-10-21T23:23:48Z

I suspect it's possible, but I think having passing results once any environmental/repo-specific/whatnot issues are over is a clearer indicator that doesn't need extra logic to implement. Marking these as visually different in TestGrid seems more plausible though (for example, a pink box with an issue number), and is in line with some of the existing enhancements TestGrid has.

I don't think we'll get to this soon, but will keep in mind.

fejta-bot · 2020-01-19T23:40:28Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot · 2020-02-19T00:23:00Z

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

BenTheElder · 2020-02-19T00:25:31Z

this seems relatively infeasible at the moment and not staffed.
I'm going to go ahead and close this for now, unless someone has staffing and an actionable plan.

tpepper added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 21, 2019

k8s-ci-robot added the area/testgrid label Oct 21, 2019

alejandrox1 mentioned this issue Oct 28, 2019

[Umbrella] 1.16 Release Retrospective Action Items kubernetes/sig-release#806

Closed

9 tasks

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 19, 2020

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 19, 2020

BenTheElder closed this as completed Feb 19, 2020
