Update flaky fail count in the process flakes task #705
Conversation
Codecov Report
Attention: Patch coverage is
✅ All tests successful. No failed tests found.

@@ Coverage Diff @@
##             main     #705   +/- ##
========================================
  Coverage   98.05%   98.06%
========================================
  Files         434      432     -2
  Lines       36466    36273   -193
========================================
- Hits        35758    35570   -188
+ Misses        708      703     -5

Flags with carried forward coverage won't be shown.
Codecov Report
Attention: Patch coverage is
Changes have been made to critical files, which contain lines commonly executed in production.
✅ All tests successful. No failed tests found.

Additional details and impacted files:
@@ Coverage Diff @@
##             main     #705   +/- ##
========================================
  Coverage   98.10%   98.11%
========================================
  Files         475      473     -2
  Lines       37820    37627   -193
========================================
- Hits        37105    36917   -188
+ Misses        715      710     -5

Flags with carried forward coverage won't be shown.
... and 1 file with indirect coverage changes
Force-pushed from 046c0ad to 780d39b
if flake.recent_passes_count == FLAKE_EXPIRY_COUNT:
    flake.end_date = dt.datetime.now(tz=dt.UTC)
    flake.end_date = test_instance.created_at

flake.save()

def upsert_failed_flake(
nit: we should try to be consistent about the ordering of params across functions belonging to the same module:
test_instance, flake, repo_id vs. test_instance, repo_id, flake
And ideally the most common params should occur first.
# retroactively mark newly caught flake as flaky failure in its rollup
rollup = DailyTestRollup.objects.filter(
    repoid=repo_id,
    date=test_instance.created_at.date(),
curious about this .date() fn, what type does that cast to?
Saw it in the log as well and wondering why we'd want to cast it there vs just returning test_instance.created_at
date is a PG date type and created_at is a PG timestamp with TZ type; in Python the equivalents are datetime.date and datetime.datetime. The date() method is getting the date component of the datetime.datetime.
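To illustrate the distinction (sample values only, not project code), a tz-aware datetime.datetime plays the role of the PG timestamp-with-TZ column, and .date() strips it down to a datetime.date:

```python
import datetime as dt

# A tz-aware datetime, like a row's created_at (PG: timestamp with time zone).
created_at = dt.datetime(2024, 5, 1, 13, 45, tzinfo=dt.timezone.utc)

# .date() drops the time and timezone, leaving a datetime.date (PG: date),
# which is what the rollup's date column compares against.
d = created_at.date()
print(type(d).__name__, d)  # → date 2024-05-01
```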
got it! Thanks for the clarification
@@ -68,8 +74,21 @@ def run_impl(
    testrun_dict_list = []
    upload_list = []

    f = db_session.query(Flake).all()
technically this is flakes
oops, this is leftover from me debugging i think
@@ -68,8 +74,21 @@ def run_impl(
    testrun_dict_list = []
    upload_list = []

    f = db_session.query(Flake).all()

    flakes = (
this is repo flakes
+ duration_seconds
) / (

def update_daily_total():
    daily_totals[test_id]["last_duration_seconds"] = duration_seconds
Can we add a comment here for what the formula is? It's a bit hard to parse otherwise
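For context, the fragment under review looks like a running-average update over durations. A hypothetical sketch of that kind of formula (the function and parameter names here are assumptions, not the project's actual code):

```python
# Hypothetical running-average update: given the current average duration over
# n completed runs and one new duration, compute the average over n + 1 runs.
def updated_avg_duration(
    avg_duration_seconds: float, testruns_completed: int, duration_seconds: float
) -> float:
    # (old total = old average * old count, plus new duration) / (new count)
    return (avg_duration_seconds * testruns_completed + duration_seconds) / (
        testruns_completed + 1
    )
```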
tasks/test_results_processor.py (outdated)
else 0,
"skip_count": 1 if outcome == str(Outcome.Skip) else 0,
"flaky_fail_count": 1
if test_id in flaky_test_set and outcome == str(Outcome.Failure)
would an error be considered a flake too?
yeah for flaky test detection we consider an error a failure, me omitting Error here is a mistake
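A sketch of what the corrected condition might look like, treating Error the same as Failure. The Outcome enum here is an assumed stand-in with the members mentioned in the snippet, not the project's actual definition:

```python
from enum import Enum

# Assumed stand-in for the project's Outcome enum.
class Outcome(Enum):
    Pass = "pass"
    Skip = "skip"
    Failure = "failure"
    Error = "error"

def flaky_fail_increment(test_id: str, flaky_test_set: set, outcome: str) -> int:
    # For flaky test detection an Error counts as a failure, so both
    # Failure and Error increment flaky_fail_count.
    return (
        1
        if test_id in flaky_test_set
        and outcome in (str(Outcome.Failure), str(Outcome.Error))
        else 0
    )
```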
db_session.flush()

# Upsert Daily Test Totals
rollup_table = DailyTestRollup.__table__
I may need you to run me through this on Monday 😅
Particularly why we need to do the calculations again on conflict; I figured they'd be computed in daily_totals
ah, the reason we do them is that the dictionary is just an intermediate holding container. If we come across two test instances that map to the same rollup, we can't rely on the database to do the aggregating for us, because there's a restriction that an insert ... on conflict do update can't have two entries that would insert/update the same row. So in the dictionary above we're duplicating this logic.
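To illustrate the pre-aggregation step (plain dicts stand in for the ORM models; field names here are assumptions): duplicates are merged in Python, keyed on the rollup's unique columns, so each key yields exactly one row for the eventual upsert statement.

```python
# Hypothetical test instances; the first two map to the same (test_id, date)
# rollup row, which a single INSERT ... ON CONFLICT DO UPDATE could not
# insert/update twice in one statement.
test_instances = [
    {"test_id": "t1", "date": "2024-05-01", "passed": True},
    {"test_id": "t1", "date": "2024-05-01", "passed": False},
    {"test_id": "t2", "date": "2024-05-01", "passed": True},
]

# Aggregate in Python first, keyed on the rollup's unique columns.
daily_totals = {}
for instance in test_instances:
    key = (instance["test_id"], instance["date"])
    totals = daily_totals.setdefault(key, {"pass_count": 0, "fail_count": 0})
    totals["pass_count" if instance["passed"] else "fail_count"] += 1

# Each key now appears exactly once, so a single upsert over
# daily_totals.items() is safe.
print(daily_totals[("t1", "2024-05-01")])  # → {'pass_count': 1, 'fail_count': 1}
```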
rollup.pass_count += 1
case TestInstance.Outcome.SKIP.value:
    rollup.skip_count += 1
case _:
is this identical to "default" in other languages?
yep
branch=rs.repo.branch,
)

traveller.stop()
why is it checking for code coverage in a test file lol
i actually don't know
looks good, nice work man!
when we detect a test as newly flaky:
- we want to increment the flaky_fail_count of the rollup of the test instance that was used to detect the flakiness by 1
- we want to increment the flaky_fail_count of the rollups for every failed test instance of that test that was processed after the test instance where the flake was detected

when we process a test instance whose test is flaky and whose outcome is a failure, we want to increment the flaky_fail_count of its daily rollup
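The retroactive rule above can be sketched as follows (plain dicts stand in for the ORM models; every name here is an assumption, not the actual implementation):

```python
import datetime as dt

def retroactively_count_flaky_fails(rollups, failed_instances, detected_at):
    """Increment flaky_fail_count for the detecting instance and for every
    failed instance of the newly flaky test processed after it.

    rollups: maps a date to {"flaky_fail_count": int}
    failed_instances: list of {"created_at": datetime, "date": date}
    detected_at: created_at of the instance that triggered flake detection
    """
    for inst in failed_instances:
        # ">=" counts the detecting instance itself plus later failures;
        # failures from before detection are left alone.
        if inst["created_at"] >= detected_at:
            rollups[inst["date"]]["flaky_fail_count"] += 1
    return rollups
```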
Force-pushed from 4ae1fb2 to 478e425
depends on: #699 and codecov/shared#356