Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update flaky fail count in the process flakes task #705

Merged
merged 11 commits into from
Sep 25, 2024

Conversation

joseph-sentry
Copy link
Contributor

depends on: #699 and codecov/shared#356

Copy link

This PR includes changes to shared. Please review them here: codecov/shared@42f83ec...5cc5f48

@codecov-staging
Copy link

codecov-staging bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 97.16312% with 4 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
tasks/test_results_processor.py 96.15% 2 Missing ⚠️
tasks/process_flakes.py 91.66% 1 Missing ⚠️
tasks/tests/unit/test_process_flakes.py 98.21% 1 Missing ⚠️

Impacted file tree graph

@@           Coverage Diff            @@
##             main     #705    +/-   ##
========================================
  Coverage   98.05%   98.06%            
========================================
  Files         434      432     -2     
  Lines       36466    36273   -193     
========================================
- Hits        35758    35570   -188     
+ Misses        708      703     -5     
Flag Coverage Δ
integration 98.06% <97.16%> (+<0.01%) ⬆️
latest-uploader-overall 98.06% <97.16%> (+<0.01%) ⬆️
unit 98.06% <97.16%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.96% <95.38%> (+<0.01%) ⬆️
OutsideTasks 98.07% <100.00%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
database/models/reports.py 99.50% <100.00%> (+<0.01%) ⬆️
database/tests/factories/reports.py 100.00% <100.00%> (ø)
...sks/tests/unit/test_test_results_processor_task.py 100.00% <100.00%> (ø)
tasks/process_flakes.py 98.46% <91.66%> (-1.54%) ⬇️
tasks/tests/unit/test_process_flakes.py 98.55% <98.21%> (-0.14%) ⬇️
tasks/test_results_processor.py 98.19% <96.15%> (ø)

... and 1 file with indirect coverage changes

@codecov-qa
Copy link

codecov-qa bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 97.16312% with 4 lines in your changes missing coverage. Please review.

Project coverage is 98.06%. Comparing base (e63488d) to head (478e425).
Report is 2 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
tasks/test_results_processor.py 96.15% 2 Missing ⚠️
tasks/process_flakes.py 91.66% 1 Missing ⚠️
tasks/tests/unit/test_process_flakes.py 98.21% 1 Missing ⚠️

Impacted file tree graph

@@           Coverage Diff            @@
##             main     #705    +/-   ##
========================================
  Coverage   98.05%   98.06%            
========================================
  Files         434      432     -2     
  Lines       36466    36273   -193     
========================================
- Hits        35758    35570   -188     
+ Misses        708      703     -5     
Flag Coverage Δ
integration 98.06% <97.16%> (+<0.01%) ⬆️
latest-uploader-overall 98.06% <97.16%> (+<0.01%) ⬆️
unit 98.06% <97.16%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.96% <95.38%> (+<0.01%) ⬆️
OutsideTasks 98.07% <100.00%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
database/models/reports.py 99.50% <100.00%> (+<0.01%) ⬆️
database/tests/factories/reports.py 100.00% <100.00%> (ø)
...sks/tests/unit/test_test_results_processor_task.py 100.00% <100.00%> (ø)
tasks/process_flakes.py 98.46% <91.66%> (-1.54%) ⬇️
tasks/tests/unit/test_process_flakes.py 98.55% <98.21%> (-0.14%) ⬇️
tasks/test_results_processor.py 98.19% <96.15%> (ø)

... and 1 file with indirect coverage changes

Copy link

codecov bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 97.16312% with 4 lines in your changes missing coverage. Please review.

Project coverage is 98.11%. Comparing base (e63488d) to head (478e425).
Report is 2 commits behind head on main.

Changes have been made to critical files, which contain lines commonly executed in production. Learn more

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
tasks/test_results_processor.py 96.15% 2 Missing ⚠️
tasks/process_flakes.py 91.66% 1 Missing ⚠️
tasks/tests/unit/test_process_flakes.py 98.21% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff            @@
##             main     #705    +/-   ##
========================================
  Coverage   98.10%   98.11%            
========================================
  Files         475      473     -2     
  Lines       37820    37627   -193     
========================================
- Hits        37105    36917   -188     
+ Misses        715      710     -5     
Flag Coverage Δ
integration 98.06% <97.16%> (+<0.01%) ⬆️
latest-uploader-overall 98.06% <97.16%> (+<0.01%) ⬆️
unit 98.06% <97.16%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 96.08% <95.38%> (+<0.01%) ⬆️
OutsideTasks 98.07% <100.00%> (+<0.01%) ⬆️
Files with missing lines Coverage Δ
database/models/reports.py 99.50% <100.00%> (+<0.01%) ⬆️
database/tests/factories/reports.py 100.00% <100.00%> (ø)
...sks/tests/unit/test_test_results_processor_task.py 100.00% <100.00%> (ø)
tasks/process_flakes.py 98.46% <91.66%> (-1.54%) ⬇️
tasks/tests/unit/test_process_flakes.py 98.55% <98.21%> (-0.14%) ⬇️
tasks/test_results_processor.py 98.19% <96.15%> (ø)

... and 1 file with indirect coverage changes

Related Entrypoints
run/app.tasks.test_results.TestResultsProcessor

Copy link

codecov-public-qa bot commented Sep 12, 2024

Codecov Report

Attention: Patch coverage is 97.16312% with 4 lines in your changes missing coverage. Please review.

Project coverage is 98.06%. Comparing base (e63488d) to head (478e425).

✅ All tests successful. No failed tests found.

Impacted file tree graph

@@           Coverage Diff            @@
##             main     #705    +/-   ##
========================================
  Coverage   98.05%   98.06%            
========================================
  Files         434      432     -2     
  Lines       36466    36273   -193     
========================================
- Hits        35758    35570   -188     
+ Misses        708      703     -5     
Flag Coverage Δ
integration 98.06% <97.16%> (+<0.01%) ⬆️
latest-uploader-overall 98.06% <97.16%> (+<0.01%) ⬆️
unit 98.06% <97.16%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
NonTestCode 95.96% <95.38%> (+<0.01%) ⬆️
OutsideTasks 98.07% <100.00%> (+<0.01%) ⬆️
Files Coverage Δ
database/models/reports.py 99.50% <100.00%> (+<0.01%) ⬆️
database/tests/factories/reports.py 100.00% <100.00%> (ø)
...sks/tests/unit/test_test_results_processor_task.py 100.00% <100.00%> (ø)
tasks/process_flakes.py 98.46% <91.66%> (-1.54%) ⬇️
tasks/tests/unit/test_process_flakes.py 98.55% <98.21%> (-0.14%) ⬇️
tasks/test_results_processor.py 98.19% <96.15%> (ø)

... and 1 file with indirect coverage changes

Copy link

This PR includes changes to shared. Please review them here: codecov/shared@c9d8b87...403d5a2

@joseph-sentry joseph-sentry requested review from ajay-sentry and a team September 20, 2024 18:08
@joseph-sentry joseph-sentry marked this pull request as ready for review September 20, 2024 18:09
if flake.recent_passes_count == FLAKE_EXPIRY_COUNT:
flake.end_date = dt.datetime.now(tz=dt.UTC)
flake.end_date = test_instance.created_at

flake.save()


def upsert_failed_flake(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should try and be persistent about ordering of params across functions belonging to the same module

test_instance,flake,repo_id vs. test_instance,repo_id,flake

And Ideally the most common params occur first

# retroactively mark newly caught flake as flaky failure in its rollup
rollup = DailyTestRollup.objects.filter(
repoid=repo_id,
date=test_instance.created_at.date(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious about this .date() fn, what type does that cast to?

Saw it in the log as well and wondering why we'd want to cast it there vs just returning test_instance.created_at

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

date is a PG date type and created_at is a PG timestamp with TZ type, in python the equivalents are datetime.date and datetime.datetime. The date() method is getting the date component of the datetime.datetime

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it! Thanks for the clarification

@@ -68,8 +74,21 @@ def run_impl(
testrun_dict_list = []
upload_list = []

f = db_session.query(Flake).all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically this is flakes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops, this is leftover from me debugging i think

@@ -68,8 +74,21 @@ def run_impl(
testrun_dict_list = []
upload_list = []

f = db_session.query(Flake).all()

flakes = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is repo flakes

+ duration_seconds
) / (
def update_daily_total():
daily_totals[test_id]["last_duration_seconds"] = duration_seconds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a comment here for what the formula is? It's a bit hard to parse otherwise

else 0,
"skip_count": 1 if outcome == str(Outcome.Skip) else 0,
"flaky_fail_count": 1
if test_id in flaky_test_set and outcome == str(Outcome.Failure)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would an error be considered a flake too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah for flaky test detection we consider an error a failure, me omitting Error here is a mistake

db_session.flush()

# Upsert Daily Test Totals
rollup_table = DailyTestRollup.__table__
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may need you to run me through this on Monday 😅

Particularly why we need to do the calculations again on conflict; I figured they'd be computed in daily_totals

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah the reason we do them is because the dictionary is just an intermediate holding container, if we came across two test instances that map to the same rollup, we can't rely on the database to do the aggregating for us because there's a restriction that an insert on conflict do update can't have two entries that would insert/update the same row. So in the dictionary above we're duplicating this logic

rollup.pass_count += 1
case TestInstance.Outcome.SKIP.value:
rollup.skip_count += 1
case _:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this identical to "default" in other languages?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep

branch=rs.repo.branch,
)

traveller.stop()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it checking for code coverage in a test file lol

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i actually don't know

Copy link
Contributor

@ajay-sentry ajay-sentry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, nice work man!

when we detect a test as newly flaky:
    - we want to increment the flaky_fail_count of the rollup of the
      test instance that was used to detect the flakiness by 1
    - we want to increment the flaky_fail_count of the rollups for
      every test instance on that test that failed that was processed
      after the test instance where the flake was detected
when we process a test instance and its test is flaky and it's a
failure then we want to increment the flaky_fail_count of its daily
rollup
@joseph-sentry joseph-sentry added this pull request to the merge queue Sep 25, 2024
Merged via the queue into main with commit a2d2110 Sep 25, 2024
24 of 27 checks passed
@joseph-sentry joseph-sentry deleted the joseph/update-flaky-fail-count branch September 25, 2024 13:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants