feat: flake detection post risky migration changes #525

joseph-sentry · 2024-06-26T16:08:21Z

These changes should be committed after the risky migrations in the
flake detection shared change are run.

- Add the reduced_error foreign key to the test instance SQLAlchemy models.
- Add the process flakes task, which takes a repoid and a list of commit ids as input and upserts Flake objects in the DB based on the test instances on those commits. This task implements the flakiness heuristic.
- Call the process flakes task in sync pulls on merged commits.
- Call the process flakes task in the test results finisher when we receive test results from the main branch.

codecov-notifications · 2024-06-26T16:14:31Z

Codecov Report

Attention: Patch coverage is 99.04762% with 3 lines in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

@@            Coverage Diff             @@
##             main     #525      +/-   ##
==========================================
+ Coverage   97.49%   97.50%   +0.01%     
==========================================
  Files         418      420       +2     
  Lines       35009    35320     +311     
==========================================
+ Hits        34131    34440     +309     
- Misses        878      880       +2

Flag	Coverage Δ
integration	`97.50% <99.04%> (+0.01%)`	⬆️
latest-uploader-overall	`97.50% <99.04%> (+0.01%)`	⬆️
unit	`97.50% <99.04%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`94.58% <100.00%> (+0.03%)`	⬆️
OutsideTasks	`97.75% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
database/models/reports.py	`99.46% <100.00%> (+<0.01%)`	⬆️
tasks/process_flakes.py	`100.00% <100.00%> (ø)`
tasks/sync_pull.py	`98.87% <100.00%> (+0.03%)`	⬆️
tasks/test_results_finisher.py	`97.01% <100.00%> (+0.09%)`	⬆️
tasks/test_results_processor.py	`99.31% <ø> (ø)`
tasks/tests/unit/test_sync_pull.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_process_flakes.py	`98.54% <98.54%> (ø)`

... and 1 file with indirect coverage changes

codecov-qa · 2024-06-26T16:14:33Z

Codecov Report

Attention: Patch coverage is 99.04762% with 3 lines in your changes missing coverage. Please review.

Project coverage is 97.50%. Comparing base (8eb5cc4) to head (91165d1).

✅ All tests successful. No failed tests found.

@@            Coverage Diff             @@
##             main     #525      +/-   ##
==========================================
+ Coverage   97.49%   97.50%   +0.01%     
==========================================
  Files         418      420       +2     
  Lines       35009    35320     +311     
==========================================
+ Hits        34131    34440     +309     
- Misses        878      880       +2

Flag	Coverage Δ
integration	`97.50% <99.04%> (+0.01%)`	⬆️
latest-uploader-overall	`97.50% <99.04%> (+0.01%)`	⬆️
unit	`97.50% <99.04%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`94.58% <100.00%> (+0.03%)`	⬆️
OutsideTasks	`97.75% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
database/models/reports.py	`99.46% <100.00%> (+<0.01%)`	⬆️
tasks/process_flakes.py	`100.00% <100.00%> (ø)`
tasks/sync_pull.py	`98.87% <100.00%> (+0.03%)`	⬆️
tasks/test_results_finisher.py	`97.01% <100.00%> (+0.09%)`	⬆️
tasks/test_results_processor.py	`99.31% <ø> (ø)`
tasks/tests/unit/test_sync_pull.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_process_flakes.py	`98.54% <98.54%> (ø)`

... and 1 file with indirect coverage changes

codecov-public-qa · 2024-06-26T16:14:53Z

Codecov Report

Attention: Patch coverage is 99.04762% with 3 lines in your changes missing coverage. Please review.

Project coverage is 97.50%. Comparing base (8eb5cc4) to head (91165d1).

✅ All tests successful. No failed tests found ☺️

@@            Coverage Diff             @@
##             main     #525      +/-   ##
==========================================
+ Coverage   97.49%   97.50%   +0.01%     
==========================================
  Files         418      420       +2     
  Lines       35009    35320     +311     
==========================================
+ Hits        34131    34440     +309     
- Misses        878      880       +2

Flag	Coverage Δ
integration	`97.50% <99.04%> (+0.01%)`	⬆️
latest-uploader-overall	`97.50% <99.04%> (+0.01%)`	⬆️
unit	`97.50% <99.04%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`94.58% <100.00%> (+0.03%)`	⬆️
OutsideTasks	`97.75% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
database/models/reports.py	`99.46% <100.00%> (+<0.01%)`	⬆️
tasks/process_flakes.py	`100.00% <100.00%> (ø)`
tasks/sync_pull.py	`98.87% <100.00%> (+0.03%)`	⬆️
tasks/test_results_finisher.py	`97.01% <100.00%> (+0.09%)`	⬆️
tasks/test_results_processor.py	`99.31% <ø> (ø)`
tasks/tests/unit/test_sync_pull.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_process_flakes.py	`98.54% <98.54%> (ø)`

... and 1 file with indirect coverage changes

codecov · 2024-06-26T16:16:30Z

Codecov Report

Attention: Patch coverage is 99.04762% with 3 lines in your changes missing coverage. Please review.

Project coverage is 97.53%. Comparing base (8eb5cc4) to head (91165d1).

Changes have been made to critical files, which contain lines commonly executed in production. Learn more

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #525      +/-   ##
==========================================
+ Coverage   97.51%   97.53%   +0.01%     
==========================================
  Files         449      451       +2     
  Lines       35732    36043     +311     
==========================================
+ Hits        34844    35153     +309     
- Misses        888      890       +2

Flag	Coverage Δ
integration	`97.50% <99.04%> (+0.01%)`	⬆️
latest-uploader-overall	`97.50% <99.04%> (+0.01%)`	⬆️
unit	`97.50% <99.04%> (+0.01%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`94.63% <100.00%> (+0.03%)`	⬆️
OutsideTasks	`97.75% <100.00%> (+<0.01%)`	⬆️

Files	Coverage Δ
database/models/reports.py	`99.46% <100.00%> (+<0.01%)`	⬆️
tasks/process_flakes.py	`100.00% <100.00%> (ø)`
tasks/sync_pull.py	`98.87% <100.00%> (+0.03%)`	⬆️
tasks/test_results_finisher.py	`97.76% <100.00%> (+0.06%)`	⬆️
tasks/test_results_processor.py	`99.31% <ø> (ø)`
tasks/tests/unit/test_sync_pull.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_test_results_finisher.py	`100.00% <100.00%> (ø)`
tasks/tests/unit/test_process_flakes.py	`98.54% <98.54%> (ø)`

... and 1 file with indirect coverage changes

This change has been scanned for critical changes. Learn more

giovanni-guidini · 2024-07-03T10:31:17Z

tasks/process_flakes.py

+def update_passed_flakes(flake: Flake):
+    flake.count += 1
+    flake.recent_passes_count += 1
+    if flake.recent_passes_count == 30:


Is this number (30) arbitrary? It might be a good candidate for a constant value with a name.

giovanni-guidini · 2024-07-03T10:32:12Z

tasks/process_flakes.py

+    return flake_dict
+
+
+def update_passed_flakes(flake: Flake):


[nit] can you make explicit that the function returns nothing, for consistency

def update_passed_flakes(flake: Flake) -> None:

giovanni-guidini · 2024-07-03T10:32:45Z

tasks/process_flakes.py

+    flake.save()
+
+
+def upsert_failed_flake(test_instance: TestInstance, repo_id, flake: Flake | None):


[nit] missing some typehints

giovanni-guidini · 2024-07-03T10:35:01Z

tasks/process_flakes.py

+    flakes = Flake.objects.filter(repository_id=repo_id, end_date__isnull=True).all()
+    flake_dict = dict()
+    for flake in flakes:
+        flake_dict[flake.test_id] = flake


Is it possible for a test_id to have more than 1 flake active at the same time?
(in which case one of them would be overwritten)

joseph-sentry (reply, Jul 3, 2024): In this version of the code, flakes are unique for every repo and test, and only one is ever "active" (end_date is None) at a time.
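
To make the overwrite concern concrete, here is a small sketch: a dict keyed by test_id silently keeps only the last flake when two share a key. `FakeFlake` and `build_flake_dict` are hypothetical stand-ins, not the PR's code, and the `strict` variant is just one possible guard in case the uniqueness invariant is ever violated.

```python
from typing import NamedTuple


class FakeFlake(NamedTuple):
    """Hypothetical stand-in for an active Flake row."""

    id: int
    test_id: str


def build_flake_dict(flakes, strict=False):
    flake_dict = {}
    for flake in flakes:
        if strict and flake.test_id in flake_dict:
            raise ValueError(f"more than one active flake for test {flake.test_id}")
        # Without the guard, a later flake overwrites an earlier one.
        flake_dict[flake.test_id] = flake
    return flake_dict
```

If the one-active-flake-per-test rule holds, both modes behave identically; the strict mode only makes a broken invariant fail loudly.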

giovanni-guidini · 2024-07-03T10:39:17Z

tasks/tests/unit/test_process_flakes.py

+        self.repo = RepositoryFactory()
+        self.repo.save()
+        self.test_count = 0
+        self.branch_name = 0


[nit] I didn't expect branch_name to be an integer

joseph-sentry (reply, Jul 3, 2024): It's an incrementing integer that gets converted to a string. I suppose it should be called branch_number instead.


michelletran-codecov

Generally LGTM. I have a slight concern about querying and loading, e.g., 50,000 tests into memory (it might be fine, but we'll see). We might want to consider paginating the updates for the test instances and running smaller batch queries against the flakes table. But we can see how this performs before we add that.
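
A rough sketch of the batching idea, in case it becomes necessary: work through items in fixed-size chunks so that only one chunk is held in memory at a time. The helper name `iter_batches` is hypothetical. On the Django side, `QuerySet.iterator(chunk_size=...)` is an existing way to stream rows, though whether it suits this task is untested here.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[list[T]]:
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each batch could then drive one smaller query against the flakes table rather than a single query covering every test.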

michelletran-codecov · 2024-07-03T18:11:53Z

tasks/process_flakes.py

+log = logging.getLogger(__name__)
+
+
+FlakeDict = dict[Any, Flake]


The database creates a text object for testid, so let's type it appropriately.

Suggested change

-FlakeDict = dict[Any, Flake]
+FlakeDict = dict[str, Flake]

michelletran-codecov · 2024-07-03T18:19:56Z

tasks/process_flakes.py

+                elif (
+                    test_instance.outcome == TestInstance.Outcome.FAILURE.value
+                    or test_instance.outcome == TestInstance.Outcome.ERROR.value
+                ):


less repetitive:

Suggested change

-                elif (
-                    test_instance.outcome == TestInstance.Outcome.FAILURE.value
-                    or test_instance.outcome == TestInstance.Outcome.ERROR.value
-                ):
+                elif (
+                    test_instance.outcome in (TestInstance.Outcome.FAILURE.value, TestInstance.Outcome.ERROR.value)
+                ):
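
A self-contained sketch of the membership-test form, for comparison. The `Outcome` enum and its string values are hypothetical stand-ins for `TestInstance.Outcome`, and `FAILED_OUTCOMES` and `is_failed` are illustrative names only.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical stand-in for TestInstance.Outcome; the values are assumed."""

    PASS = "pass"
    FAILURE = "failure"
    ERROR = "error"


# Grouping the failing outcomes once avoids repeating the comparison.
FAILED_OUTCOMES = (Outcome.FAILURE.value, Outcome.ERROR.value)


def is_failed(outcome: str) -> bool:
    return outcome in FAILED_OUTCOMES
```

The `in` check is equivalent to the chained `==` / `or` comparison and keeps working unchanged if another failing outcome is added to the tuple.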

michelletran-codecov · 2024-07-03T18:33:16Z

tasks/tests/unit/test_process_flakes.py

+
+
+@time_machine.travel(dt.datetime.now(tz=dt.UTC), tick=False)
+def test_it_works_when_processing_commits_together(transactional_db):  # TODO


nit: name of test should be more specific about what "works". So, it looks like it's testing multiple commits having the same failed tests?

Also, do we still need the TODO? (ditto for the TODO comments in the tests below).

michelletran-codecov · 2024-07-03T18:47:54Z

tasks/process_flakes.py

+
+
+def generate_flake_dict(repo_id: int) -> FlakeDict:
+    flakes = Flake.objects.filter(repository_id=repo_id, end_date__isnull=True).all()


Similar comment as in #524 (comment). Are we expecting to make better use of the index in the future?

joseph-sentry (reply, Jul 3, 2024): This query may need a separate index: while we may use the reduced_error_id field in the future, we will definitely not be using the test_id.

michelletran-codecov · 2024-07-03T18:52:37Z

tasks/process_flakes.py

+    flake.save()
+
+
+def upsert_failed_flake(test_instance: TestInstance, repo_id: int, flake: Flake | None):


missing return type

michelletran-codecov · 2024-07-03T19:02:30Z

tasks/process_flakes.py

+                    upserted_flake = upsert_failed_flake(test_instance, repo_id, flake)
+                    if flake is None:
+                        flake_dict[upserted_flake.test_id] = upserted_flake
+


Can we also log success of this task? It would be useful for debugging.
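
One possible shape for such a log line, as a sketch only. The helper name, the message text, and the `extra` keys are hypothetical and are not taken from the PR.

```python
import logging

log = logging.getLogger(__name__)


def log_flake_processing_success(repo_id: int, commit_ids: list[str]) -> None:
    """Hypothetical helper: emit one info-level line when the task finishes."""
    log.info(
        "Successfully processed flakes",
        extra={"repoid": repo_id, "commit_count": len(commit_ids)},
    )
```

Passing identifiers through `extra` keeps the message constant, which tends to make the line easier to search for when debugging.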

joseph-sentry force-pushed the joseph/flakes-post-risky branch from 909572a to c4d5be6 Compare June 27, 2024 17:09

joseph-sentry changed the title ~~feat: flake detection post risky migraiton changes~~ feat: flake detection post risky migration changes Jun 27, 2024

joseph-sentry force-pushed the joseph/flakes-post-risky branch from c4d5be6 to ce8d1bc Compare July 2, 2024 17:13

joseph-sentry marked this pull request as ready for review July 2, 2024 18:42

joseph-sentry requested a review from a team July 2, 2024 18:42

joseph-sentry force-pushed the joseph/flakes-post-risky branch from ce8d1bc to 77ec075 Compare July 2, 2024 18:54

giovanni-guidini reviewed Jul 3, 2024


joseph-sentry force-pushed the joseph/flakes-post-risky branch from 77ec075 to 20dd278 Compare July 3, 2024 13:22

joseph-sentry added 3 commits July 3, 2024 10:57

fix: address feedback

21bdfa1

Signed-off-by: joseph-sentry <[email protected]>

fix: address feedback in tests

26d4cad

change branch_name to branch_number because it was confusing

Signed-off-by: joseph-sentry <[email protected]>

joseph-sentry force-pushed the joseph/flakes-post-risky branch from 20dd278 to 26d4cad Compare July 3, 2024 14:59

joseph-sentry requested a review from michelletran-codecov July 3, 2024 16:15

michelletran-codecov reviewed Jul 3, 2024


fix: address feedback and add metrics

91165d1

michelletran-codecov approved these changes Jul 3, 2024


joseph-sentry enabled auto-merge July 3, 2024 20:16

joseph-sentry added this pull request to the merge queue Jul 3, 2024

Merged via the queue into main with commit a920fba Jul 3, 2024
29 of 30 checks passed

joseph-sentry deleted the joseph/flakes-post-risky branch July 3, 2024 20:27
