-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: flake detection post risky migration changes #525
Conversation
Codecov ReportAttention: Patch coverage is ✅ All tests successful. No failed tests found. @@ Coverage Diff @@
## main #525 +/- ##
==========================================
+ Coverage 97.49% 97.50% +0.01%
==========================================
Files 418 420 +2
Lines 35009 35320 +311
==========================================
+ Hits 34131 34440 +309
- Misses 878 880 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found. @@ Coverage Diff @@
## main #525 +/- ##
==========================================
+ Coverage 97.49% 97.50% +0.01%
==========================================
Files 418 420 +2
Lines 35009 35320 +311
==========================================
+ Hits 34131 34440 +309
- Misses 878 880 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found @@ Coverage Diff @@
## main #525 +/- ##
==========================================
+ Coverage 97.49% 97.50% +0.01%
==========================================
Files 418 420 +2
Lines 35009 35320 +311
==========================================
+ Hits 34131 34440 +309
- Misses 878 880 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Codecov ReportAttention: Patch coverage is
Changes have been made to critical files, which contain lines commonly executed in production. Learn more ✅ All tests successful. No failed tests found. Additional details and impacted files@@ Coverage Diff @@
## main #525 +/- ##
==========================================
+ Coverage 97.51% 97.53% +0.01%
==========================================
Files 449 451 +2
Lines 35732 36043 +311
==========================================
+ Hits 34844 35153 +309
- Misses 888 890 +2
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 1 file with indirect coverage changes This change has been scanned for critical changes. Learn more |
909572a
to
c4d5be6
Compare
c4d5be6
to
ce8d1bc
Compare
ce8d1bc
to
77ec075
Compare
tasks/process_flakes.py
Outdated
def update_passed_flakes(flake: Flake): | ||
flake.count += 1 | ||
flake.recent_passes_count += 1 | ||
if flake.recent_passes_count == 30: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this number (30) arbitrary? I might be a good candidate for a constant value with a name
tasks/process_flakes.py
Outdated
return flake_dict | ||
|
||
|
||
def update_passed_flakes(flake: Flake): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] can you make explicit that the function returns nothing, for consistency
def update_passed_flakes(flake: Flake) -> None:
tasks/process_flakes.py
Outdated
flake.save() | ||
|
||
|
||
def upsert_failed_flake(test_instance: TestInstance, repo_id, flake: Flake | None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] missing some typehints
flakes = Flake.objects.filter(repository_id=repo_id, end_date__isnull=True).all() | ||
flake_dict = dict() | ||
for flake in flakes: | ||
flake_dict[flake.test_id] = flake |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible for a test_id to have more than 1 flake active at the same time?
(in which case one of them would be overwritten)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this version of the code the flakes are unique for every repo and test and there's only ever one that is "active" (end_date is None)
self.repo = RepositoryFactory() | ||
self.repo.save() | ||
self.test_count = 0 | ||
self.branch_name = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] I didn't expect branch_name
to be an integer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's an incrementing integer that gets converted to a string, I suppose it should be called branch_number
instead
77ec075
to
20dd278
Compare
These changes should be committed after the risky migrations in the flake detection shared change are run. - Add reduced_error foreign key to test instance sqlalchemy models - add the process flakes task which takes in a repoid and list of commit ids as input and upserts Flake objects in the DB based on the test instances on those commits. This task is what implements the flakiness heuristic. - call the process flakes task in sync pulls on merged commits - call the process flakes task in the test results finisher when we receive test results from the main branch
Signed-off-by: joseph-sentry <[email protected]>
change branch_name to branch_number because it was confusing Signed-off-by: joseph-sentry <[email protected]>
20dd278
to
26d4cad
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM. I have a slight concern about querying and loading i.e. 50,000 tests in memory (might be fine, but we'll see). We might want to consider doing paginated updates for the tests instances and doing smaller batch queries for the flakes table. But we can see how this performs before we add that.
tasks/process_flakes.py
Outdated
log = logging.getLogger(__name__) | ||
|
||
|
||
FlakeDict = dict[Any, Flake] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The database creates a text object for testid
, so let's type it appropriately.
FlakeDict = dict[Any, Flake] | |
FlakeDict = dict[str, Flake] |
tasks/process_flakes.py
Outdated
elif ( | ||
test_instance.outcome == TestInstance.Outcome.FAILURE.value | ||
or test_instance.outcome == TestInstance.Outcome.ERROR.value | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
less repetitive:
elif ( | |
test_instance.outcome == TestInstance.Outcome.FAILURE.value | |
or test_instance.outcome == TestInstance.Outcome.ERROR.value | |
): | |
elif ( | |
test_instance.outcome in (TestInstance.Outcome.FAILURE.value, tInstance.Outcome.ERROR.value) | |
): |
|
||
|
||
@time_machine.travel(dt.datetime.now(tz=dt.UTC), tick=False) | ||
def test_it_works_when_processing_commits_together(transactional_db): # TODO |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: name of test should be more specific about what "works". So, it looks like it's testing multiple commits having the same failed tests?
Also, do we still need the TODO
? (ditto for the TODO
comments in the tests below).
|
||
|
||
def generate_flake_dict(repo_id: int) -> FlakeDict: | ||
flakes = Flake.objects.filter(repository_id=repo_id, end_date__isnull=True).all() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment as in #524 (comment) . Are we expecting to make better use of the index in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this query may need a separate index, because while we may be using the reduced_error_id
field in the future we will definitely not be using the test_id
tasks/process_flakes.py
Outdated
flake.save() | ||
|
||
|
||
def upsert_failed_flake(test_instance: TestInstance, repo_id: int, flake: Flake | None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing return type
upserted_flake = upsert_failed_flake(test_instance, repo_id, flake) | ||
if flake is None: | ||
flake_dict[upserted_flake.test_id] = upserted_flake | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also log success of this task? It would be useful for debugging.
These changes should be committed after the risky migrations in the
flake detection shared change are run.
ids as input and upserts Flake objects in the DB based on the test
instances on those commits. This task is what implements the
flakiness heuristic.
receive test results from the main branch