Different handling label analysis if result already calculated #21

giovanni-guidini · 2023-07-19T14:50:17Z

This change is intended to work with the PATCHing of labels by the CLI after collecting them.
Currently doing that doesn't trigger another task in the worker. SO if the labels arraive after we finished calculating we are left with incomplete results on the database. That is fine as the CLI will calculate again.

I think this change might have been a bit premature, but the idea is to have a different way of handling the calculation if we already have results. Then we simply calculate the final result using the saved labels and add in the requested labels that we might not have had the first time around. This would be much much faster than calculating everything again.

Legal Boilerplate

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

codecov · 2023-07-19T14:55:41Z

Codecov Report

Merging #21 (03509a2) into main (3056f09) will decrease coverage by 0.01%.
The diff coverage is 95.45%.

@@            Coverage Diff             @@
##             main      #21      +/-   ##
==========================================
- Coverage   98.48%   98.48%   -0.01%     
==========================================
  Files         364      364              
  Lines       26797    26818      +21     
==========================================
+ Hits        26392    26412      +20     
- Misses        405      406       +1

Flag	Coverage Δ
integration	`98.45% <95.45%> (-0.01%)`	⬇️
latest-uploader-overall	`98.45% <95.45%> (-0.01%)`	⬇️
onlysomelabels	`98.48% <95.45%> (-0.01%)`	⬇️
unit	`98.45% <95.45%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
NonTestCode	`97.14% <92.30%> (-0.01%)`	⬇️
OutsideTasks	`98.25% <ø> (ø)`

Files Changed	Coverage Δ
tasks/label_analysis.py	`99.37% <92.30%> (-0.63%)`	⬇️
tasks/tests/unit/test_label_analysis.py	`100.00% <100.00%> (ø)`

This change has been scanned for critical changes. Learn more

scott-codecov

Not sure I entirely understand the workflow here but the code changes look OK to me.

So the CLI will PATCH the label analysis w/ additional labels and that triggers off another task that presumably hits this LabelAnalysisRequestState.FINISHED conditional. In that case we just combine the previous results with the calculated results for the new labels. Is that roughly correct?

This change is intended to work with the PATCHing of labels by the CLI after collecting them. Currently doing that doesn't trigger another task in the worker. SO if the labels arraive after we finished calculating we are left with incomplete results on the database. That is fine as the CLI will calculate again. I think this change might have been a bit premature, but the idea is to have a different way of handling the calculation if we already have results. Then we simply calculate the final result using the saved labels and add in the requested labels that we might not have had the first time around. This would be much much faster than calculating everything again.

Updating branch and making sure all tests are OK.

giovanni-guidini · 2023-09-11T16:26:46Z

Not sure I entirely understand the workflow here but the code changes look OK to me.

So the CLI will PATCH the label analysis w/ additional labels and that triggers off another task that presumably hits this LabelAnalysisRequestState.FINISHED conditional. In that case we just combine the previous results with the calculated results for the new labels. Is that roughly correct?

Yes, that's the idea.
The thing that happens today is:

CLI sends a POST request to api that creates the original LabelAnalysisRequest instance and triggers this task. Initially this has no requested_labels.

Later on - after collecting the labels present - the CLI will PATCH the labels in the api, but (right now) the task will not be re-triggered. This is not a problem for using ATS because the CLI can sort out what labels to include / remove locally.

The thing is that we want to have metrics around how many tests ATS is saving. For that reason we need to re-trigger the LabelAnalysis task with the larq that has already been calculated and update the result.

test_saved = (global_labels + absent_labels + present_diff_labels) / (absent_labels + present_report_labels)

From codecov/worker#21 we have a different way of handling updates for `LabelAnalysisRequest` objects. This is important because we want to use these objects to extract metrics from ATS. However to actually have the updated values we need to trigger the task again. These changes trigger the task again :3 closes codecov/engineering-team#456

giovanni-guidini requested a review from scott-codecov July 19, 2023 14:50

scott-codecov approved these changes Jul 20, 2023

View reviewed changes

giovanni-guidini added 2 commits September 11, 2023 12:58

chore: Update branch

03509a2

Updating branch and making sure all tests are OK.

giovanni-guidini force-pushed the gio/label-analysis-skip-if-complete branch from 1d26d8f to 03509a2 Compare September 11, 2023 16:14

giovanni-guidini requested a review from scott-codecov September 11, 2023 16:21

scott-codecov approved these changes Sep 11, 2023

View reviewed changes

giovanni-guidini merged commit 6d39922 into main Sep 11, 2023
9 of 12 checks passed

giovanni-guidini deleted the gio/label-analysis-skip-if-complete branch September 11, 2023 18:50

giovanni-guidini mentioned this pull request Sep 13, 2023

feat: trigger label analysis task after update codecov/codecov-api#131

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Different handling label analysis if result already calculated #21

Different handling label analysis if result already calculated #21

giovanni-guidini commented Jul 19, 2023

codecov bot commented Jul 19, 2023 •

edited

Loading

scott-codecov left a comment

giovanni-guidini commented Sep 11, 2023 •

edited

Loading

Different handling label analysis if result already calculated #21

Different handling label analysis if result already calculated #21

Conversation

giovanni-guidini commented Jul 19, 2023

Legal Boilerplate

codecov bot commented Jul 19, 2023 • edited Loading

Codecov Report

scott-codecov left a comment

Choose a reason for hiding this comment

giovanni-guidini commented Sep 11, 2023 • edited Loading

codecov bot commented Jul 19, 2023 •

edited

Loading

giovanni-guidini commented Sep 11, 2023 •

edited

Loading