chore: backfill commit data to storage #78
Codecov Report
@@           Coverage Diff            @@
##             main      #78    +/-   ##
========================================
  Coverage   98.47%   98.48%
========================================
  Files         362      364     +2
  Lines       26560    26797   +237
========================================
+ Hits        26156    26392   +236
- Misses        404      405     +1
... and 1 file with indirect coverage changes
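The totals in the report can be checked by hand: in both columns hits plus misses equals lines, and coverage is hits divided by lines. Below is a small sketch of that arithmetic. The helper name `coverage_percent` is made up for illustration, and truncating to two decimals is only an inference from these two columns (rounding would give 98.48% and 98.49%), not something the report states.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, truncated (not rounded) to two decimals.

    Truncation is an assumption inferred from the figures in this report.
    """
    return math.floor(hits / lines * 10_000) / 100


# Figures copied from the Coverage Diff table above.
base = {"lines": 26560, "hits": 26156, "misses": 404}
head = {"lines": 26797, "hits": 26392, "misses": 405}

for column in (base, head):
    # Sanity check: every tracked line is either a hit or a miss.
    assert column["hits"] + column["misses"] == column["lines"]

print(coverage_percent(base["hits"], base["lines"]))  # 98.47
print(coverage_percent(head["hits"], head["lines"]))  # 98.48
```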
Now that we are writing data to GCS with an acceptable degree of success, we want to backfill data from existing commits. For that purpose we are introducing a new task, which these changes implement. closes codecov/engineering-team#189
)
return {"success": False, "errors": [BackfillError.missing_data.value]}

def handle_all_report_rows(
I believe there's already a method to do all this in the report service:
worker/services/report/__init__.py
Line 124 in 8012c86
async def initialize_and_save_report(
Are we able to use that code? Or is there a subtle difference here?
The subtle difference I found, which made me almost copy the code from there, is that initialize_and_save_report calls save_full_report, and that actually creates an Upload instance. We don't have one that isn't in the database, so I thought that was a no-no.
So the version here just does save_report.
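To make the distinction in this reply concrete, here is a minimal sketch using toy stand-ins. `ToyReportService`, its method signatures, and the two driver functions are hypothetical and only mirror the behaviour described above (the full save also creates an upload record, the plain save does not); they are not the worker's real code.

```python
from dataclasses import dataclass, field


@dataclass
class ToyReportService:
    """Toy stand-in for the report service; signatures are assumptions."""

    saved_reports: list = field(default_factory=list)
    created_uploads: list = field(default_factory=list)

    def save_report(self, commit_id, report):
        # Persists the report only.
        self.saved_reports.append((commit_id, report))

    def save_full_report(self, commit_id, report):
        # Persists the report and also records a new upload,
        # which is the side effect the backfill wants to avoid.
        self.save_report(commit_id, report)
        self.created_uploads.append(commit_id)


def initialize_and_save(service, commit_id, report):
    """Path taken by the existing helper, as described in the thread."""
    service.save_full_report(commit_id, report)


def backfill(service, commit_id, report):
    """Path taken by the new task: save the report, create no upload."""
    service.save_report(commit_id, report)


existing = ToyReportService()
initialize_and_save(existing, "abc123", {"files": {}})

backfilled = ToyReportService()
backfill(backfilled, "abc123", {"files": {}})

print(len(existing.created_uploads), len(backfilled.created_uploads))  # 1 0
```

Both paths end up with one saved report; only the first leaves an extra upload behind.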
One little question, but otherwise looks good
db_session.add(report_details)
db_session.flush()

repo_yaml = get_repo_yaml(commit.repository)
Should this be get_current_yaml instead so that it takes into account the commit's YAML?
I don't think it matters for the operations we are doing, and using the repo_yaml saves us a request to the git provider.
However, from a correctness point of view, given that we will be working mostly with old commits, I guess the most accurate option would be to use only the commit YAML (more recent changes have probably been merged into the repo YAML already, and might have been made in the owner YAML).
What do you think is best?
I think you're right that it doesn't matter. Looks like there are 2 methods being called here:
report_service.get_existing_report_for_commit_from_legacy_data
report_service.save_report
I just looked through both and AFAICT neither relies on self.current_yaml in any way. You could probably even pass ReportService({}) here.
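One way to back up a claim like this with a test is to hand the service a mapping that fails on any read. The sketch below uses toy stand-ins: `PoisonedYaml`, `ToyReportService`, `load_legacy_report`, and `yaml_dependent_call` are hypothetical names, and the method signatures are assumptions rather than the real ReportService API. The guard only covers `[]` and `.get` lookups.

```python
class PoisonedYaml(dict):
    """Mapping that fails loudly on reads, to show it is never consulted."""

    def __getitem__(self, key):
        raise AssertionError(f"current_yaml was read: {key!r}")

    def get(self, key, default=None):
        raise AssertionError(f"current_yaml was read: {key!r}")


class ToyReportService:
    """Toy stand-in; the real service's signatures are not shown in this PR."""

    def __init__(self, current_yaml):
        self.current_yaml = current_yaml

    def load_legacy_report(self, commit_id):
        # Stand-in for reading the legacy report; never touches the YAML.
        return {"commit": commit_id, "files": {}}

    def save_report(self, commit_id, report):
        # Stand-in for persisting the report; never touches the YAML.
        return {"commit": commit_id, "saved": True}

    def yaml_dependent_call(self):
        # Counter-example: a method that does read the YAML.
        return self.current_yaml.get("coverage")


service = ToyReportService(PoisonedYaml())
report = service.load_legacy_report("abc123")
result = service.save_report("abc123", report)
print(result["saved"])  # True: neither call read current_yaml
```

If either method started reading the YAML later, the same test would fail with the AssertionError instead of passing silently.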
Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.