allow for encoded labels in Report
#189
Conversation
Updates `shared` to the version where `Report` has a `_labels_index` and makes use of it in processing.
context: codecov/engineering-team #768. Creates a rollout for label compression. This is going to help us test and safely release the feature in the wild. Notice that currently the label compression does nothing; there are comments similar to `TODO: needs shared update`. The update in question is codecov/shared#79. So these changes mostly prep the terrain for future changes that will actually do something, and add the guardrails to avoid issues when deploying. In particular it adds some helper methods to the `ArchiveService`, creates a kinda-stubbed `LabelsIndexService`, and passes more data to the `_adjust_sessions` portion of `raw_upload_processor`, where most changes will occur. If you're curious what the end result will probably look like, see #180.
These changes extend the `Report` to use the `_labels_index`. Everything is controlled via a rollout, so in practice there's a single repo that will use the compressed labels, my sentry fork (giovanni-guidini/sentry), for testing purposes. There are 2 important facts here:
1. Given worker / shared pairs we can only have (old, old) or (new, new), where "old" denotes the code that doesn't do label encoding and "new" the code that does.
2. The (new, new) pair can handle the cases the (old, old) pair does plus the new case of label encoding, but the (old, old) pair will generate corrupted results if it tries to process an encoded report.

So it's important to roll out the (new, new) pair into prod _before_ turning the feature on for everyone. Which is what we are doing, so we're fine (probably). At first I was very much in the "let's tear everything down and build again" camp, but because the (new, new) pair needs to handle the unencoded versions as well I decided to make smaller changes so the code can co-exist in harmony.
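For readers new to the feature, a minimal sketch of the idea (the label strings and dict layout here are illustrative; only the `labels_index` name comes from this PR): instead of repeating full label strings on every datapoint, the report keeps one shared index and encoded datapoints reference labels by integer id.

```python
# Illustrative sketch only -- not the actual shared.reports implementation.
from typing import Dict, List

# One shared mapping per report: label id -> human-readable label.
labels_index: Dict[int, str] = {
    0: "codecov_all_labels_placeholder",  # illustrative placeholder label
    1: "tests/test_report.py::test_carryforward",
    2: "tests/test_report.py::test_merge",
}

# Unencoded datapoint: repeats the full label strings on every covered line.
unencoded_labels: List[str] = [
    "tests/test_report.py::test_carryforward",
    "tests/test_report.py::test_merge",
]

# Encoded datapoint: stores only label ids, resolved through labels_index.
encoded_labels: List[int] = [1, 2]


def decode(label_ids: List[int], index: Dict[int, str]) -> List[str]:
    """Resolve encoded label ids back to the label strings."""
    return [index[label_id] for label_id in label_ids]


assert decode(encoded_labels, labels_index) == unencoded_labels
```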
Codecov Report

@@            Coverage Diff             @@
##             main     #189      +/-   ##
==========================================
- Coverage   98.35%   98.35%   -0.01%
==========================================
  Files         353      357       +4
  Lines       27844    28576     +732
==========================================
+ Hits        27386    28105     +719
- Misses        458      471      +13

Flags with carried forward coverage won't be shown.
Codecov Report

Changes have been made to critical files, which contain lines commonly executed in production.

@@            Coverage Diff             @@
##             main     #189      +/-   ##
==========================================
- Coverage   98.31%   98.31%   -0.01%
==========================================
  Files         382      386       +4
  Lines       28489    29222     +733
==========================================
+ Hits        28010    28730     +720
- Misses        479      492      +13

Flags with carried forward coverage won't be shown.
Force-pushed from 1e257c1 to bb537db.
When generating the report the `labels_index` was being generated with the `SpecialLabel` in the index. That's a problem because: 1. It's different from the default index, where you have the corresponding label, so you end up having 2 indexes for the same label. 2. The enum value is not JSON serializable. By already using the corresponding label we solve both problems at once.
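A small sketch of the fix described in that commit message (the enum body and `add_label_to_index` below are illustrative stand-ins; `corresponding_label` is the attribute referenced elsewhere in this PR):

```python
import json
from enum import Enum


class SpecialLabelsEnum(Enum):
    # Stand-in for the enum in shared; the real placeholder string differs.
    CODECOV_ALL_LABELS_PLACEHOLDER = "codecov_all_labels_placeholder"

    @property
    def corresponding_label(self) -> str:
        return self.value


def add_label_to_index(label, labels_index: dict) -> int:
    # Store the plain string, never the enum member, so that:
    # 1. the placeholder shares one index entry with its corresponding label
    # 2. the index stays JSON serializable
    if isinstance(label, SpecialLabelsEnum):
        label = label.corresponding_label
    for idx, existing in labels_index.items():
        if existing == label:
            return idx
    new_idx = max(labels_index, default=-1) + 1
    labels_index[new_idx] = label
    return new_idx


index = {}
a = add_label_to_index(SpecialLabelsEnum.CODECOV_ALL_LABELS_PLACEHOLDER, index)
b = add_label_to_index("codecov_all_labels_placeholder", index)
assert a == b          # a single entry, not two indexes for the same label
json.dumps(index)      # serializes cleanly, no enum member in the values
```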
While processing reports they get merged into a temporary report (because there might be multiple reports in a single upload). The temporary report is what ultimately gets merged with the original one, so the labels need to be passed to it.
When carrying forward the report we also need to carry forward the label index, if the parent commit has one.
Force-pushed from a4febe6 to e004bd6.
Dang, this gets complicated. I almost wonder if it'd be easier to make sure the next label request (after this is deployed) just returns ALL labels and then we can just carry on from there using label IDs only. Not entirely sure how to make that happen and maybe there's additional nuance that makes such an approach infeasible. This is just a lot of logic to handle the 2 potential label encodings.
# 1. It's necessary for labels flags to be carryforward, so it's ok to carryforward the entire index
# 2. As tests are renamed the index might start to be filled with stale labels. This is not good.
#    but I'm unsure if we should try to clean it up at this point. Cleaning it up requires going through
#    all lines of the report. It might be better suited for an offline job.
I agree - could this happen in the upload finisher perhaps after triggering the notification task?
I'm unsure... certainly after the notification task. But I'm not sure about the upload finisher because it is scheduled for every upload task. It does have a lock for it, but maybe I'll have to implement a multi-reader / single-writer lock for the labels file?
My biggest concern with putting it in the upload finisher is that there might still be uploads being actively processed at the same time. I think the cleanup should happen a few hours after the last upload came in, to be extra sure.
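To make the cost of that cleanup concrete, a rough sketch of what an offline job could do (all names below are hypothetical): it has to visit every line's datapoints before any entry can safely be dropped from the index, which is why running it on every upload would be expensive.

```python
from typing import Dict, Iterable, List, Set


def collect_used_label_ids(datapoint_label_ids: Iterable[List[int]]) -> Set[int]:
    """Walk every datapoint in the report and collect the label ids still referenced."""
    used: Set[int] = set()
    for label_ids in datapoint_label_ids:
        used.update(label_ids)
    return used


def drop_stale_labels(labels_index: Dict[int, str], used_ids: Set[int]) -> Dict[int, str]:
    """Keep only the index entries that some line still points to."""
    return {idx: label for idx, label in labels_index.items() if idx in used_ids}


# Example: label 2 ("old_test_name") is no longer referenced by any line.
index = {0: "placeholder", 1: "tests/test_a.py::test_a", 2: "old_test_name"}
report_lines = [[1], [0, 1]]
cleaned = drop_stale_labels(index, collect_used_label_ids(report_lines))
assert cleaned == {0: "placeholder", 1: "tests/test_a.py::test_a"}
```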
def __init__(self, commit_report: CommitReport) -> None:
    self.commit_report = commit_report
Curious about this change - what's wrong with passing in the commit report?
There's nothing wrong with doing that per se, but it doesn't match all usages. In the carry-forward bit the `commit_report` doesn't exist yet - or I don't have access to it (for the parent commit) - and yet I create `LabelsIndexService`s for the commits. So I needed to be able to create them without the `commit_report`.
I have since taken Matt's suggestion around that bit of code though, so I can revert the code if preferred.
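For reference, a sketch of the constructor shape being discussed (the class bodies and signatures below are illustrative, not the actual diff): built from a commit directly, the service can be created for a parent commit even when no `CommitReport` object is at hand.

```python
from dataclasses import dataclass


@dataclass
class Commit:
    # Illustrative stand-in for the real Commit model.
    repoid: int
    commitid: str


class LabelsIndexService:
    # Hypothetical constructor shape: no CommitReport required, so the service
    # can be created for a parent commit during carryforward.
    def __init__(self, commit: Commit, report_code: str = "") -> None:
        self.commit = commit
        self.report_code = report_code


parent = Commit(repoid=1, commitid="a4febe6")
service = LabelsIndexService(parent)  # no CommitReport needed
```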
@@ -22,23 +22,54 @@ def matches_content(self, content, first_line, name) -> bool:
             and "files" in content
         )

-    def _convert_testname_to_label(self, testname, labels_table):
+    def _convert_testname_to_label(self, testname) -> str:
I find the `testname` terminology a little confusing since I wasn't sure what the difference between a label and a test name is. Maybe `_normalize_label` would describe this behavior?
I think that's a valid change. I'll do that.
Honestly I don't know what the exact difference between "label" and "test name" was meant to be (by Thiago).
My working mental model is that "label" is a more flexible term than "test name", because "test name" obviously denotes a specific test name, while a "label" might denote a group of tests and might not be generated by the testing tool at all.
@@ -173,7 +191,9 @@ def create_coverage_line(
         coverage,
         *,
         coverage_type: CoverageType,
-        labels_list_of_lists: typing.List[typing.Union[str, SpecialLabelsEnum]] = None,
+        labels_list_of_lists: Union[
+            List[Union[str, SpecialLabelsEnum]], List[int]
🤯
@@ -29,6 +31,8 @@
     SpecialLabelsEnum.CODECOV_ALL_LABELS_PLACEHOLDER.corresponding_label
 )

+GLOBAL_LEVEL_LABEL_IDX = 0
Seems like this should go deeper in the report service or `shared` or something. I saw `0` explicitly referenced elsewhere; that should maybe use this constant instead.
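A tiny sketch of the convention implied here (illustrative, assuming index 0 is reserved for the global "all labels" placeholder):

```python
# Illustrative sketch: index 0 is the reserved slot for the global
# "covers all labels" placeholder, referenced via the named constant
# rather than a literal 0.
GLOBAL_LEVEL_LABEL_IDX = 0


def init_labels_index(placeholder_label: str) -> dict:
    # Seed the reserved slot first so every report agrees on what id 0 means.
    return {GLOBAL_LEVEL_LABEL_IDX: placeholder_label}


labels_index = init_labels_index("codecov_all_labels_placeholder")
assert labels_index[GLOBAL_LEVEL_LABEL_IDX] == "codecov_all_labels_placeholder"
```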
@@ -40,6 +44,24 @@ class LinesRelevantToChange(TypedDict):
     files: Dict[str, Optional[LinesRelevantToChangeInFile]]


+class ExistingLabelSetsEncoded(NamedTuple):
What's the advantage of `NamedTuple` over using `@dataclass`?
No idea, really. But originally this was a tuple, so I thought I'd keep it similar to a tuple.
Apparently there are considerations to be made: https://stackoverflow.com/questions/51671699/data-classes-vs-typing-namedtuple-primary-use-cases
But the usage is quite small for this case in particular.
What I want out of this is typing info, to improve code readability.
I'm OK changing to dataclass if preferred.
i don't really understand what's happening, i should maybe pause and ramp up and then try reviewing again
if report_builder_session.should_use_label_index:
    label_list_of_lists = [
        self._get_list_of_label_ids(
            report_builder_session.label_index,
            file_coverage.get("contexts", {}).get(str(ln), []),
        )
    ]
else:
    label_list_of_lists = [
        [self._convert_testname_to_label(testname)]
        for testname in file_coverage.get("contexts", {}).get(
            str(ln), []
        )
    ]
top branch creates an outer list containing a single large inner list, bottom branch creates an outer list containing many single-entry inner lists. is it okay that they don't match?
That's a very good catch. There is a difference indeed.
This goes to `ReportBuilderSession.create_coverage_line` (here). On the top it creates a single datapoint with all labels. In the bottom it creates many datapoints, one label per datapoint.
This difference is crucial when updating the report with new uploads, in particular when we delete datapoints. It seems that the bottom one is more precise and keeps data available longer. Sadly it also uses more space. But I'll keep the same way it was before.
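A worked example of the difference being discussed (the values are made up; the two shapes mirror the top and bottom branches of the snippet above):

```python
# Illustrative values; the branch shapes mirror the snippet above.
contexts = ["tests/test_a.py::test_x", "tests/test_a.py::test_y"]
label_index = {1: "tests/test_a.py::test_x", 2: "tests/test_a.py::test_y"}

# Top branch (encoded path): one inner list containing every label id
# -> a single datapoint carrying both labels.
top_shape = [[1, 2]]

# Bottom branch (unencoded path): one single-entry inner list per label
# -> one datapoint per label.
bottom_shape = [[label] for label in contexts]
assert bottom_shape == [["tests/test_a.py::test_x"], ["tests/test_a.py::test_y"]]

# Why the shape matters: deleting one label removes every datapoint that
# contains it. With the single big list, removing test_x's datapoint also
# loses the fact that test_y covered the line; with per-label datapoints
# the remaining label survives.
```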
# It's OK to update this here because it's inside the
# UploadProcessing lock, so it's exclusive access
original_report._labels_index[new_index] = label_or_id
`_labels_index` is just an instance var, right? what would the race be?
is there an alternative to this that would not make parallel upload processing harder?
The race would be different instances of the same report (that all pull the `labels_index` from the same file) writing different things to it and then trying to save the report to the same destination. But because all changes happen only in the merge portion of processing it's fine. The merge needs to be in a lock anyway.
However I should be more careful that the report might be used for processing and non-processing purposes at the same time... and only save it into storage if it was modified.
Anyway, it doesn't really matter for parallel processing, because the uploads will be processed into temporary reports with their own indexes. You can even merge them into a single report to be merged with the original in parallel, just take care of the index (with the aux functions in this file). But ultimately you will have to put a lock on the original report when merging into it, right?
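For illustration, a sketch of the kind of remapping that happens when a temporary report's index meets the original one (the function and variable names are hypothetical, not the actual helpers in this file): new labels extend the original index, and that write is the part that has to stay inside the upload-processing lock.

```python
from typing import Dict


def merge_label_ids(original_index: Dict[int, str], temp_index: Dict[int, str]) -> Dict[int, int]:
    """Map each temp-report label id to the original report's id, extending
    the original index in place when a label hasn't been seen before."""
    label_to_id = {label: idx for idx, label in original_index.items()}
    next_id = max(original_index, default=-1) + 1
    remap: Dict[int, int] = {}
    for temp_id, label in temp_index.items():
        if label in label_to_id:
            remap[temp_id] = label_to_id[label]
        else:
            original_index[next_id] = label  # the write that must stay inside the lock
            label_to_id[label] = next_id
            remap[temp_id] = next_id
            next_id += 1
    return remap


original = {0: "placeholder", 1: "tests/test_a.py::test_x"}
temporary = {0: "placeholder", 1: "tests/test_b.py::test_new"}
assert merge_label_ids(original, temporary) == {0: 0, 1: 2}
assert original[2] == "tests/test_b.py::test_new"
```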
As pointed out by @matt-codecov, there was a difference: the old behavior creates an "outer list containing many single-entry inner lists" whereas the new behavior was creating an "outer list containing a single large inner list". This difference is subtle but it matters. Each of the inner lists becomes a datapoint with all the labels in the list. Later on, when [deleting labels](https://github.com/codecov/shared/blob/main/shared/reports/editable.py#L52) we delete a datapoint if _any_ of the labels to delete is in it. This means that the old behavior made the datapoints stick around for longer. I should probably investigate if we ever merge datapoints... not sure. Anyway, matching old behavior.
Addressing various review comments around renaming functions, updating comments and small refactors. Also updates shared again, where `_labels_index` was renamed to `labels_index`, given that it's accessed all the time. The biggest refactor here is around carrying forward the labels index. As suggested we are skipping the `LabelsIndexService` and directly using the `ArchiveService` to copy the file from the parent commit to the child commit.
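A sketch of that carryforward path (the archive class, method names, and file layout below are illustrative stand-ins, not the real `ArchiveService` API): the child commit simply gets a byte-for-byte copy of the parent's index file, with no `LabelsIndexService` involved.

```python
class InMemoryArchive:
    # Illustrative stand-in for the archive service: a blob store keyed by path.
    def __init__(self) -> None:
        self._storage = {}

    def read_file(self, path: str) -> bytes:
        return self._storage[path]

    def write_file(self, path: str, data: bytes) -> None:
        self._storage[path] = data


def labels_index_path(commit_sha: str) -> str:
    # Hypothetical layout for where a commit's labels index file lives.
    return f"labels_index/{commit_sha}.json"


def carryforward_labels_index(archive: InMemoryArchive, parent_sha: str, child_sha: str) -> None:
    # Copy the parent's labels index file to the child commit as-is.
    data = archive.read_file(labels_index_path(parent_sha))
    archive.write_file(labels_index_path(child_sha), data)


archive = InMemoryArchive()
archive.write_file(labels_index_path("a4febe6"), b'{"0": "placeholder"}')
carryforward_labels_index(archive, "a4febe6", "e004bd6")
assert archive.read_file(labels_index_path("e004bd6")) == b'{"0": "placeholder"}'
```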
`labels_index` is used in the following scenarios: 1. merging reports during processing 2. label_analysis. In these cases we load it, use it, unload it. Unloading includes saving it to storage again, and that is how changes are persisted. ONLY scenario 1 actually makes changes to it. Now imagine that for some reason there's a label-analysis request that takes a long time with the index open (unlikely), and in that period an upload is processed (unlikely). Then we would lose the updated index when label-analysis re-writes it to storage. To prevent that without a proper locking mechanism we can just let label_analysis NOT save it back. And because the merging scenario already runs inside a lock we are fine :)
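A sketch of that load / use / unload pattern with the "label analysis does not save" rule made explicit (the classes, method names, and paths below are hypothetical, not the current `LabelsIndexService`):

```python
import json


class InMemoryArchive:
    # Illustrative blob store standing in for the real archive storage.
    def __init__(self) -> None:
        self._storage = {}

    def read_file(self, path: str) -> bytes:
        return self._storage[path]

    def write_file(self, path: str, data: bytes) -> None:
        self._storage[path] = data


class LabelsIndexSession:
    """Hypothetical load / use / unload wrapper. Only the caller that may
    mutate the index (report merging, inside its lock) writes it back on
    unload; label analysis loads it read-only and never saves."""

    def __init__(self, archive, path: str, save_on_unload: bool) -> None:
        self.archive = archive
        self.path = path
        self.save_on_unload = save_on_unload
        self.index = None

    def load(self) -> dict:
        self.index = json.loads(self.archive.read_file(self.path))
        return self.index

    def unload(self) -> None:
        # Scenario 1 (merging, under the upload lock) persists changes;
        # scenario 2 (label analysis) skips the write so a slow read-only
        # session can never clobber an index updated in the meantime.
        if self.save_on_unload:
            self.archive.write_file(self.path, json.dumps(self.index).encode())
        self.index = None


archive = InMemoryArchive()
archive.write_file("labels_index/abc.json", b'{"0": "placeholder"}')
analysis = LabelsIndexSession(archive, "labels_index/abc.json", save_on_unload=False)
analysis.load()
analysis.unload()  # nothing written back
```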
This PR will be replaced by another one.