Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(slices): Make Subscription Scheduler filter by slice ID #3338

Merged
merged 36 commits into from
Jan 9, 2023

Conversation

ayirr7
Copy link
Member

@ayirr7 ayirr7 commented Nov 3, 2022

For general context:
https://getsentry.atlassian.net/browse/SNS-1759

Up-to-date Approach for scheduling Subscriptions in a Sliced context:

  • Getting the partition key from the corresponding EntitySubscription
  • Mapping the partition key's value (some integer) to slice id (using partitioning functions)
  • Check if this slice id is equal to the slice id that is passed into the SubscriptionScheduler instance
  • If equal, keep the Subscription. Otherwise, filter out.
  • Ensure that tasks are being scheduled on a sliced (physical) subscriptions scheduler topic, if a slice_id is passed in

Updates: There was discussion around changing this approach, however, we are now deciding to move forward with this approach for now and will most likely move around/refactor once changes to EntitySubscription are merged.

@codecov-commenter
Copy link

codecov-commenter commented Nov 3, 2022

Codecov Report

Base: 92.25% // Head: 21.91% // Decreases project coverage by -70.33% ⚠️

Coverage data is based on head (64f18eb) compared to base (a6a41ef).
Patch coverage: 11.66% of modified lines in pull request are covered.

❗ Current head 64f18eb differs from pull request most recent head 61ddcb8. Consider uploading reports for the commit 61ddcb8 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           master    #3338       +/-   ##
===========================================
- Coverage   92.25%   21.91%   -70.34%     
===========================================
  Files         725      684       -41     
  Lines       33828    32451     -1377     
===========================================
- Hits        31207     7112    -24095     
- Misses       2621    25339    +22718     
Impacted Files Coverage Δ
snuba/cli/subscriptions_scheduler.py 0.00% <0.00%> (ø)
snuba/clickhouse/formatter/expression.py 33.33% <0.00%> (-61.49%) ⬇️
snuba/clickhouse/formatter/query.py 0.00% <0.00%> (-98.81%) ⬇️
snuba/datasets/entities/factory.py 0.00% <0.00%> (-92.11%) ⬇️
snuba/migrations/groups.py 95.61% <ø> (ø)
snuba/query/__init__.py 42.42% <ø> (-51.27%) ⬇️
snuba/subscriptions/scheduler.py 0.00% <0.00%> (-97.89%) ⬇️
snuba/subscriptions/scheduler_consumer.py 0.00% <0.00%> (-92.91%) ⬇️
...uba/subscriptions/scheduler_processing_strategy.py 0.00% <0.00%> (-90.59%) ⬇️
snuba/web/db_query.py 0.00% <ø> (-84.80%) ⬇️
... and 656 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@ayirr7 ayirr7 marked this pull request as ready for review November 4, 2022 22:17
@ayirr7 ayirr7 requested a review from a team as a code owner November 4, 2022 22:17
@evanh
Copy link
Member

evanh commented Nov 7, 2022

@enochtangg You should look at this, we'll need to factor this into the EntitySubscription changes.

@ayirr7 ayirr7 changed the title [wip]: Make Subscription Scheduler filter by slice ID feat(slices): Make Subscription Scheduler filter by slice ID Nov 7, 2022
partition_key_value = entity_sub.get_partitioning_key()

# map the partition key's value to the slice ID
logical_part = map_org_id_to_logical_partition(partition_key_value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like here you are assuming the partition key for entity_sub is always org_id. Is that always the case? Seems like this is tightly coupled to the implementation of entity_sub and would break if we change the partitioning key. SessionSubscription has an organization attribute, maybe we can use that to make it clear we are getting the org_id? Or maybe map_org_id_to_logical_partition can be refactored to be more general i.e map_patition_key_to_logical_partition

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point, and it makes sense. I think it would be good to change it to map_partition_key_to_logical_partition as you are suggesting.

Copy link
Member

@nikhars nikhars left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add slice_id tag to the MetricsWrapper when slice_id is passed via the CLI

@@ -321,6 +329,36 @@ def __reset_builder(self) -> None:
# We are transitioning between jittered and immediate mode. We must use the delegate builder.
self.__builder = self.__delegate_builder

def __get_filtered_subscriptions(self) -> List[Subscription]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would recommend that you pass the original list of subscriptions to this method rather than relying on self.__subscriptions and rename the method to __filter_subscriptions. Looking at the usage of self.__subscriptions I don't think there is value in having the private instance variable. You could probably get rid of it completely.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this makes sense. Updated.

@@ -321,6 +329,36 @@ def __reset_builder(self) -> None:
# We are transitioning between jittered and immediate mode. We must use the delegate builder.
self.__builder = self.__delegate_builder

def __get_filtered_subscriptions(self) -> List[Subscription]:
if self.__slice_id is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better if we move the slice_id checks in __get_subscriptions and call __get_filtered_subscriptions only in the case of slice_id being enabled.

@@ -88,6 +92,9 @@ def get_entity_subscription_conditions_for_snql(
def to_dict(self) -> Mapping[str, Any]:
return {"organization": self.organization}

def get_partitioning_key(self) -> int:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's implement this method for GenericMetricsSetsSubscription and GenericMetricsDistributionsSubscription only for now since they are the only entities which need slicing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, this makes sense

@ayirr7 ayirr7 requested a review from nikhars December 6, 2022 19:58
# get the metadata and org_id from the Subscription
sub_data = subscription.data
sub_metadata = sub_data.metadata
org_id = sub_metadata["organization"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to make access to the metadata fields be specific to an Entity? Not all subscriptions would have an organization field. So if tomorrow this method gets called for some other Entity, this would cause an Exception

Copy link
Member Author

@ayirr7 ayirr7 Dec 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I think we can add a condition for checking the EntityKey that corresponds with a Subscription. Where we add this condition depends:

  • if slice_id is not None, but the entity of the subscription is not within generic metrics, should we just return the regular list of subscriptions?

So, should we just restrict the filter step to Subscriptions that satisfy these conditions: (1) slice_id is not None (2) the EntityKey for the Subscription is within generic metrics

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets allow filtering to be called when slice_id is enabled. Within the filter method you can restrict it to specific EntityKey.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, that makes sense to me. Just to be clear, does this mean that if we have a slice_id enabled, but a non-generic metrics entity, we return all of the subscriptions (no filtering needed) or something else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The filter can just return the original subscriptions list for non-generic metrics entity

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Copy link
Contributor

@onewland onewland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@onewland onewland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly good with this approach (and tested it locally) but let's make sure there's a unit test on the filtering before we commit to master

Comment on lines 335 to 336
self.__entity_key == EntityKey.GENERIC_METRICS_SETS
or self.__entity_key == EntityKey.GENERIC_METRICS_DISTRIBUTIONS
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we check if the storage set for the given entity key is sliced using the global logic rather than hardcoding these here?

the entity key is constant after initialization so we should only have to test for that once

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, changed

@@ -321,6 +328,36 @@ def __reset_builder(self) -> None:
# We are transitioning between jittered and immediate mode. We must use the delegate builder.
self.__builder = self.__delegate_builder

def __filter_subscriptions(self) -> List[Subscription]:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra line

self.__entity_key == EntityKey.GENERIC_METRICS_SETS
or self.__entity_key == EntityKey.GENERIC_METRICS_DISTRIBUTIONS
):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: extra line

# get the metadata and org_id from the Subscription
sub_data = subscription.data
sub_metadata = sub_data.metadata
org_id = sub_metadata["organization"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we do something if organization is None? maybe we can emit a metric and skip the current subscription?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, done

@@ -321,6 +328,36 @@ def __reset_builder(self) -> None:
# We are transitioning between jittered and immediate mode. We must use the delegate builder.
self.__builder = self.__delegate_builder

def __filter_subscriptions(self) -> List[Subscription]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this should be split out of SubscriptionScheduler so that we can write unit tests for it. Maybe we could do something like

# def filter(subscriptions, slice_id): 
#   ...
SubscriptionFilter = Callable[[Sequence[Subscription], int], Sequence[Subscription]]

and there could be a filter argument to the constructor.

Even if we don't do this, I think one unit test should exist to make sure that the filtering works properly

@ayirr7 ayirr7 requested a review from onewland January 6, 2023 18:45
entity_key: EntityKey,
metrics: MetricsBackend,
slice_id: Optional[int] = None,
) -> List[Subscription]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
) -> List[Subscription]:
) -> MutableSequence[Subscription]:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think this type is copied from self.__subscriptions so it should probably be updated in both places or neither

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change it back!

if part_slice_id == slice_id:
filtered_subscriptions.append(subscription)
else:
metrics.increment("queries_with_orgID=None")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if datadog supports using = in tag names. But in general, you should avoid using it. You can rename this to something like subscription_with_empty_org_id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree with this

Copy link
Contributor

@onewland onewland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good except for the little things pointed out by Nikhar and me

importlib.reload(scheduler)

filtered_subs = filter_subscriptions(
subs, EntityKey.EVENTS, DummyMetricsBackend(strict=True), 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A bit of a nit but can we used named rather than positional arguments here? It's just not super obvious that 2 is the org_id

I think it's generally good practice to use named arguments if the count is greater than 3

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense! fixed

Comment on lines 33 to 36
# create a list of subscriptions
expected_subs = [build_subscription(timedelta(minutes=1), 2) for count in range(20)]
extra_subs = [build_subscription(timedelta(minutes=3), 1) for count in range(10)]
subs = expected_subs + extra_subs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's put these in the test method, or create fixture methods (https://docs.pytest.org/en/6.2.x/fixture.html#) to avoid future coupling

"""
Maps an org_id to a logical partition. Since SENTRY_LOGICAL_PARTITIONS is
Maps a partition key to a logical partition. Since SENTRY_LOGICAL_PARTITIONS is
fixed, an org id will always be mapped to the same logical partition.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doc comment and the method title still reference org_id, IMO we might just want to remove the rename, but I don't feel strongly

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is true. i can revert it and we can revisit later if needed

@ayirr7 ayirr7 merged commit 15d29cd into master Jan 9, 2023
@ayirr7 ayirr7 deleted the sliced-scheduler branch January 9, 2023 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants