Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ref(MDC): EntityKey enum to class #3109

Merged
merged 18 commits into from
Sep 9, 2022

Conversation

rahul-kumar-saini
Copy link
Contributor

@rahul-kumar-saini rahul-kumar-saini commented Sep 1, 2022

Overview

  • Replace EntityKey enum with a simple class which will be slowly phased out as more entities are made into config

Most of this PR consists of updated import statements:

from snuba.datasets.entities import EntityKey -> from snuba.datasets.entities.entity_key import EntityKey

Relevant file changes to review are the following:

  • snuba/datasets/configuration/entity_builder.py
    • Entity Keys are registered as they are loaded from config
  • snuba/datasets/entities/__init__.py
    • EntityKey enum removed
  • snuba/datasets/entities/entity_key.py
    • New EntityKey class created
  • snuba/datasets/entities/factory.py
    • Updated factory to load entities automatically instead of from hardcoded file paths
  • snuba/datasets/pluggable_entity.py
    • Replaced attribute name with entity_key
  • snuba/datasets/storages/storage_key.py
    • Added iterator in case StorageKeys need iterating
  • snuba/settings/__init__.py
    • Added file path glob for auto loading entity configs
  • snuba/subscriptions/entity_subscription.py
    • Call to init entity factory
  • tests/datasets/entities/test_entity_key.py

@codecov-commenter
Copy link

codecov-commenter commented Sep 6, 2022

Codecov Report

Base: 92.50% // Head: 92.84% // Increases project coverage by +0.33% 🎉

Coverage data is based on head (d117744) compared to base (624ba91).
Patch coverage: 95.54% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3109      +/-   ##
==========================================
+ Coverage   92.50%   92.84%   +0.33%     
==========================================
  Files         672      673       +1     
  Lines       30858    30873      +15     
==========================================
+ Hits        28545    28663     +118     
+ Misses       2313     2210     -103     
Impacted Files Coverage Δ
snuba/cli/entities.py 48.97% <50.00%> (ø)
snuba/datasets/pluggable_entity.py 90.19% <50.00%> (+0.19%) ⬆️
snuba/datasets/entities/factory.py 89.33% <66.66%> (+0.14%) ⬆️
snuba/datasets/storages/storage_key.py 96.77% <66.66%> (-3.23%) ⬇️
snuba/datasets/entities/entity_key.py 95.65% <95.65%> (ø)
snuba/cli/subscriptions_executor.py 54.23% <100.00%> (ø)
snuba/cli/subscriptions_scheduler.py 56.25% <100.00%> (ø)
snuba/cli/subscriptions_scheduler_executor.py 58.62% <100.00%> (ø)
snuba/datasets/cdc/groupassignee.py 100.00% <100.00%> (ø)
snuba/datasets/cdc/groupassignee_entity.py 100.00% <100.00%> (ø)
... and 111 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@rahul-kumar-saini rahul-kumar-saini marked this pull request as ready for review September 6, 2022 22:01
@rahul-kumar-saini rahul-kumar-saini requested a review from a team as a code owner September 6, 2022 22:01
Copy link
Contributor

@onewland onewland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me, did we test running subscription schedulers + executors + other CLI things that depend on entities? Sometimes that can catch errors

Comment on lines +11 to +14
assert REGISTERED_ENTITY_KEYS == {
"GENERIC_METRICS_DISTRIBUTIONS": "generic_metrics_distributions",
"GENERIC_METRICS_SETS": "generic_metrics_sets",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love that this is going to require changing for every entity added to snuba/datasets/configuration/. Not sure of a better way to test this behavior but I think that we should not assume the entity list will be so static

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will definitely need to figure this out, because it also implies that if a product person wants to add a new entity they will also have to edit this test. The point of the configuration is that changing them doesn't require Python changes.

Copy link
Member

@volokluev volokluev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some questions

snuba/subscriptions/entity_subscription.py Outdated Show resolved Hide resolved


def test_entity_key() -> None:
reset_entity_factory()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function only exist for testing? I can't imagine any other reason why it would need to.

If you need a blank entity factory for a test, just mock the _ent_factory function to return what you want

def _ent_factory() -> _EntityFactory:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this reset method shouldn't exist because it invites mutability into global state, which we should not have.

Copy link
Contributor Author

@rahul-kumar-saini rahul-kumar-saini Sep 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was introduced as part of #3065 and I just used it here, I can use initialize instead but the change to remove the reset function should be a different PR (cc @enochtangg)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was introduced purely for testing the multithreaded race condition. As opposed to spawning new threads and accessing the entity factory in a test case, the test was revised to ensure the entity factory initialization and the entity mapping attribute creation are not decoupled.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(off-topic for this PR)

It's best to keep testing functions within tests especially for things like this which work with global state. Having a function that alters global state, should not be used in normal operation of the system, in the public api of a module introduces more surface area for issues.

This is a usecase for the python mocking library which can inject these kinds of corner conditions into the runtime of the test but without changing the way the system operates

@@ -245,6 +245,7 @@

# File path glob for configs
STORAGE_CONFIG_FILES_GLOB = f"{CONFIG_FILES_PATH}/**/storages/*.yaml"
ENTITY_CONFIG_FILES_GLOB = f"{CONFIG_FILES_PATH}/**/entities/*.yaml"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to need a better way of doing this I think. I feel like this works now because there is only one folder in the config file path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the ** should recursively travel through all folders under CONFIG_FILES_PATH. If it doesn't work on the next dataset we config we can change it, but it should work.

Comment on lines +11 to +14
assert REGISTERED_ENTITY_KEYS == {
"GENERIC_METRICS_DISTRIBUTIONS": "generic_metrics_distributions",
"GENERIC_METRICS_SETS": "generic_metrics_sets",
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will definitely need to figure this out, because it also implies that if a product person wants to add a new entity they will also have to edit this test. The point of the configuration is that changing them doesn't require Python changes.

@rahul-kumar-saini rahul-kumar-saini merged commit 1f92bcb into master Sep 9, 2022
@rahul-kumar-saini rahul-kumar-saini deleted the rahul/ref/remove-entity-key-enum branch September 9, 2022 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants