fix: Change DefaultNoneColumnMapper to use a normal set #3580
Dataset configuration doesn't support complex objects for registered classes like this. In order for the discover entity to be migrated to YAML, there needs to be a different way to initialize this class that can be encoded in YAML. This mapper checks whether a column name exists in the ColumnSet. That check compares against the flattened column names stored in the ColumnSet, so the same result can be achieved by comparing against a normal set instead of a ColumnSet.
I'm not sure this will take us in the direction we want to go in. The problem is more that the DefaultNoneColumnMapper needs to take the columns of the two tables that it's creating a merge table of. Is the plan to just re-enumerate all those columns in the discover entity YAML?
Codecov Report
Base: 92.20% // Head: 92.37% // Increases project coverage by +0.16%

Additional details and impacted files
@@ Coverage Diff @@
## master #3580 +/- ##
==========================================
+ Coverage 92.20% 92.37% +0.16%
==========================================
Files 733 733
Lines 33963 33974 +11
==========================================
+ Hits 31317 31382 +65
+ Misses 2646 2592 -54
That's what I was thinking. The list of transaction/event columns is going to need to be enumerated somewhere, since in theory the

That list of event/transaction specific columns is referenced in the class itself, but just as a convenience (instead of spelling out all the columns it does

I think the Discover entity would list out all the columns (common, event, transaction) as its schema, and then in the mappers define which ones are event specific vs. transaction specific. That will result in a lot of lines of configuration, but more closely maps to how these values are actually used.
@@ -148,7 +149,7 @@ class DefaultNoneColumnMapper(ColumnMapper):
     the discover dataset file.
     """

-    columns: ColumnSet
+    columns: set[str]
- Can we make this a list and then check uniqueness in the __post_init__? The YAML syntax for sets is pretty ugly.
- Can we fix the docstring and explain that the columns list is a list of strings mapping to the column names?
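
A minimal sketch of what the review suggestion above could look like, assuming a frozen dataclass. The class name `DefaultNoneColumnMapperSketch`, the `_lookup` field, the `maps_to_none` method, and the column names used in the usage note are all hypothetical stand-ins for illustration; this is not the actual Snuba implementation.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class DefaultNoneColumnMapperSketch:
    """Hypothetical stand-in for the mapper discussed in this PR.

    `columns` is a list of strings, each one a flattened column name.
    A list is used because it is easy to write in YAML; uniqueness is
    enforced in __post_init__ instead of by the container type.
    """

    columns: List[str]
    # Cached set for membership checks; not part of the constructor.
    _lookup: FrozenSet[str] = field(init=False, repr=False, compare=False)

    def __post_init__(self) -> None:
        # Reject duplicated names so the list behaves like a set.
        duplicates = sorted({c for c in self.columns if self.columns.count(c) > 1})
        if duplicates:
            raise ValueError(f"duplicate column names: {duplicates}")
        # The dataclass is frozen, so go through object.__setattr__.
        object.__setattr__(self, "_lookup", frozenset(self.columns))

    def maps_to_none(self, column_name: str) -> bool:
        # Plain membership check against the flattened column names.
        return column_name in self._lookup
```

With placeholder names, `DefaultNoneColumnMapperSketch(columns=["group_id", "transaction_name"])` would accept the list as written, while passing the same name twice raises a `ValueError` at construction time, so a configuration mistake surfaces when the entity is loaded rather than at query time.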
Blast Radius
This should only concern anyone who works with the Discover entity.
Also, nothing should change when this is merged.