Fix filter parsing bug #9508
Thank you for your pull request! We could not find a changelog entry for this change. For details on how to document a change, see the contributing guide.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #9508 +/- ##
==========================================
- Coverage 87.97% 87.93% -0.05%
==========================================
Files 167 167
Lines 22171 22171
==========================================
- Hits 19506 19496 -10
- Misses 2665 2675 +10
This seems reasonable to me based on my minuscule knowledge of Mashumaro (where the from_dict method apparently comes from), but I'm not a maintainer here so I'll let someone from the core team do the approving/merging.
core/dbt/contracts/graph/unparsed.py
@@ -564,7 +564,7 @@ def __bool__(self):
 @dataclass
 class UnparsedMetricInputMeasure(dbtClassMixin):
     name: str
-    filter: Optional[Union[str, List[str]]] = None
+    filter: Union[Optional[str], Optional[List[str]]] = None
Do we want to use an explicit None in the type spec instead of putting an Optional on everything? Union[str, List[str], None] or str | List[str] | None is a little less repetitive, assuming either of those works.
Sure that works too! Just tested to confirm.
@tlento I thought that too, but if you go into the definition in Mashumaro, it's just the type stubs. Took me forever to track this down as a result 😅 but turns out
Well I just learned something today. There is no from_dict method on a Python dataclass, which is why I was so confused at first, and that type stub is indeed non-functional as written, but I figured there was something that Mashumaro was doing to populate the body with some type-specific extraction. Turns out the library adds it on the fly via a dynamic codegen + compile step invoked from the DataclassDictMixin. Consequently, I have no idea what the resulting Were you able to actually get to the from_dict method body in your investigation? I'd be curious to see what it looks like, since my feeble human brain is not able to assemble the recursively-generated code. Wild stuff, either way.
You can see the generated code by enabling
Currently, Union types are handled naively by trying to use a handler for each variant type in the loop. Values of type If you still have any questions or need help, I invite you to open an issue for discussion.
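To make the "try each variant handler in a loop" idea concrete, here is a minimal sketch in plain Python. It is not Mashumaro's generated code: `parse_union`, the three `as_*` handlers, and `FILTER_HANDLERS` are hypothetical names invented for illustration, and the real library may validate values differently.

```python
from typing import Any, Callable, List, Sequence


def as_str(value: Any) -> str:
    # Hypothetical handler: accept only real strings.
    if not isinstance(value, str):
        raise TypeError("not a str")
    return value


def as_str_list(value: Any) -> List[str]:
    # Hypothetical handler: accept only lists whose items are all strings.
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise TypeError("not a list of str")
    return list(value)


def as_none(value: Any) -> None:
    # Hypothetical handler: accept only None.
    if value is not None:
        raise TypeError("not None")
    return None


def parse_union(value: Any, handlers: Sequence[Callable[[Any], Any]]) -> Any:
    # Naive strategy: walk the variants in declaration order and return the
    # result of the first handler that does not raise.
    for handler in handlers:
        try:
            return handler(value)
        except (TypeError, ValueError):
            continue
    raise ValueError(f"no union variant accepted {value!r}")


# Handlers listed in the same order as Union[str, List[str], None].
FILTER_HANDLERS = (as_str, as_str_list, as_none)
```

With strict handlers like these, a string stays a string and a list stays a list. If a handler were more permissive (for example, one that coerced any value into a list), the order of the variants would decide the result, which is why a test that pins the expected shape is useful.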
We should absolutely get this in. Thank you for doing this work ❤️ We should add a test, though. As @Fatal1ty mentioned, the typing-based parser in Mashumaro will be seeing some changes in the future, so we want to guarantee that the filter continues to be parsed correctly as the things we depend on change how they operate. A good place to add a test would be in test_metrics.py.
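As a rough sketch of what such a regression test could assert, the snippet below checks that a filter comes back from `from_dict` with the same shape it went in with. Everything here is an assumption for illustration: `MeasureStandIn` is a hypothetical stand-in with a hand-written `from_dict`, not dbt's `UnparsedMetricInputMeasure`, and the filter strings are made up. A real test would import the dbt class and live alongside the existing metric tests.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union


@dataclass
class MeasureStandIn:
    """Hypothetical stand-in for UnparsedMetricInputMeasure (not dbt's class)."""

    name: str
    filter: Union[str, List[str], None] = None

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "MeasureStandIn":
        # Hand-written placeholder; the real method is generated by Mashumaro.
        return cls(name=data["name"], filter=data.get("filter"))


def filter_shape_preserved(raw_filter: Union[str, List[str], None]) -> bool:
    # The property under test: a str stays a str, a list stays a list,
    # and None stays None after deserialization.
    parsed = MeasureStandIn.from_dict({"name": "revenue", "filter": raw_filter})
    return type(parsed.filter) is type(raw_filter) and parsed.filter == raw_filter


def test_filter_shapes() -> None:
    assert filter_shape_preserved("is_active = true")
    assert filter_shape_preserved(["is_active = true", "amount > 0"])
    assert filter_shape_preserved(None)
```

Swapping `MeasureStandIn` for the real class turns this into a check that would fail if a string filter were ever parsed as a list again.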
@@ -564,7 +564,7 @@ def __bool__(self):
 @dataclass
 class UnparsedMetricInputMeasure(dbtClassMixin):
     name: str
-    filter: Optional[Union[str, List[str]]] = None
+    filter: Union[str, List[str], None] = None
Nit: We should probably add a comment about what's going on here. I can imagine a future person like me accidentally changing it back to Optional during a refactor, as that is our usual pattern.
Good call, updated!
99f8e09 to f95123e
Added tests! @QMalcolm thank you for pointing me to where those should be! Note that in the process of writing tests, I discovered a separate bug. If you try to use a list filter that includes jinja on a metric or an input measure, you'll get an error when you run
Gets this error: I'll be putting up a separate issue for that bug. But that's the reason you don't see any test cases for input measures or metrics with list filters. The same bug does not exist for filters on saved queries, oddly, so I was able to include list filter tests for those.
f95123e to a97fecd
I'm not sure why the artifacts check is failing in CI - it looks like the action itself is erroring? But LMK if there's something I need to fix there!
The failing CI check is one we've added recently. I've added the label Context: we want to strongly control changes to
Looks good to me! Thank you for adding the tests ❤️
resolves #9507
Problem
Currently, if you pass a string into a filter YAML param, it will be assumed to be a list. This means your semantic manifest ends up with a list of strings for where filters. For example, this YAML:
would look like this in the semantic manifest:
This is because the dataclass from_dict() method only works for unions if Union is the outermost type annotation. We had it nested in an Optional annotation, so this fixes that.
Solution
Checklist