Description
Is your feature request related to a problem? Please describe.
During #20291, we ran into issues with to_ast. This does two things:
- Validates whether the expression is supported, in the sense that we're able to translate it (e.g. there are some binary ops that aren't supported by the parquet filter). When we encounter an unsupported expression we raise a NotImplementedError and (optionally) fall back. This must happen at IR translation time.
- Does the actual translation, i.e. makes the pylibcudf.Expression. Some of the translations require (or can require) a CUDA stream because they make a plc.Scalar.
So we have a conflict. Validation needs to occur at IR translation time, when we don't have a stream at hand, but the translation itself may need one. Our workaround in that PR is to just create a new stream, do the operation, and synchronize it, all in Predicate.__init__.
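To make the conflict concrete, here is a rough, self-contained toy model of the coupled design. Everything in it is a stand-in I made up for illustration: FakeStream is not a real CUDA stream, expressions are plain tuples, SUPPORTED_OPS is an arbitrary set, and this to_ast is not the real cudf-polars function. It only shows why the constructor ends up owning a throwaway stream.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeStream:
    """Hypothetical stand-in for a CUDA stream; it only counts synchronize calls."""

    sync_count: int = 0

    def synchronize(self) -> None:
        self.sync_count += 1


# Records every stream the toy Predicate creates, so the cost is visible.
CREATED_STREAMS: list[FakeStream] = []

# Assumed set of translatable operators, for illustration only.
SUPPORTED_OPS = {"==", "<", ">"}


def to_ast(expr: tuple, stream: FakeStream) -> tuple:
    """Toy version of the coupled design: validate and translate in one pass."""
    op, column, literal = expr
    if op not in SUPPORTED_OPS:
        # Validation: must surface at IR translation time so we can fall back.
        raise NotImplementedError(f"unsupported operator: {op}")
    # Translation: building the scalar is the step that needs a stream.
    scalar = ("scalar", literal, id(stream))
    return (op, ("column", column), scalar)


class Predicate:
    """Toy model of the workaround: the constructor makes its own stream."""

    def __init__(self, expr: tuple) -> None:
        stream = FakeStream()
        CREATED_STREAMS.append(stream)
        self.ast = to_ast(expr, stream)
        stream.synchronize()  # the synchronize we would like to avoid
```

In this model every successfully constructed Predicate costs one extra stream and one synchronize, purely because validation and translation share a function.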
A natural question to ask is "can we split the validation from the translation?" I attempted that, but failed. I think it's still worth exploring, so I'll outline my attempt here:
- Define a new validate_to_ast transformer, similar to to_ast
- Move the pieces of the _to_ast implementations that raise NotImplementedError to this new validate_to_ast
- Call validate_to_ast in Predicate.__init__
- Store predicate in Predicate.__init__ instead of calling to_ast
- Make a new Predicate.to_ast method that calls to_ast to do the actual translation
- Call Predicate.to_ast inside of ConditionalJoin.do_evaluate, where we have a stream ready to go.
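The shape of that attempt can be sketched as below. This is a toy model under assumptions, not the code from my branch: expressions are plain tuples, the stream is any opaque object, SUPPORTED_OPS is an arbitrary set, and do_evaluate is a free function standing in for ConditionalJoin.do_evaluate.

```python
# Assumed set of translatable operators, for illustration only.
SUPPORTED_OPS = {"==", "<", ">"}


def validate_to_ast(expr: tuple) -> None:
    """Stream-free check: raise NotImplementedError for unsupported expressions."""
    op, _column, _literal = expr
    if op not in SUPPORTED_OPS:
        raise NotImplementedError(f"unsupported operator: {op}")


def to_ast(expr: tuple, stream: object) -> tuple:
    """Translation only: assumes validate_to_ast already accepted the expression."""
    op, column, literal = expr
    scalar = ("scalar", literal, stream)  # stand-in for building a scalar on a stream
    return (op, ("column", column), scalar)


class Predicate:
    def __init__(self, expr: tuple) -> None:
        validate_to_ast(expr)  # happens at IR translation time, no stream needed
        self.predicate = expr  # store the untranslated expression

    def to_ast(self, stream: object) -> tuple:
        # Deferred translation, run once a stream is available.
        return to_ast(self.predicate, stream)


def do_evaluate(predicate: Predicate, stream: object) -> tuple:
    """Stand-in for the evaluation step, which already has a stream."""
    return predicate.to_ast(stream)
```

One thing this shape makes obvious is that validate_to_ast and to_ast have to agree exactly on what is supported; if they drift, an expression can pass validation and still fail at evaluation time, when falling back is no longer possible.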
Some of the tests in tests/test_parquet_filters.py::test_scan_by_hand were failing with this setup. I think it's worth working through these to avoid that stream.synchronize().
Describe the solution you'd like
Similar functionality, with no stream synchronize.
Describe alternatives you've considered
I also wondered whether we could define something like an s-expression, where you delay creating the plc.Scalar until a time when you have a stream. Might work, but it gets a bit messy.
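A minimal sketch of that idea, again with made-up names: DeferredScalar, build_tree, and materialize are hypothetical, the tree is nested tuples rather than a pylibcudf.Expression, and the stream is any opaque object. Validation and tree construction happen up front; only the scalar leaves wait for a stream.

```python
from dataclasses import dataclass

# Assumed set of translatable operators, for illustration only.
SUPPORTED_OPS = {"==", "<", ">"}


@dataclass(frozen=True)
class DeferredScalar:
    """Hypothetical placeholder leaf: remembers the value, creates nothing yet."""

    value: object

    def materialize(self, stream: object) -> tuple:
        # Stand-in for creating the real scalar on the given stream.
        return ("scalar", self.value, stream)


def build_tree(expr: tuple) -> tuple:
    """Validate and build the tree without a stream; scalars stay deferred."""
    op, column, literal = expr
    if op not in SUPPORTED_OPS:
        raise NotImplementedError(f"unsupported operator: {op}")
    return (op, ("column", column), DeferredScalar(literal))


def materialize(node: object, stream: object) -> object:
    """Walk the tree and replace every DeferredScalar once a stream exists."""
    if isinstance(node, DeferredScalar):
        return node.materialize(stream)
    if isinstance(node, tuple):
        return tuple(materialize(child, stream) for child in node)
    return node  # operator names, column names, and other plain leaves
```

The messy part shows up even here: there are now two tree representations (deferred and materialized) and an extra pass between them.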