Is your feature request related to a problem? Please describe.:
As a user of the Parquet reader, I find pull request #17594 extremely useful. The metrics it reports describe how effective the filters are.
I'm wondering whether it would be reasonable to first expose these C++ metrics in pylibcudf. Python users of pylibcudf would then have access to the same information.
Describe the solution you'd like:
I believe the implementation could be straightforward, and I'm willing to take on the task if assigned. The change in python/pylibcudf/pylibcudf/io/types.pyx should look roughly like this:
However, there's one important point to note (which I think is also mentioned in the C++ comments): these metric variables are only valid for the Parquet reader. I'm unsure whether it would be necessary to provide additional documentation for pylibcudf users to clarify this limitation:
@mroeschke Hello! I spent some time working on this yesterday, but I ran into a strange issue when passing a C++ value through to Python in the .pyx file.
I have a simple implementation that converts a std::optional<size_t> to a Python int.
Under normal circumstances, when filtering, the std::optional<size_t> should hold a valid value, which is the expected behavior. ✅
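A minimal sketch of the intended mapping, written in plain Python rather than Cython so it runs without cudf. The class and field names are hypothetical and are not the pylibcudf API. It assumes an empty std::optional maps to None and a populated one to an int, which keeps "no filter applied" distinct from a genuine count of 0:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FilterMetricsSketch:
    """Hypothetical stand-in for reader metrics; names are illustrative only.

    Per the discussion above, such metrics would only be meaningful
    for the Parquet reader.
    """

    num_input_row_groups: int
    # None models an empty std::optional (no filter was applied);
    # 0 is a real count meaning the filter pruned every row group.
    num_row_groups_after_filter: Optional[int] = None

    def pruned_fraction(self) -> Optional[float]:
        """Share of row groups removed by the filter, or None if unknown."""
        if self.num_row_groups_after_filter is None:
            return None
        if self.num_input_row_groups == 0:
            return 0.0
        kept = self.num_row_groups_after_filter
        return 1.0 - kept / self.num_input_row_groups


no_filter = FilterMetricsSketch(num_input_row_groups=4)
all_pruned = FilterMetricsSketch(num_input_row_groups=4, num_row_groups_after_filter=0)
half_pruned = FilterMetricsSketch(num_input_row_groups=4, num_row_groups_after_filter=2)
```

With this shape, a value of 0 read from Python is ambiguous only if the optional's emptiness is discarded somewhere in the conversion, so checking for None first is one way to narrow down where a value goes missing.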
However, during the filtering process, the value received as a Python int is always 0, which doesn't match the size_t value in the C++ code, as I've verified by stepping through the stack in a debugger. With Pdb, I am able to reach the .pyx stack frame but could not print anything:
I've been unable to figure out where the value is getting lost because there shouldn't be any loss in this conversion. If you have some free time, could you take a look at my commit? Since this bug is present, I haven't submitted a pull request yet.
> If you have some free time, could you take a look at my commit?
Your implementation looks OK so far. I would suggest you open a PR with your changes and the test case where this always returns 0, as that would make it easier to iterate and debug what you are seeing.
Code referenced in the issue description: cudf/cpp/include/cudf/io/types.hpp, lines 288 to 297 in d0e219e