Update zfill to match Python output #11634
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## branch-22.10 #11634 +/- ##
===============================================
Coverage ? 86.41%
===============================================
Files ? 145
Lines ? 22993
Branches ? 0
===============================================
Hits ? 19869
Misses ? 3124
Partials ? 0
// if the string starts with a sign, output the sign first
if (!d_str.empty() && (*in_ptr == '-' || *in_ptr == '+')) {
  *out_ptr++ = *in_ptr++;
  d_str = string_view{in_ptr, d_str.size_bytes() - 1};
}
This change is the actual meat of this PR, right? The rest just looks like ancillary cleanup.
That's correct. Sometimes I cannot help myself, especially when cleaning up code I wrote a while back.
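The sign handling in the snippet under review can be sketched in Python for readers who want to check the intended output. This is a minimal illustration, not the libcudf kernel: `zfill_sign_first` is a hypothetical name, the early return for strings that are already wide enough is an assumption taken from Python's `str.zfill`, and the sketch counts characters rather than bytes.

```python
def zfill_sign_first(s: str, width: int) -> str:
    """Left-pad s with '0' up to width characters, keeping a leading sign first."""
    if len(s) >= width:
        return s  # already wide enough: nothing to fill (assumed, as in str.zfill)
    sign = ""
    if s and s[0] in "+-":
        # Mirror the reviewed change: emit the sign, then treat the rest
        # of the string as the part that receives the zero fill.
        sign, s = s[0], s[1:]
    fill = width - len(sign) - len(s)
    return sign + "0" * fill + s


# Compare against Python's built-in behavior on a few inputs.
for text in ["-1234", "+1234", "hello", "accént", "+", ""]:
    assert zfill_sign_first(text, 8) == text.zfill(8)
```

The loop at the end compares the sketch with the built-in `str.zfill`, which is the behavior this PR targets.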
["³", "⅕", ""],
pytest.param(
    ["hello", "there", "world", "+1234", "-1234", None, "accént", ""],
    marks=pytest.mark.xfail(
Once this PR is updated, @galipremsagar can you remove this in #11617?
Yup, that's the plan. Waiting on this PR to be merged to un-xfail here: https://github.com/rapidsai/cudf/pull/11617/files#diff-abb1dc78dc17e099ea2d1f20a54e8b078edcf87e80865527b3a8e9660a66effcR1772
@davidwendt is this ready to merge?
@gpucibot merge
This PR introduces `pandas-1.5` support in `cudf`. The changes include:

- [x] Requires `group_keys` support in `groupby` for `dask_cudf` to work: #11659
- [x] Requires `zfill` updates to match `pandas-1.5` behavior: #11634
- [x] `where` API: Ability to inspect a scalar value if it can be fit into the existing dtype, similar to: pandas-dev/pandas#48373
- [x] Switches `ValueError` to `TypeError` when an unknown category is being set to a `CategoricalColumn`
- [x] Handles breaking change of an `ArrowIntervalType` related import that has resulted in `cudf` to error on import itself.
- [x] Fix an issue with `IntervalColumn.to_pandas`.
- [x] Raises error when an object of `boolean` dtype is being set to a `NumericalColumn`.
- [x] Raises error when `pat` is None in `Series.str.startswith` & `Series.str.endswith`.
- [x] Add `IntervalDtype.to_pandas` with appropriate versioning.
- [x] Handle `get_window_bounds` signature changes.
- [x] Fix and version a bunch of pytests.

```python
branch-22.10:
== 4275 failed, 79837 passed, 2049 skipped, 1193 xfailed, 1923 xpassed, 6597 warnings, 4 errors in 1103.52s (0:18:23) ==
== 803 failed, 106 passed, 14 skipped, 14 xfailed, 324 warnings, 17 errors in 148.46s (0:02:28) ==

This PR:
== 84041 passed, 2049 skipped, 1199 xfailed, 1710 xpassed, 6599 warnings in 359.27s (0:05:59) ==
== 954 passed, 14 skipped, 7 xfailed, 3 xpassed, 580 warnings in 54.75s ==
```

Authors:
- GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
- Ashwin Srinath (https://github.com/shwina)
- Matthew Roeschke (https://github.com/mroeschke)
- Mark Sadang (https://github.com/msadang)

URL: #11617
Description
Fixes `cudf::strings::zfill` to match Python's `zfill` behavior. This will match Pandas 1.5 `zfill` as well.
The new behavior correctly skips the leading sign character when applying the '0' character fill.
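For reference, the target behavior can be seen with Python's built-in `str.zfill` alone. The short example below contrasts it with `str.rjust`, used here only as a stand-in for a fill that ignores the sign; it is not a claim about how the previous libcudf code was written.

```python
values = ["+1234", "-1234", "1234", "hello", ""]
width = 8

# str.zfill keeps a leading '+' or '-' in front of the zero fill.
sign_aware = [v.zfill(width) for v in values]

# str.rjust pads on the left without looking at the first character.
plain_pad = [v.rjust(width, "0") for v in values]

for value, filled, padded in zip(values, sign_aware, plain_pad):
    print(f"{value!r:>8}  zfill={filled!r}  rjust={padded!r}")
```

Only the two inputs with a leading sign differ between the two columns: `zfill` yields `+0001234` and `-0001234`, while `rjust` yields `000+1234` and `000-1234`.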
Updated the gtests and added more test data.
The pytest was updated to xfail for test data with leading sign characters until Pandas 1.5 is supported.
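A version-gated xfail of this kind could look roughly like the sketch below. The names `PANDAS_VERSION`, `version_tuple`, `PANDAS_LT_150` and `ZFILL_CASES` are hypothetical, and the `condition` argument is an assumption; the actual test in this PR may mark the case unconditionally.

```python
import pytest

# Hypothetical stand-in for the installed pandas version; a real test
# module would read pandas.__version__ instead of hard-coding a string.
PANDAS_VERSION = "1.4.4"


def version_tuple(version: str) -> tuple:
    """Parse the leading 'major.minor' part of a version string into ints."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


PANDAS_LT_150 = version_tuple(PANDAS_VERSION) < (1, 5)

ZFILL_CASES = [
    # Data without leading signs is expected to pass on any version.
    ["hello", "there", "world", None, "accént", ""],
    # Data with leading signs is expected to fail until pandas 1.5 is in use.
    pytest.param(
        ["hello", "there", "world", "+1234", "-1234", None, "accént", ""],
        marks=pytest.mark.xfail(
            condition=PANDAS_LT_150,
            reason="leading-sign data needs pandas 1.5 zfill behavior",
        ),
    ),
]
```

A list like `ZFILL_CASES` would then be passed to `pytest.mark.parametrize`. Once the supported pandas version reaches 1.5 the condition becomes false and the mark can be dropped, which is the un-xfail step discussed for #11617.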
The Java tests did not include any test data with sign characters.
Closes #11632
Checklist