feat(expr-ir): Finish* implementing `ArrowExpr` #3325
Merged
dangotbanned merged 216 commits into `oh-nodes` on Dec 14, 2025
Quite odd behavior for scalar lol
Will come back to this later to shrink
Discovered while adding `kurtosis` test which had an empty series
Adding the `skew` test revealed this "edge case"
The rest will allow them to be used in `group_by`
Indirectly adds support for `over` too, but haven't added tests yet
- Still needs tests
- Also unsure what the scalar behavior should be for `mode_all`
I wanna try rewriting this without `numpy` after getting the tests in place
Got quite a few more ideas to experiment with
Each of these is expensive + this version is simpler
Managed to write it with one less `if_else`, but the readability suffered so this will do
I've tried adding the `not_implemented` 3 times now and kept forgetting why it wasn't there yet
`ArrowSeries.struct.unnest` depends on this for backcompat. I'd rather this was covered in all cases
TIL: `pyarrow.compute.and_not` exists
dangotbanned commented on Dec 13, 2025
I added some extra cases that I hadn't considered while debugging. Good news is they failed
The tiniest of fixes
dangotbanned commented on Dec 14, 2025
Related issues

- ExprIR #2572
- `{DataFrame,Series}.explode(empty_as_nulls, keep_nulls)` #3347
- `linear_space` #3349

Tracking

- `JoinOptions` in `binary_join` to handle nulls apache/arrow#48477

Description
This PR is a bit of a mixed bag.
I've really tried to focus on getting an implementation for all of the methods in the `Expr` namespace.
Man, there are a lot of them now 😅

I have a pretty long list, so things are grouped and highlighted by what was the most interesting to me.
Tip

- ✨ - New feature (either for `pyarrow` as a backend or `narwhals` itself)
- 💾 - Refactor from `main` (particularly trying to avoid unconditional-dependence on `numpy`)

Show top-level functions
- `{all,any}_horizontal(ignore_nulls=...)`
- `coalesce`
- `format`
- `linear_space` (✨, see feat(expr-ir): Add `linear_space` #3349)

Show `Expr` methods

- `ceil`
- `clip`, `clip_lower`, `clip_upper`
- `drop_nulls`
- `exp`
- `fill_nan`
- `fill_null`
- `fill_null_with_strategy` (💾)
  - `fill_null(strategy=..., limit=...)` can raise `ArrowIndexError` #3327
- `floor`
- `gather_every` (deprecated)
- `hist` (✨)
  - `numpy` usage avoided
- `is_{duplicated,unique}` (💾, ✨)
  - `over(*partition_by)`
- `is_not_null` (✨)
- `is_not_nan` (✨)
- `{kurtosis,skew}` (💾, ✨)
  - `over(*partition_by)` and `group_by`
- `rolling_expr` (💾, cc francesco)
  - `rolling_sum`
  - `rolling_mean`
  - `rolling_var`
  - `rolling_std`
- `log`
- `map_batches(is_elementwise=..., returns_scalar=...)` (✨)
- `mode(keep: ModeKeepStrategy)` (💾)
- `replace_strict`
  - `replace_strict(default=...)`
- `round`
- `sample` (deprecated)
  - `numpy` when `with_replacement=True`
- `sqrt`
- `unique`

Show `Expr*Namespace` methods

- `cat.get_categories` (💾)
- `list.contains` (✨)
  - `Expr`
- `list.get`
- `list.join` (✨)
- `list.len`
- `list.unique` (✨)
- `str.contains`
- `str.len_chars`
- `str.to_{upper,lower,title}case`
- `str.replace(value: IntoExpr)` (✨)
- `str.replace_all(value: IntoExpr)` (✨)
- `str.slice`
  - `{head,tail}` are sugar at narwhals-level
- `str.split`
- `str.{starts,ends}_with` (💾)
- `str.strip_chars`
- `str.zfill` (💾)
- `struct.field`

Show `Series` methods

*Not a complete list*, some misc others were added since they were used in the test suite on `main`:

- `cum_*`
- `explode` (✨)
- `fill_null(_with_strategy)`
- `gather_every`
- `rolling_*` (💾, which led to)
  - `diff(n=...)` (✨)
  - `shift(fill_value=...)` (✨)
- `sample` (💾)
- `zip_with`

Show `Series*Namespace` methods

Choosing to leave most of these out for now; only adding things that are unique to `Series` and/or are depended on by the current implementation:

- `struct.field`
- `struct.unnest` (✨, used in `Series.hist`)
- `struct.schema` (✨, used in `unnest`)

Show `DataFrame` methods

- `explode` (✨)
  - `{DataFrame,Series}.explode(empty_as_nulls, keep_nulls)` #3347
- `to_struct` (✨)
- `from_dict`
- `gather_every`
- `iter_columns`
- `sample`

Show internal implementations
While working on the above, some functions I found easier to express when composed of other parts of the `polars` API - which `narwhals` doesn't support yet.
In some cases I factored out their usage - but it was a fun experiment

- `str.find` (44a9d1e)
- `str.pad_start`
- `str.splitn`
- `str.join`
- `implode`
- `eq_missing`

Show missing vs `main`

- `Expr.{head,tail}` (deprecated)
- `is_close`
- `Expr.str.to_date(time)`
- `Expr.dt` (whole namespace)

A general theme this all follows is that implementations of functions go in `<backend>.functions.py`.
They can then be used by `Series`, `Expr` (and `Scalar`) directly or when composing other functions - without `narwhals` wrapper overhead

What's next?
I've been fighting the urge to rewrite how this version of `CompliantExpr` works, pretty hard throughout this PR.
Most of what is in (https://github.com/narwhals-dev/narwhals/blob/e68d9ab9b12562848602e7a0d2f7baf80bc0576a/narwhals/_plan/arrow/expr.py) is general visitor logic which would be a slog to repeat in every backend.

It works and I'm finding it easy to reason about, but I 100% plan to put in some more design work.
Just needed to endure the pain of doing it the long way, and get a feel for where things work and where they don't 🙂

`LogicalPlan` is the next (likely) big item I have my eye on
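
For readers skimming the description, here is a rough sketch of the `<backend>.functions.py` theme mentioned earlier: plain functions on native data that both an eager `Series` and a lazy `Expr` call directly. It is not the code in this PR. Everything in it is hypothetical and stdlib-only: a plain `list` stands in for the native array type, and the `Series`/`Expr` classes, the `steps` field, and `evaluate` are made-up names for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from typing import Callable

# Assumption: a plain list stands in for a backend-native array.
Native = list


# --- what a hypothetical `<backend>/functions.py` might hold ---------------
def ceil(native: Native) -> Native:
    """Round each non-null value up, passing nulls (None) through."""
    return [None if v is None else math.ceil(v) for v in native]


def fill_null(native: Native, value: float) -> Native:
    """Replace nulls (None) with `value`."""
    return [value if v is None else v for v in native]


# --- thin wrappers that reuse the same functions ---------------------------
@dataclass(frozen=True)
class Series:
    native: Native

    # Inside these methods, bare `ceil`/`fill_null` are the module-level functions.
    def ceil(self) -> Series:
        return Series(ceil(self.native))

    def fill_null(self, value: float) -> Series:
        return Series(fill_null(self.native, value))


@dataclass(frozen=True)
class Expr:
    column: str
    steps: tuple[Callable[[Native], Native], ...] = ()

    def ceil(self) -> Expr:
        return Expr(self.column, (*self.steps, ceil))

    def fill_null(self, value: float) -> Expr:
        return Expr(self.column, (*self.steps, lambda n: fill_null(n, value)))

    def evaluate(self, frame: dict[str, Native]) -> Native:
        """Apply the recorded steps to the named column, in order."""
        native = frame[self.column]
        for step in self.steps:
            native = step(native)
        return native


frame = {"a": [1.2, None, 2.5]}
eager = Series(frame["a"]).fill_null(0.0).ceil().native
lazy = Expr("a").fill_null(0.0).ceil().evaluate(frame)
print(eager, lazy)
```

Both paths end up calling the same two functions, so neither wrapper has to go through the other to get its result.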