-
Notifications
You must be signed in to change notification settings - Fork 172
feat(expr-ir): The big Selector overhaul
#3233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
dangotbanned
merged 104 commits into
expr-ir/over-and-over-and-over-again
from
expr-ir/strict-selectors
Nov 4, 2025
Merged
feat(expr-ir): The big Selector overhaul
#3233
dangotbanned
merged 104 commits into
expr-ir/over-and-over-and-over-again
from
expr-ir/strict-selectors
Nov 4, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
`expressions.selectors` is where most of the changes need to be done
Still have a lot of surgery left when ripping everything else out
bugs-a-plenty
Have quite a few more to port and almost all follow the same format
This is a subset of the actual test suite, but still a very big subset What's here so far has already caught a few bugs, I'm sure there's more to come (sadly?)
Discovered (#3235) in the process
dangotbanned
commented
Oct 23, 2025
I'm sure I've done this at least once before 😭
Knowing that these return a selector requires knowing the special-casing for `col`, but that can't be known statically without making `Expr` generic (not happening)
9 tasks
There are some things we're missing, but they can be de-sugared
25 tests (potentially) in parallel is better than one big honking boi
This was referenced Oct 31, 2025
Resolves (#3233 (comment)), (#3233 (comment)) The only other `DType` selector I'm still on the fence about is `ncs.nested` The others would be nice to have, but maybe lower utility
An option since `DTypeClass` added
2 tasks
dangotbanned
commented
Nov 2, 2025
This also simplifies the compliant-level, since resolving names is part of expansion Heavily based on what `polars` does
Need to rewrite `ArrowExpr.over_ordered` to make use of the simplification Maybe something like loop `order_by` and use `meta.as_selector`?
SelectorsSelector overhaul
d9b9b3f
into
expr-ir/over-and-over-and-over-again
40 checks passed
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Related issues
over(*partition_by)#3224ExprIR #2572Upstream
Selectorexpansion pola-rs/polars#25022pl.col('a', 'b')vspl.nth(0, 1)vspl.all()pola-rs/polars#21773Show Motivation
Adapting the
Selectors implementation to be more like (pola-rs/polars#23351) has been on my mind for a while now:ExprIR #2572 (comment)ExprIR #2572 (comment)ExprIR #2572 (comment)My most recent woe in (#3224) - when trying to allow
ColumnNameOrSelectorin more places - are that (here) not all column selection expressions have a selectors equivalent:col(s)doesn't translate tocs.by_namecs.by_name(..., require_all=True){nth,index_columns}would have the same issuecs.by_indexis missingThat poses a problem for implicitly converting
str(ColumnName*) into aSelector(*OrSelector) under-the-hood.A user can request columns that don't exist and selector-semantics (empty sets are okay in most cases) allow it:
Show tests
narwhals/tests/plan/frame_partition_by_test.py
Lines 120 to 146 in 41d8cc2
Show Tasks
Selector.matches_column(self, name: str, dtype: DType)logic toDTypeSelector.matches(self, dtype: DType)DTypeSelector.into_columnsDTypeSelector.iter_expand{Root,Binary,Invert}Selectorrust)(DataType)Selector::{Union,Difference,ExclusiveOr,Intersect}variantsSelectorSelectorIR.to_dtype_selectorexpr.{_ColumnSelection,Columns,Nth,IndexColumns,All,Exclude}RenameAlias,KeepNamechangesnameopsKeepNameworkingselectors_testpasses!!!!! 🥳🥳🥳🥳🥳BinaryExprexpansion (see thread)ncs.{float,integer,temporal}ncs.{first,last}into_columnsiter_expandcomment (see thread)_plan.selectors.__all__group_byagg expansion withpolars*Framemethodsdrop(9447959), (003972b)drop_nulls(616a5d5)sort*🏆 Highlights
Selectors
Reimplemented
Selectors to align with (pola-rs/polars#23351).TL;DR: we now can use all these guys independent of backend/version:
cs.by_{index,name}(require_all: bool)cs.{array,list}(inner: Selector | None)cs.emptycs.enumcs.firstcs.floatcs.integercs.lastcs.structcs.temporalAll
Selectors (includingnw.nth,nw.all,nw.exclude) andnw.colcan now be used in*Frame-level methods, which so far includes:BaseFrame.sortBaseFrame.dropBaseFrame.drop_nullsDataFrame.partition_by(started in #3224)Note
All of this is opaque to the compliant-level, which continues to receive already-resolved column names
Well-defined Expansion Rules
I ran into some issues relating to (pola-rs/polars#25022), (pola-rs/polars#21773), (narwhals-dev/narwhals#3029)
The precise details of what expands and how can be found in this fancy new class:
Show
Expandernarwhals/narwhals/_plan/_expansion.py
Lines 137 to 352 in 50bcb9c
There are some more examples in (#3233 (comment)) and (#3029 (comment)).
Also, this test features a particularly-odd blend of the new features 😅
narwhals/tests/plan/over_test.py
Lines 230 to 237 in 50bcb9c
Testing
Speaking of tests ...
A LOT of the positive diff (+2013/3125) comes from a major effort to improve test coverage.
I focused on selectors and expression-expansion the most, where many of the tests are adapted directly from the
polarstest suite.Writing those got repetitive quick, so wrapped some bits up in a new testing util and sprinkled in some examples:
Show
Framenarwhals/tests/plan/utils.py
Lines 41 to 144 in 50bcb9c
I'm quite happy with how readable these tests are 🙂
narwhals/tests/plan/selectors_test.py
Lines 639 to 645 in 50bcb9c