feat: unify exception for nested over statements, nested aggregations, filtrations on aggregations, document expression metadata better#2351
Conversation
eb6ec10 to
8186f0f
Compare
| expr_methods = [ | ||
| i | ||
| for i in nw.Expr(lambda: 0, ExprMetadata.simple_selector()).__dir__() | ||
| for i in nw.Expr(lambda: 0, ExprMetadata.selector_single()).__dir__() |
There was a problem hiding this comment.
I'm not sure what the lambda: 0, part is for here - but pyright is shouting at all 4 of these
Argument of type "() -> Literal[0]" cannot be assigned to parameter "to_compliant_expr" of type "_ToCompliant" in function "__init__"
Type "() -> Literal[0]" is not assignable to type "_ToCompliant"
Function accepts too many positional parameters; expected 0 but received 1
Function return type "Literal[0]" is incompatible with type "CompliantExpr[Any, Any]"
"Literal[0]" is incompatible with protocol "CompliantExpr[Any, Any]"
"_implementation" is not present
"_backend_version" is not present
"_version" is not present
"_evaluate_output_names" is not present
...There was a problem hiding this comment.
this is just to get the list of methods which exist of Expr, it's an internal script
11a6bbc to
d44e7c5
Compare
|
|
|
Is there overlap with the rules defined here and what we plan to allow in #2325? I was thinking about (#2225 (comment)) again and just wanted to double-check things are aligned (if they need to be) |
dangotbanned
left a comment
There was a problem hiding this comment.
Really great stuff @MarcoGorelli 🎉
Left a few notes/suggestions - also maybe consider renaming the PR to something shorter 😉
docs/how_it_works.md
Outdated
| print(nw.col("a").mean().over("b")) | ||
| ``` | ||
|
|
||
| Note how they tell us something but their metadata. This section is all about |
There was a problem hiding this comment.
Note how they tell us something but their metadata
Is this a typo?
docs/how_it_works.md
Outdated
| aggregate (e.g. `nw.col('a').drop_nulls()`). | ||
| - `FILTRATION`: expressions which change length but don't |
There was a problem hiding this comment.
| aggregate (e.g. `nw.col('a').drop_nulls()`). | |
| - `FILTRATION`: expressions which change length but don't | |
| - `FILTRATION`: expressions which change length but don't | |
| aggregate (e.g. `nw.col('a').drop_nulls()`). |
docs/how_it_works.md
Outdated
| - `nw.col('a')` is not order-dependent, so it's `TRANSFORM`. | ||
| - `nw.col('a').abs()` is not order-dependent, so it's a `TRANSFORM`. | ||
| - `nw.col('a').cum_sum()`'s last operation is `cum_sum`, so it's `WINDOW`. | ||
| - `nw.col('a').cum_sum()+1`'s last operation is `__add__`, and it preserves |
There was a problem hiding this comment.
| - `nw.col('a').cum_sum()+1`'s last operation is `__add__`, and it preserves | |
| - `nw.col('a').cum_sum() + 1`'s last operation is `__add__`, and it preserves |
| How these change depends on the operation. | ||
|
|
||
| #### Chaining | ||
|
|
||
| Say we have `expr.expr_method()`. How does `expr`'s `ExprMetadata` change? | ||
| This depends on `expr_method`. |
There was a problem hiding this comment.
I'm not quite sure what to suggest, but some notes
- Maybe the sentence before Chaining is redundant?
- After Chaining I think there's a few too many
expr/Expr😵
docs/how_it_works.md
Outdated
| - `drop_nulls` and `filter` result in `FILTRATION`, and can only be applied | ||
| to `TRANSFORM` and `WINDOW`. | ||
| - `over` always results in `TRANSFORM`. This is a bit more complicated, | ||
| so we elaborate on it in the "You open a window" section below. |
There was a problem hiding this comment.
| so we elaborate on it in the "You open a window" section below. | |
| so we elaborate on it in ["You open a window ..."](#you-open-a-window-to-another-window-to-another-window-to-another-window). |
| #### Binary operations (e.g. `nw.col('a') + nw.col('b')`) | ||
|
|
||
| How do expression kinds change under binary operations? For example, | ||
| if we do `expr1 + expr2`, then what can we say about the output kind? | ||
| The rules are: | ||
|
|
||
| - If both are `LITERAL`, then the output is `LITERAL`. | ||
| - If one is a `FILTRATION`, then: | ||
|
|
||
| - if the other is `LITERAL` or `AGGREGATION`, then the output is `FILTRATION`. | ||
| - else, we raise an error. | ||
|
|
||
| - If one is `TRANSFORM` or `WINDOW` and the other is not `FILTRATION`, | ||
| then the output is `TRANSFORM`. | ||
| - If one is `AGGREGATION` and the other is `LITERAL` or `AGGREGATION`, | ||
| the output is `AGGREGATION`. | ||
|
|
||
| For n-ary operations such as `nw.sum_horizontal`, the above logic is | ||
| extended across inputs. For example, `nw.sum_horizontal(expr1, expr2, expr3)` | ||
| is `LITERAL` if all of `expr1`, `expr2`, and `expr3` are. |
There was a problem hiding this comment.
Don't worry about it for this PR, but I feel like this could really benefit from some kind of visual aid.
Not sure what format would make the most sense - just wanted to note it as tough to keep track of in my head 😔
docs/how_it_works.md
Outdated
| ### "You open a window to another window to another window to another window" | ||
|
|
||
| When we print out an expression, in addition to the expression kind, | ||
| we also see `window_kind`. These are four window kinds: |
There was a problem hiding this comment.
| we also see `window_kind`. These are four window kinds: | |
| we also see `window_kind`. There are four window kinds: |
narwhals/_expression_parsing.py
Outdated
| - FILTRATION vs (LITERAL | AGGREGATION) -> FILTRATION | ||
| - FILTRATION vs (FILTRATION | TRANSFORM | WINDOW) -> raise | ||
| - (TRANSFORM | WINDOW) vs (LITERAL | AGGREGATION) -> TRANSFORM | ||
| - (TRANSFORM | WINDOW) vs (anything) -> TRANSFORM |
There was a problem hiding this comment.
| - (TRANSFORM | WINDOW) vs (anything) -> TRANSFORM | |
| - (TRANSFORM | WINDOW) vs (...) -> TRANSFORM |
|
thanks for your review! really appreciate it |
These have been popping up and bugging me for 2 weeks 🙈 #2351 (comment)
* chore(typing): fix `check_docstrings` `Path` warning `pyright` did a good job spotting that one * chore: Use `dir(...)` instead of `.__dir__()` Fixes 2/6 errors and shrinks the script * chore(typing): Fix 2 more errors These have been popping up and bugging me for 2 weeks 🙈 #2351 (comment) * chore(typing): Fix the last 2 as well * chore: Simplify the rest * chore(typing): Ignore remaining in `utils` * chore(typing): Ignore everything in `tpch` These are all the same issue, can be fixed with a constrained `TypeVar` * fix: pre-`3.13` compat? Following where `__name__` used to be sourced from - python/cpython#101876 * chore(typing): Fix some of the new `tpch` ignores The rest seem pretty unavoidable * fix: don't clobber `typing` 🤦♂️ https://github.com/narwhals-dev/narwhals/actions/runs/14624570842/job/41032838053?pr=2426
What type of PR is this? (check all applicable)
Related issues
Checklist
If you have comments or can explain your changes, please do so below