Conversation
Needed for `__init__` support
- Dropped `native` as it got too complex - I've done this fake `Any` thing to make `mypy` understand in multiple places
- Having a hard time working out what is going on here - All I've changed is what the refs are named
Much happier with this than the `pandas` one
- Shorted in each backend - Added docs - Accounts for `dask` deviation from `str`
- More performant to compile a single pattern and reuse everywhere - Gives a name to a common op - Backend code is shorter
Keeps this part of `pandas` in sync, despite it doing some extra name stuff
- `_duckdb` and `_spark_like` don't need these parts (only `_dask` does) - Also avoids needing a `TypeVar` default, which caused some issues in https://github.com/narwhals-dev/narwhals/actions/runs/13970848097/job/39111959826?pr=2252
- These two are almost identical - Trying to reduce them as much as possible, before moving the common parts to `nw._compliant.LazyGroupBy`
- Greatly simplifies what each backend needs to implement - Avoids creating and combining intermediate lists - Avoids performing a `not in exclude` check, where `exclude` is empty - Identified and documented a new common method `CompliantExpr._is_multi_output_agg`
- Will need to make (inavriant) `CompliantExprT` to type the second part correctly - Believe that is also causing this weird error ``` Argument of type "CompliantExprT_contra@CompliantDataFrame" cannot be assigned to parameter "exprs" of type "CompliantExprT_contra@CompliantDataFrame" in function "select" Type "CompliantExprT_contra@CompliantDataFrame" is not assignable to type "CompliantExprT_contra@CompliantDataFrame" Pylance(reportArgumentType) ```
| else output_names | ||
| ) | ||
| native_exprs = expr(self.compliant) | ||
| if expr._is_multi_output_agg(): |
There was a problem hiding this comment.
https://github.com/narwhals-dev/narwhals/pull/2246/files kinda ties in here...i'll update that later, i like what you've done here anyway
There was a problem hiding this comment.
Ah nice we were pretty close to landing on the same name
There was a problem hiding this comment.
@MarcoGorelli another one that might be cleaner as a method:
is_elementary_expression -> CompliantExpr._is_elementary
There was a problem hiding this comment.
I can't tell if you mean the same thing as elementwise or just a fancy way of saying simple
There was a problem hiding this comment.
by 'elementwise' i mean something that operates on each row independently of all the other rows. something like Expr.abs
'elementary' i just meant that it only does a single operation - so, "fancy way of saying simple" seems accurate 😄
| agg_columns = list(chain(self._keys, self._evaluate_exprs(exprs))) | ||
| return self.compliant._from_native_frame( | ||
| self.compliant.native.aggregate(agg_columns) # type: ignore[arg-type] | ||
| ) |
There was a problem hiding this comment.
cool, this now looks super-simple!
| keys: list[str], | ||
| drop_null_keys: bool, # noqa: FBT001 | ||
| df: SparkLikeLazyFrame, | ||
| keys: Sequence[str], |
There was a problem hiding this comment.
maybe we should call these by, for consistency with polars? ok to do separately anyway
There was a problem hiding this comment.
Yeah, no objections from me
I just picked the name used in most of the implementations
|
Thanks for such a quick review @MarcoGorelli |

What type of PR is this? (check all applicable)
Related issues
Compliant*protocols #2230_FullContexttoCompliantDataFrame#2251Checklist
If you have comments or can explain your changes, please do so below
ArrowGroupByDaskLazyGroupByPandasLikeGroupByLazyGroupByDuckDBGroupBySparkLikeLazyGroupBy