Merged
Changes from 9 commits
86 commits
a7f2559
refactor: Move errors, warnings outside of `PandasLikeGroupBy.agg`
dangotbanned Jun 15, 2025
c61e09e
refactor: Split out complex, add some typing
dangotbanned Jun 15, 2025
c530b85
refactor: Split out dupe code to `_select_results`
dangotbanned Jun 15, 2025
294c6de
docs: Note the current complexity counts
dangotbanned Jun 15, 2025
745eccd
cov
dangotbanned Jun 15, 2025
d9cfa76
feat(typing): Add `NativeAggregation` literal
dangotbanned Jun 15, 2025
839cfb6
chore(typing): "Add typing" for the rest
dangotbanned Jun 15, 2025
13d900c
Experimenting with named agg style
dangotbanned Jun 15, 2025
6707d2d
refactor: Nice generalized version that "works"
dangotbanned Jun 15, 2025
d941e74
Mostly clean slate re-impl
dangotbanned Jun 16, 2025
f222cac
fix: Resolve 15/21 failures
dangotbanned Jun 16, 2025
a62e58f
note the remaining issues
dangotbanned Jun 16, 2025
6951889
test: Update to use `DuplicateError`
dangotbanned Jun 16, 2025
e313f2b
refactor: Remove dead code
dangotbanned Jun 16, 2025
a61dbbb
revert: Remove outdated complexity counts
dangotbanned Jun 16, 2025
c0c142c
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jun 16, 2025
b2feb1e
chore: Remove more outdated
dangotbanned Jun 16, 2025
b50456d
refactor: Move functions into `PandasLikeGroupBy`
dangotbanned Jun 16, 2025
12ae9af
fix: Use alias instead of function name
dangotbanned Jun 16, 2025
f24651d
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 16, 2025
b420085
refactor: Don't store two `PandasLikeDataFrame`
dangotbanned Jun 16, 2025
5965a48
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 16, 2025
3a905a6
fix(DRAFT): Slap on non `str` support
dangotbanned Jun 17, 2025
00b303a
Merge branch 'simp-pandas-group-by' of https://github.com/narwhals-de…
dangotbanned Jun 17, 2025
f9b0cc5
fix(typing): Resolve intermittent variance issue
dangotbanned Jun 17, 2025
732fdc9
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 17, 2025
eea8e64
fix: Don't use `methodcaller` w/ modin
dangotbanned Jun 17, 2025
d78d541
cov ignore
dangotbanned Jun 17, 2025
a2f4f48
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jun 17, 2025
243f6cf
test: Add (failing) case
dangotbanned Jun 17, 2025
80cfb1b
fix: Everything except `modin[pyarrow]`
dangotbanned Jun 17, 2025
8fe49c7
test: xfail modin, dask
dangotbanned Jun 17, 2025
2317ff4
test: Add failing `plotly` repro
dangotbanned Jun 19, 2025
07c093e
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 19, 2025
82c31ba
test: Rewrite as minimal repro
dangotbanned Jun 19, 2025
6fd3b4c
fix: Cast `string[pyarrow]` back to int
dangotbanned Jun 19, 2025
2e60863
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jun 19, 2025
508e604
refactor: Use `simple_select` instead of `select_columns_by_name`
dangotbanned Jun 19, 2025
87dc041
refactor: Shorten some paths
dangotbanned Jun 19, 2025
3c6eb58
🎨
dangotbanned Jun 19, 2025
b1f65bd
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 20, 2025
2f157df
test: Widen to `test_group_by_no_preserve_dtype`
dangotbanned Jun 20, 2025
46e5ae0
fix: Address all non-int dtypes
dangotbanned Jun 20, 2025
af04570
test: Skip `Decimal` for polars when unsupported
dangotbanned Jun 20, 2025
5a540ff
fix: Handle old pandas float
dangotbanned Jun 20, 2025
4944e38
refactor: Make `exclude` a property
dangotbanned Jun 20, 2025
c2df420
refactor: Move all state into `AggExpr`, docs
dangotbanned Jun 20, 2025
26539c5
revert: Remove unused caching
dangotbanned Jun 20, 2025
7bb0d0d
fix: Try higher min pandas
dangotbanned Jun 20, 2025
4628f2e
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 21, 2025
2683b58
refactor: Renaming, make `native_agg` a method
dangotbanned Jun 21, 2025
bff83c8
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 21, 2025
83c591e
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jun 24, 2025
e0984ae
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 24, 2025
c29c8fc
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 25, 2025
b194e10
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 25, 2025
4cc6543
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 27, 2025
58d044c
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 28, 2025
b59c774
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 28, 2025
f116ada
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 28, 2025
cf5285c
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jun 30, 2025
f5f2798
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jul 3, 2025
3065418
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jul 3, 2025
c8cbd78
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jul 5, 2025
de7347e
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jul 6, 2025
b7e7b37
refactor: Use `list.copy` on keys
dangotbanned Jul 8, 2025
da7ea98
perf: Collect `dtypes` outside of the loop
dangotbanned Jul 8, 2025
532d6b3
refactor: `_agg_complex` -> `apply_aggs`, `_apply_exprs` -> `_apply_e…
dangotbanned Jul 8, 2025
502bcdc
Merge remote-tracking branch 'camriddell/main' into simp-pandas-group-by
dangotbanned Jul 8, 2025
c52bf66
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jul 9, 2025
5fd4160
fix(typing): Bump stubs, doc new ignore `include_groups`
dangotbanned Jul 9, 2025
cfeac25
perf: Use `rename` instead of `with_columns`
dangotbanned Jul 9, 2025
24eb873
try `rename` with `copy=True`
dangotbanned Jul 9, 2025
46282e8
revert: try rename with copy=True
dangotbanned Jul 9, 2025
9fe1642
test: Add failing repro
dangotbanned Jul 10, 2025
9eb672e
fix(DRAFT): Are we passing yet? 🙏
dangotbanned Jul 10, 2025
53eab03
chore(typing): ignore `apply` overload
dangotbanned Jul 10, 2025
cc7a9a0
Merge remote-tracking branch 'upstream/main' into simp-pandas-group-by
dangotbanned Jul 10, 2025
44ba496
re-undo undoing ignore 😭
dangotbanned Jul 10, 2025
60461d9
Merge branch 'main' into simp-pandas-group-by
FBruzzesi Jul 11, 2025
91b5800
refactor: Switch back from `.agg(**named_aggs)` to `__getitem__`
dangotbanned Jul 11, 2025
c053971
Merge branch 'simp-pandas-group-by' of https://github.com/narwhals-de…
dangotbanned Jul 11, 2025
5899103
fix: `pandas` nightly boolean columns
dangotbanned Jul 11, 2025
13fdb47
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jul 12, 2025
e93044e
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jul 14, 2025
1672513
Merge branch 'main' into simp-pandas-group-by
dangotbanned Jul 14, 2025
253 changes: 184 additions & 69 deletions narwhals/_pandas_like/group_by.py
@@ -2,23 +2,104 @@

import collections
import warnings
from typing import TYPE_CHECKING, Any, ClassVar
from functools import lru_cache
from itertools import chain
from operator import methodcaller
from typing import TYPE_CHECKING, Any, ClassVar, Literal

from narwhals._compliant import EagerGroupBy
from narwhals._expression_parsing import evaluate_output_names_and_aliases
from narwhals._pandas_like.utils import select_columns_by_name
from narwhals._utils import find_stacklevel

if TYPE_CHECKING:
from collections.abc import Iterator, Mapping, Sequence
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence

import pandas as pd
from pandas.api.typing import DataFrameGroupBy as _NativeGroupBy
from typing_extensions import TypeAlias, Unpack

from narwhals._compliant.group_by import NarwhalsAggregation
from narwhals._compliant.typing import ScalarKwargs
from narwhals._pandas_like.dataframe import PandasLikeDataFrame
from narwhals._pandas_like.expr import PandasLikeExpr

NativeGroupBy: TypeAlias = "_NativeGroupBy[tuple[str, ...], Literal[True]]"

NativeApply: TypeAlias = "Callable[[pd.DataFrame], pd.Series[Any]]"
InefficientNativeAggregation: TypeAlias = Literal["cov", "skew"]
NativeAggregation: TypeAlias = Literal[
"any",
"all",
"count",
"first",
"idxmax",
"idxmin",
"last",
"max",
"mean",
"median",
"min",
"nunique",
"prod",
"quantile",
"sem",
"size",
"std",
"sum",
"var",
InefficientNativeAggregation,
]
"""https://pandas.pydata.org/pandas-docs/stable/user_guide/groupby.html#built-in-aggregation-methods"""

_AggFunc: TypeAlias = "NativeAggregation | Callable[..., Any]"
"""Equivalent to `pd.NamedAgg.aggfunc`."""

_NamedAgg: TypeAlias = "tuple[str, _AggFunc]"
"""Equivalent to `pd.NamedAgg`."""


@lru_cache(maxsize=32)
def _agg_func(
name: NativeAggregation, /, **kwds: Unpack[ScalarKwargs]
) -> _AggFunc: # pragma: no cover
if name == "nunique":
return methodcaller(name, dropna=False)
if not kwds or kwds.get("ddof") == 1:
return name
return methodcaller(name, **kwds)


def _named_aggs(
gb: PandasLikeGroupBy, /, expr: PandasLikeExpr, exclude: Sequence[str]
) -> Iterator[tuple[str, _NamedAgg]]: # pragma: no cover
output_names, aliases = evaluate_output_names_and_aliases(expr, gb.compliant, exclude)
function_name = gb._remap_expr_name(gb._leaf_name(expr))
aggfunc = _agg_func(function_name, **expr._scalar_kwargs)
for output_name, alias in zip(output_names, aliases):
yield alias, (output_name, aggfunc)


def named_aggs(
gb: PandasLikeGroupBy, *exprs: PandasLikeExpr, exclude: Sequence[str]
) -> dict[str, _NamedAgg]: # pragma: no cover
"""**Very early draft** for named agg-like input.

class PandasLikeGroupBy(EagerGroupBy["PandasLikeDataFrame", "PandasLikeExpr", str]):
_REMAP_AGGS: ClassVar[Mapping[NarwhalsAggregation, Any]] = {
Ignoring most special-casing for now, just trying to work out the right shape.

The idea would be using this like:

df.groupby(...).agg(**named_aggs(..., ..., exclude=...)).reset_index()

Looks entirely different to the current `PandasLikeGroupBy` 🤔
"""
return dict(chain.from_iterable(_named_aggs(gb, expr, exclude) for expr in exprs))


class PandasLikeGroupBy(
EagerGroupBy["PandasLikeDataFrame", "PandasLikeExpr", NativeAggregation]
):
_REMAP_AGGS: ClassVar[Mapping[NarwhalsAggregation, NativeAggregation]] = {
"sum": "sum",
"mean": "mean",
"median": "median",
@@ -51,17 +132,21 @@ def __init__(
else:
native_frame = self.compliant.native

self._grouped = native_frame.groupby(
self._grouped: NativeGroupBy = native_frame.groupby(
list(self._keys),
sort=False,
as_index=True,
dropna=drop_null_keys,
observed=True,
)

# NOTE: Still have *quite* a bit of work to do here!
# -------------------------------------------------------
# NOTE: `C901` Too complex (25 > 10)
# NOTE: `PLR0912` Too many branches (28 > 12)
# NOTE: `PLR0914` Too many local variables (27 > 15)
# NOTE: `PLR0915` Too many statements (83 > 50)
def agg(self, *exprs: PandasLikeExpr) -> PandasLikeDataFrame: # noqa: C901, PLR0912, PLR0914, PLR0915
implementation = self.compliant._implementation
backend_version = self.compliant._backend_version
new_names: list[str] = self._keys.copy()

all_aggs_are_simple = True
@@ -76,8 +161,8 @@ def agg(self, *exprs: PandasLikeExpr) -> PandasLikeDataFrame: # noqa: C901, PLR
# We need to do this separately from the rest so that we
# can pass the `dropna` kwargs.
nunique_aggs: dict[str, str] = {}
simple_aggs: dict[str, list[str]] = collections.defaultdict(list)
simple_aggs_functions: set[str] = set()
simple_aggs: dict[str, list[NativeAggregation]] = collections.defaultdict(list)
simple_aggs_functions: set[NativeAggregation] = set()

# ddof to (output_names, aliases) mapping
std_aggs: dict[int, tuple[list[str], list[str]]] = collections.defaultdict(
@@ -126,10 +211,11 @@ def agg(self, *exprs: PandasLikeExpr) -> PandasLikeDataFrame: # noqa: C901, PLR
simple_agg_new_names.append(alias)
simple_aggs_functions.add(function_name)

result_aggs = []
result_aggs: list[pd.DataFrame] = []

if simple_aggs:
# Fast path for single aggregation such as `df.groupby(...).mean()`
result_simple_aggs: pd.DataFrame
if (
len(simple_aggs_functions) == 1
and (agg_method := simple_aggs_functions.pop()) != "size"
@@ -138,24 +224,22 @@ def agg(self, *exprs: PandasLikeExpr) -> PandasLikeDataFrame: # noqa: C901, PLR
result_simple_aggs = getattr(
self._grouped[list(simple_aggs.keys())], agg_method
)()
result_simple_aggs.columns = [
result_simple_aggs.columns = [ # type: ignore[assignment]
f"{a}_{agg_method}" for a in result_simple_aggs.columns
]
else:
result_simple_aggs = self._grouped.agg(simple_aggs)
result_simple_aggs.columns = [
f"{a}_{b}" for a, b in result_simple_aggs.columns
result_simple_aggs = self._grouped.agg(simple_aggs) # type: ignore[arg-type]
result_simple_aggs.columns = [ # type: ignore[assignment,misc]
f"{a}_{b}" # type: ignore[has-type]
for a, b in result_simple_aggs.columns
]
if not (
set(result_simple_aggs.columns) == set(expected_old_names)
and len(result_simple_aggs.columns) == len(expected_old_names)
): # pragma: no cover
msg = (
f"Safety assertion failed, expected {expected_old_names} "
f"got {result_simple_aggs.columns}, "
"please report a bug at https://github.com/narwhals-dev/narwhals/issues"
raise safety_assertion_error(
expected_old_names, result_simple_aggs.columns
)
raise AssertionError(msg)

# Rename columns, being very careful
expected_old_names_indices: dict[str, list[int]] = (
@@ -167,30 +251,31 @@ def agg(self, *exprs: PandasLikeExpr) -> PandasLikeDataFrame: # noqa: C901, PLR
expected_old_names_indices[item].pop(0)
for item in result_simple_aggs.columns
]
result_simple_aggs.columns = [simple_agg_new_names[i] for i in index_map]
result_simple_aggs.columns = [simple_agg_new_names[i] for i in index_map] # type: ignore[assignment]
result_aggs.append(result_simple_aggs)

if nunique_aggs:
result_nunique_aggs = self._grouped[list(nunique_aggs.values())].nunique(
dropna=False
)
result_nunique_aggs.columns = list(nunique_aggs.keys())
result_nunique_aggs.columns = list(nunique_aggs.keys()) # type: ignore[assignment]

result_aggs.append(result_nunique_aggs)

if std_aggs:
for ddof, (std_output_names, std_aliases) in std_aggs.items():
_aggregation = self._grouped[std_output_names].std(ddof=ddof)
# `_aggregation` is a new object so it's OK to operate inplace.
_aggregation.columns = std_aliases
_aggregation.columns = std_aliases # type: ignore[assignment]
result_aggs.append(_aggregation)
if var_aggs:
for ddof, (var_output_names, var_aliases) in var_aggs.items():
_aggregation = self._grouped[var_output_names].var(ddof=ddof)
# `_aggregation` is a new object so it's OK to operate inplace.
_aggregation.columns = var_aliases
_aggregation.columns = var_aliases # type: ignore[assignment]
result_aggs.append(_aggregation)

result: pd.DataFrame
if result_aggs:
output_names_counter = collections.Counter(
c for frame in result_aggs for c in frame
@@ -211,61 +296,53 @@ def agg(self, *exprs: PandasLikeExpr) -> PandasLikeDataFrame: # noqa: C901, PLR
result = self.compliant.__native_namespace__().DataFrame(
list(self._grouped.groups.keys()), columns=self._keys
)
# Keep inplace=True to avoid making a redundant copy.
# This may need updating, depending on https://github.com/pandas-dev/pandas/pull/51466/files
result.reset_index(inplace=True) # noqa: PD002
return self.compliant._with_native(
select_columns_by_name(result, new_names, backend_version, implementation)
).rename(dict(zip(self._keys, self._output_key_names)))
return self._select_results(result, new_names)

if self.compliant.native.empty:
# Don't even attempt this, it's way too inconsistent across pandas versions.
msg = (
"No results for group-by aggregation.\n\n"
"Hint: you were probably trying to apply a non-elementary aggregation with a "
"pandas-like API.\n"
"Please rewrite your query such that group-by aggregations "
"are elementary. For example, instead of:\n\n"
" df.group_by('a').agg(nw.col('b').round(2).mean())\n\n"
"use:\n\n"
" df.with_columns(nw.col('b').round(2)).group_by('a').agg(nw.col('b').mean())\n\n"
)
raise ValueError(msg)

warnings.warn(
"Found complex group-by expression, which can't be expressed efficiently with the "
"pandas API. If you can, please rewrite your query such that group-by aggregations "
"are simple (e.g. mean, std, min, max, ...). \n\n"
"Please see: "
"https://narwhals-dev.github.io/narwhals/concepts/improve_group_by_operation/",
UserWarning,
stacklevel=find_stacklevel(),
raise empty_results_error()
return self._agg_complex(exprs, new_names)

def _select_results(
self, df: pd.DataFrame, /, new_names: list[str]
) -> PandasLikeDataFrame:
compliant = self.compliant
# NOTE: Keep `inplace=True` to avoid making a redundant copy.
# This may need updating, depending on https://github.com/pandas-dev/pandas/pull/51466/files
df.reset_index(inplace=True) # noqa: PD002
native = select_columns_by_name(
df, new_names, compliant._backend_version, compliant._implementation
)
rename = dict(zip(self._keys, self._output_key_names))
return compliant._with_native(native).rename(rename)

def func(df: Any) -> Any:
def _agg_complex(
self, exprs: Iterable[PandasLikeExpr], new_names: list[str]
) -> PandasLikeDataFrame:
warn_complex_group_by()
implementation = self.compliant._implementation
backend_version = self.compliant._backend_version
func = self._apply_exprs(exprs)
if implementation.is_pandas() and backend_version >= (2, 2):
result = self._grouped.apply(func, include_groups=False)
else: # pragma: no cover
result = self._grouped.apply(func)
return self._select_results(result, new_names)

def _apply_exprs(self, exprs: Iterable[PandasLikeExpr]) -> NativeApply:
ns = self.compliant.__narwhals_namespace__()
into_series = ns._series.from_iterable

def fn(df: pd.DataFrame) -> pd.Series[Any]:
out_group = []
out_names = []
for expr in exprs:
results_keys = expr(self.compliant._with_native(df))
Member:

Can we move `self.compliant._with_native(df)` outside the for loop?

Member Author:

Probably yeah

I hadn't changed much on this part, I think it was just moving those two lines outside of the function?

main

    def func(df: Any) -> Any:
        out_group = []
        out_names = []
        for expr in exprs:
            results_keys = expr(self.compliant._with_native(df))
            for result_keys in results_keys:
                out_group.append(result_keys.native.iloc[0])
                out_names.append(result_keys.name)
        ns = self.compliant.__narwhals_namespace__()
        return ns._series.from_iterable(out_group, index=out_names, context=ns).native

(#2680)

    def _apply_exprs(self, exprs: Iterable[PandasLikeExpr]) -> NativeApply:
        ns = self.compliant.__narwhals_namespace__()
        into_series = ns._series.from_iterable

        def fn(df: pd.DataFrame) -> pd.Series[Any]:
            out_group = []
            out_names = []
            for expr in exprs:
                results_keys = expr(self.compliant._with_native(df))
                for keys in results_keys:
                    out_group.append(keys.native.iloc[0])
                    out_names.append(keys.name)
            return into_series(out_group, index=out_names, context=ns).native

        return fn

Member Author:

Just saying that I agree, but recommend doing as a follow-up
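
For reference, a minimal sketch of the hoisting discussed in this thread. It deliberately avoids narwhals internals: `Wrapped`, `Column`, `mean_of`, `max_of` and `apply_exprs` are hypothetical stand-ins invented for illustration, and plain dicts play the role of the per-group native frames. The only point it tries to show is that the wrapper can be built once per group rather than once per expression.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass
class Column:
    """Hypothetical stand-in for one named result of an expression."""

    name: str
    values: list[Any]


class Wrapped:
    """Hypothetical stand-in for the object returned by `_with_native(df)`."""

    created = 0  # counts how many wrappers have been built

    def __init__(self, native: dict[str, list[Any]]) -> None:
        Wrapped.created += 1
        self.native = native


Expr = Callable[[Wrapped], list[Column]]


def apply_exprs(exprs: Iterable[Expr]) -> Callable[[dict[str, list[Any]]], dict[str, Any]]:
    # Materialize once: `fn` runs once per group, so a one-shot iterator
    # would be exhausted after the first group (a defensive assumption here).
    exprs = tuple(exprs)

    def fn(group: dict[str, list[Any]]) -> dict[str, Any]:
        wrapped = Wrapped(group)  # hoisted: one wrapper per group, not per expr
        out: dict[str, Any] = {}
        for expr in exprs:
            for col in expr(wrapped):
                out[col.name] = col.values[0]  # keep the single aggregated value
        return out

    return fn


def mean_of(name: str) -> Expr:
    def expr(frame: Wrapped) -> list[Column]:
        vals = frame.native[name]
        return [Column(f"{name}_mean", [sum(vals) / len(vals)])]

    return expr


def max_of(name: str) -> Expr:
    def expr(frame: Wrapped) -> list[Column]:
        return [Column(f"{name}_max", [max(frame.native[name])])]

    return expr


groups = {1: {"b": [2.0, 4.0]}, 2: {"b": [10.0]}}
fn = apply_exprs([mean_of("b"), max_of("b")])
results = {key: fn(rows) for key, rows in groups.items()}
```

With two groups and two expressions, the wrapper is constructed twice here; building it inside the `for expr in exprs` loop would construct it four times.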

for result_keys in results_keys:
out_group.append(result_keys.native.iloc[0])
out_names.append(result_keys.name)
ns = self.compliant.__narwhals_namespace__()
return ns._series.from_iterable(out_group, index=out_names, context=ns).native

if implementation.is_pandas() and backend_version >= (2, 2):
result_complex = self._grouped.apply(func, include_groups=False)
else: # pragma: no cover
result_complex = self._grouped.apply(func)
for keys in results_keys:
out_group.append(keys.native.iloc[0])
out_names.append(keys.name)
return into_series(out_group, index=out_names, context=ns).native

# Keep inplace=True to avoid making a redundant copy.
# This may need updating, depending on https://github.com/pandas-dev/pandas/pull/51466/files
result_complex.reset_index(inplace=True) # noqa: PD002
return self.compliant._with_native(
select_columns_by_name(
result_complex, new_names, backend_version, implementation
)
).rename(dict(zip(self._keys, self._output_key_names)))
return fn

def __iter__(self) -> Iterator[tuple[Any, PandasLikeDataFrame]]:
with warnings.catch_warnings():
@@ -280,3 +357,41 @@ def __iter__(self) -> Iterator[tuple[Any, PandasLikeDataFrame]]:
key,
self.compliant._with_native(group).simple_select(*self._df.columns),
)


def safety_assertion_error(
old_names: Sequence[str], new_names: Sequence[str] | pd.Index[str]
) -> AssertionError: # pragma: no cover
msg = (
f"Safety assertion failed, expected {old_names} "
f"got {new_names}, "
"please report a bug at https://github.com/narwhals-dev/narwhals/issues"
)
return AssertionError(msg)


def empty_results_error() -> ValueError:
"""Don't even attempt this, it's way too inconsistent across pandas versions."""
msg = (
"No results for group-by aggregation.\n\n"
"Hint: you were probably trying to apply a non-elementary aggregation with a "
"pandas-like API.\n"
"Please rewrite your query such that group-by aggregations "
"are elementary. For example, instead of:\n\n"
" df.group_by('a').agg(nw.col('b').round(2).mean())\n\n"
"use:\n\n"
" df.with_columns(nw.col('b').round(2)).group_by('a').agg(nw.col('b').mean())\n\n"
)
return ValueError(msg)


def warn_complex_group_by() -> None:
warnings.warn(
"Found complex group-by expression, which can't be expressed efficiently with the "
"pandas API. If you can, please rewrite your query such that group-by aggregations "
"are simple (e.g. mean, std, min, max, ...). \n\n"
"Please see: "
"https://narwhals-dev.github.io/narwhals/concepts/improve_group_by_operation/",
UserWarning,
stacklevel=find_stacklevel(),
)
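
The helpers added at the end of this file follow a common pattern: a function builds and returns the exception, and the caller writes `raise helper()`. A minimal standalone sketch of that pattern follows, assuming nothing from narwhals. The names `empty_input_error`, `warn_slow_path` and `aggregate` are hypothetical, and a fixed `stacklevel=2` stands in for `find_stacklevel()`.

```python
from __future__ import annotations

import warnings


def empty_input_error(what: str) -> ValueError:
    """Build (but do not raise) the exception; callers write `raise empty_input_error(...)`."""
    return ValueError(f"No results for {what}.")


def warn_slow_path(what: str) -> None:
    # stacklevel=2 attributes the warning to the caller of `warn_slow_path`
    # (an assumption for this sketch; the diff uses `find_stacklevel()`).
    warnings.warn(f"{what} takes a slow path", UserWarning, stacklevel=2)


def aggregate(values: list[float], *, simple: bool = True) -> float:
    if not values:
        # The `raise` stays at the call site, so the traceback ends here and
        # type checkers can see that this branch never returns.
        raise empty_input_error("aggregation")
    if not simple:
        warn_slow_path("aggregate")
    return sum(values) / len(values)
```

Returning the exception instead of raising inside the helper keeps the long message text out of the main method while leaving control flow visible where it matters.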