Closed
70 commits
523f0ef
add basic aggregation support
rjzamora Feb 6, 2025
4b6f180
roll back change to literal.py
rjzamora Feb 6, 2025
920c361
make get_expr_partition_count more efficient
rjzamora Feb 6, 2025
71bbe53
make get_expr_partition_count more efficient
rjzamora Feb 6, 2025
a6b05a9
fix copyright changes
rjzamora Feb 6, 2025
3092c58
use traversal
rjzamora Feb 6, 2025
c4ed2a6
roll back unnecessary date change
rjzamora Feb 6, 2025
5af267b
move fuse_expr_graph
rjzamora Feb 6, 2025
c13e916
cleanup
rjzamora Feb 6, 2025
663db89
add mean support
rjzamora Feb 6, 2025
062c322
update some comments
rjzamora Feb 6, 2025
5f7d73c
add todo comment
rjzamora Feb 6, 2025
fa64610
Merge remote-tracking branch 'upstream/branch-25.04' into complex-agg…
rjzamora Mar 5, 2025
9690d6e
use replace
rjzamora Mar 5, 2025
ad362b6
avoid passing through options to renamed aggs unless the new options …
rjzamora Mar 5, 2025
1a96b6e
remove unused func
rjzamora Mar 5, 2025
5514889
address partial code review
rjzamora Mar 5, 2025
9b351d9
remove strict=False everywhere
rjzamora Mar 5, 2025
ddce78f
address review in select.py
rjzamora Mar 5, 2025
82f8d10
use NamedExprs to better keep track of output column names
rjzamora Mar 5, 2025
e9c413e
Merge branch 'branch-25.04' into complex-aggregations
rjzamora Mar 7, 2025
a1cb27b
Merge branch 'branch-25.04' into complex-aggregations
rjzamora Mar 10, 2025
07265e7
Merge remote-tracking branch 'upstream/branch-25.04' into complex-agg…
rjzamora Mar 11, 2025
7b1252b
modify coverage
rjzamora Mar 11, 2025
3d83ea6
Merge branch 'branch-25.04' into complex-aggregations
rjzamora Mar 13, 2025
255e2e2
Merge remote-tracking branch 'upstream/branch-25.04' into complex-agg…
rjzamora Mar 14, 2025
d1abd8d
partial code review
rjzamora Mar 14, 2025
0c3828c
remove unused code
rjzamora Mar 14, 2025
fd5b3e1
simplify _replace
rjzamora Mar 14, 2025
27e2184
update names
rjzamora Mar 14, 2025
dd39d4b
Merge branch 'branch-25.04' into complex-aggregations
rjzamora Mar 17, 2025
a21f8de
add n_unique support
rjzamora Mar 18, 2025
84a20ab
Merge remote-tracking branch 'upstream/branch-25.04' into complex-agg…
rjzamora Mar 18, 2025
60a99d9
refactor shuffle component of 'n_unique'
rjzamora Mar 18, 2025
7b7f834
Merge remote-tracking branch 'upstream/branch-25.04' into complex-agg…
rjzamora Mar 18, 2025
5676199
temporarily drop coverage
rjzamora Mar 18, 2025
aab0d83
improve test coverage
rjzamora Mar 19, 2025
b6bce28
Merge remote-tracking branch 'upstream/branch-25.04' into complex-agg…
rjzamora Mar 19, 2025
2802418
Merge remote-tracking branch 'upstream/branch-25.06' into complex-agg…
rjzamora Mar 19, 2025
6229730
type annotations
rjzamora Mar 19, 2025
ea570dc
Merge branch 'branch-25.06' into complex-aggregations
rjzamora Mar 31, 2025
650277e
implement fallback to single-partition
rjzamora Apr 1, 2025
53def54
fix typo
rjzamora Apr 1, 2025
dcee153
Merge branch 'branch-25.06' into single-partition-fallback
rjzamora Apr 1, 2025
a1d2fc4
Merge remote-tracking branch 'upstream/branch-25.06' into single-part…
rjzamora Apr 2, 2025
b52ee0a
make fallback configurable by passing ConfigOptions into lower_ir_graph
rjzamora Apr 2, 2025
6cfdea1
Merge remote-tracking branch 'upstream/branch-25.06' into single-part…
rjzamora Apr 2, 2025
1cc256d
Merge remote-tracking branch 'upstream/branch-25.06' into single-part…
rjzamora Apr 4, 2025
754ca8d
address code review
rjzamora Apr 4, 2025
5ece3f1
Merge remote-tracking branch 'upstream/branch-25.06' into single-part…
rjzamora Apr 4, 2025
ec2a211
Merge branch 'branch-25.06' into single-partition-fallback
rjzamora Apr 7, 2025
6bed08b
Merge remote-tracking branch 'upstream/branch-25.06' into single-part…
rjzamora Apr 8, 2025
b913ecb
address code review
rjzamora Apr 8, 2025
653de36
align with 18405
rjzamora Apr 8, 2025
96c7f20
update message for mismatched partition counts
rjzamora Apr 8, 2025
400c1e4
Merge remote-tracking branch 'upstream/branch-25.06' into complex-agg…
rjzamora Apr 8, 2025
be4fdba
rename _check_sub_expr to _valid_sub_expr
rjzamora Apr 8, 2025
96134ed
check leaf nodes
rjzamora Apr 8, 2025
2ffca58
partial code review
rjzamora Apr 8, 2025
cdedb9f
refactor expression decomposition. Avoid IR manipulation at graph-con…
rjzamora Apr 9, 2025
1bd1cbd
fix finalize
rjzamora Apr 9, 2025
9bcfb14
use input_ir instead of child
rjzamora Apr 9, 2025
d73689b
add comment
rjzamora Apr 9, 2025
98bae3d
save further work
rjzamora Apr 10, 2025
5d3d8df
fix column-shuffle support
rjzamora Apr 10, 2025
0949231
fix column-shuffle support
rjzamora Apr 10, 2025
fb6495b
Merge remote-tracking branch 'upstream/branch-25.06' into complex-agg…
rjzamora Apr 10, 2025
d20c031
Merge branch 'branch-25.06' into complex-aggregations
rjzamora Apr 10, 2025
ec5ca42
Merge remote-tracking branch 'upstream/branch-25.06' into complex-agg…
rjzamora Apr 11, 2025
da161e9
add no-cover pragmas
rjzamora Apr 11, 2025
22 changes: 21 additions & 1 deletion python/cudf_polars/cudf_polars/dsl/expressions/base.py
@@ -1,4 +1,4 @@
-# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
+# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES.
 # SPDX-License-Identifier: Apache-2.0
 # TODO: remove need for this
 # ruff: noqa: D101
@@ -18,6 +18,8 @@
 if TYPE_CHECKING:
     from collections.abc import Mapping
 
+    from typing_extensions import Self
+
     from cudf_polars.containers import Column, DataFrame
 
 __all__ = ["AggInfo", "Col", "ColRef", "ExecutionContext", "Expr", "NamedExpr"]
@@ -237,6 +239,24 @@ def collect_agg(self, *, depth: int) -> AggInfo:
         """Collect information about aggregations in groupbys."""
         return self.value.collect_agg(depth=depth)
 
+    def reconstruct(self, expr: Expr) -> Self:
+        """
+        Rebuild with a new `Expr` value.
+
+        Parameters
+        ----------
+        expr
+            New `Expr` value
+
+        Returns
+        -------
+        New `NamedExpr` with `expr` as the underlying expression.
+        The name of the original `NamedExpr` is preserved.
+        """
+        if expr is self.value:
+            return self
+        return type(self)(self.name, expr)
+
 
 class Col(Expr):
     __slots__ = ("name",)
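
The behaviour of the new `reconstruct` method can be illustrated with a minimal, self-contained sketch. The `Expr` and `NamedExpr` classes below are simplified, hypothetical stand-ins written for this example (they are not the real `cudf_polars` classes); only the body of `reconstruct` mirrors the diff above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Expr:
    """Hypothetical stand-in for an expression node; it only carries a label."""

    label: str


class NamedExpr:
    """Hypothetical stand-in pairing an output name with an expression."""

    def __init__(self, name: str, value: Expr) -> None:
        self.name = name
        self.value = value

    def reconstruct(self, expr: Expr) -> NamedExpr:
        # Same logic as the method added in the diff: reuse the existing
        # object when the expression is unchanged (identity check),
        # otherwise build a new object that keeps the original name.
        if expr is self.value:
            return self
        return type(self)(self.name, expr)


original = NamedExpr("total", Expr("sum(a)"))
unchanged = original.reconstruct(original.value)
rebuilt = original.reconstruct(Expr("sum(b)"))

print(unchanged is original)  # identity is preserved when nothing changed
print(rebuilt.name, rebuilt.value.label)
```

The identity check (`is`, not `==`) means an equal-but-distinct expression still produces a new `NamedExpr`. Presumably this keeps the no-op case cheap without relying on expression equality.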
11 changes: 9 additions & 2 deletions python/cudf_polars/cudf_polars/dsl/traversal.py
@@ -1,4 +1,4 @@
-# SPDX-FileCopyrightText: Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES.
+# SPDX-FileCopyrightText: Copyright (c) 2024-2025, NVIDIA CORPORATION & AFFILIATES.
 # SPDX-License-Identifier: Apache-2.0
 
 """Traversal and visitor utilities for nodes."""
@@ -23,14 +23,19 @@
 ]
 
 
-def traversal(nodes: Sequence[NodeT]) -> Generator[NodeT, None, None]:
+def traversal(
+    nodes: Sequence[NodeT], *, cutoff_types: tuple[type[NodeT], ...] = ()
+) -> Generator[NodeT, None, None]:
     """
     Pre-order traversal of nodes in an expression.
 
     Parameters
     ----------
     nodes
         Roots of expressions to traverse.
+    cutoff_types
+        Types to terminate traversal at. If a type is in this tuple
+        then we do not yield any of its children.
 
     Yields
     ------
@@ -43,6 +48,8 @@ def traversal(nodes: Sequence[NodeT]) -> Generator[NodeT, None, None]:
     while lifo:
         node = lifo.pop()
         yield node
+        if cutoff_types and isinstance(node, cutoff_types):
+            continue
         for child in reversed(node.children):
             if child not in seen:
                 seen.add(child)
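
To see what `cutoff_types` changes, here is a rough, self-contained sketch of the traversal. The `Node` and `Agg` classes are hypothetical stand-ins, and the parts of the function body not visible in the hunks above (initialising `seen` and `lifo`, and pushing children onto the stack) are assumptions filled in for illustration.

```python
from __future__ import annotations

from collections.abc import Generator, Sequence


class Node:
    """Hypothetical stand-in node with a name and a tuple of children."""

    def __init__(self, name: str, *children: Node) -> None:
        self.name = name
        self.children = children


class Agg(Node):
    """Hypothetical aggregation node used as the cutoff type."""


def traversal(
    nodes: Sequence[Node], *, cutoff_types: tuple[type[Node], ...] = ()
) -> Generator[Node, None, None]:
    # Assumed setup: the hunks only show the loop body.
    seen = set(nodes)
    lifo = list(nodes)
    while lifo:
        node = lifo.pop()
        yield node
        # A cutoff node is still yielded, but its children are not visited.
        if cutoff_types and isinstance(node, cutoff_types):
            continue
        for child in reversed(node.children):
            if child not in seen:
                seen.add(child)
                lifo.append(child)  # assumed: not shown in the hunk


# add(sum(col_a), col_b)
root = Node("add", Agg("sum", Node("col_a")), Node("col_b"))

full = [n.name for n in traversal([root])]
pruned = [n.name for n in traversal([root], cutoff_types=(Agg,))]
print(full)    # ['add', 'sum', 'col_a', 'col_b']
print(pruned)  # ['add', 'sum', 'col_b']
```

In this sketch the cutoff only prunes the path through the matching node. A child that is also reachable through another, non-cutoff parent would still be yielded.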
4 changes: 2 additions & 2 deletions python/cudf_polars/cudf_polars/experimental/base.py
@@ -36,6 +36,6 @@ def keys(self, node: Node) -> Iterator[tuple[str, int]]:
         yield from ((name, i) for i in range(self.count))
 
 
-def get_key_name(node: Node) -> str:
+def get_key_name(node: Node, *other: Node) -> str:
     """Generate the key name for a Node."""
-    return f"{type(node).__name__.lower()}-{hash(node)}"
+    return f"{type(node).__name__.lower()}-{hash((node, *other))}"
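
The effect of the extra `*other` arguments can be sketched as follows. The `Select` class here is a hypothetical hashable stand-in for an IR node; only the body of `get_key_name` is taken from the diff.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Select:
    """Hypothetical hashable stand-in for an IR node."""

    ident: int


def get_key_name(node: Select, *other: Select) -> str:
    """Generate the key name for a node, optionally mixing in other nodes."""
    # The prefix comes from the first node's type; the suffix hashes the
    # tuple of all nodes, so additional nodes change the resulting key.
    return f"{type(node).__name__.lower()}-{hash((node, *other))}"


a, b = Select(1), Select(2)

key_single = get_key_name(a)
key_pair = get_key_name(a, b)
print(key_single)
print(key_pair)
```

Because the suffix is now `hash((node, *other))`, the same leading node can yield distinct key names depending on which other nodes it is combined with, while the type-name prefix still comes from the first argument only.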