Skip to content

Conversation

colin-ho
Copy link
Contributor

@colin-ho colin-ho commented Aug 6, 2025

Changes Made

Use associated type for operator states, removes need to box and downcast.

Related Issues

Checklist

  • Documented in API Docs (if applicable)
  • Documented in User Guide (if applicable)
  • If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR implements a comprehensive refactoring of the Daft execution engine's operator state management system, transitioning from trait object-based state handling to associated types. The changes affect three key operator categories: IntermediateOperator, BlockingSink, and StreamingSink, along with their respective nodes and result types.

The refactoring eliminates the need for trait objects like Box<dyn IntermediateOpState>, Box<dyn BlockingSinkState>, and Box<dyn StreamingSinkState>, replacing them with associated State types on each trait. For example, the IntermediateOperator trait now includes type State: Send + Sync + Unpin and methods accept Self::State directly instead of boxed trait objects. Each operator implementation defines its concrete state type - stateless operators like FilterOperator use type State = (), while stateful operators like GroupedAggregateSink use type State = GroupedAggregateState.

The changes update method signatures across all affected traits: execute(), sink(), finalize(), and introduce new make_state() methods that return DaftResult<Self::State>. Node structures like IntermediateNode, BlockingSinkNode, and StreamingSinkNode become generic over their operator types, enabling compile-time type checking. Result types are updated to use associated type syntax, such as IntermediateOpExecuteResult<Op> now referencing <Op as IntermediateOperator>::State.

This architectural change integrates with Daft's execution framework by maintaining the same operational semantics while providing a more type-safe foundation. The refactoring preserves all existing functionality in operators like filtering, projection, aggregation, joins, and window operations, but eliminates runtime downcasting patterns that were previously required with as_any_mut().downcast_mut() operations.

Confidence score: 4/5

  • This PR represents a significant but well-structured architectural refactoring that improves type safety without changing core functionality
  • Score reflects the comprehensive nature of changes across many critical execution engine files, requiring careful validation
  • Pay close attention to the generic type bounds and associated type implementations across all operator traits

32 files reviewed, 1 comment

Edit Code Review Bot Settings | Greptile

@colin-ho colin-ho requested a review from srilman August 6, 2025 20:24
Copy link

codecov bot commented Aug 6, 2025

Codecov Report

❌ Patch coverage is 94.49541% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 79.33%. Comparing base (3fceb2c) to head (197347e).
⚠️ Report is 20 commits behind head on main.

Files with missing lines Patch % Lines
src/daft-local-execution/src/sinks/repartition.rs 0.00% 9 Missing ⚠️
...intermediate_ops/distributed_actor_pool_project.rs 0.00% 6 Missing ⚠️
...on/src/sinks/window_partition_and_dynamic_frame.rs 94.82% 3 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #4921      +/-   ##
==========================================
- Coverage   79.33%   79.33%   -0.01%     
==========================================
  Files         898      898              
  Lines      125742   125498     -244     
==========================================
- Hits        99762    99562     -200     
+ Misses      25980    25936      -44     
Files with missing lines Coverage Δ
...local-execution/src/intermediate_ops/cross_join.rs 93.61% <100.00%> (-0.62%) ⬇️
...ft-local-execution/src/intermediate_ops/explode.rs 86.66% <100.00%> (+0.95%) ⬆️
...aft-local-execution/src/intermediate_ops/filter.rs 90.47% <100.00%> (+0.47%) ⬆️
...tion/src/intermediate_ops/inner_hash_join_probe.rs 91.08% <100.00%> (-0.49%) ⬇️
...-execution/src/intermediate_ops/intermediate_op.rs 90.05% <100.00%> (+0.75%) ⬆️
...ft-local-execution/src/intermediate_ops/project.rs 100.00% <100.00%> (ø)
...aft-local-execution/src/intermediate_ops/sample.rs 80.76% <100.00%> (+1.17%) ⬆️
...c/daft-local-execution/src/intermediate_ops/udf.rs 80.22% <100.00%> (-0.52%) ⬇️
...ft-local-execution/src/intermediate_ops/unpivot.rs 77.77% <100.00%> (+1.11%) ⬆️
src/daft-local-execution/src/sinks/aggregate.rs 97.75% <100.00%> (-0.29%) ⬇️
... and 23 more

... and 5 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@srilman srilman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR makes me unreasonably happy

@colin-ho colin-ho merged commit 634b296 into main Aug 11, 2025
50 checks passed
@colin-ho colin-ho deleted the colin/use-associated-type-for-state branch August 11, 2025 17:48
colin-ho pushed a commit that referenced this pull request Aug 11, 2025
)

## Changes Made

Fixes issue here
https://github.com/Eventual-Inc/Daft/actions/runs/16889682307/job/47846055281

merge conflict between #4935
and #4921

## Checklist

- [ ] Documented in API Docs (if applicable)
- [ ] Documented in User Guide (if applicable)
- [ ] If adding a new documentation page, doc is added to
`docs/mkdocs.yml` navigation
- [ ] Documentation builds and is formatted properly (tag @/ccmao1130
for docs review)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants