[FEAT] New Local Execution Model #2437

Merged
merged 12 commits into from
Jul 9, 2024

Contributor

@colin-ho colin-ho commented Jun 25, 2024

Prototype for new local execution model

Comment on lines 78 to 84
def test_tpch_q6(tmp_path, check_answer, get_df):
    daft.context.set_execution_config(enable_aqe=True, enable_native_executor=True)
    start = time.time()
    daft_df = answers.q6(get_df)
    daft_pd_df = daft_df.to_pandas()
    end = time.time()
    print(f"Time taken: {end-start}")
Contributor Author

Currently, the runtime is the same for the current engine and the simple local execution engine in this PR.
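
A single `time.time()` measurement like the one in the test above can be noisy, so a comparison between two engines may be easier to trust when each run is repeated. The following is only a minimal sketch under stated assumptions: `time_call` and `compare` are hypothetical helpers, not part of Daft or this PR, and the two callables stand in for "run the query on engine A" and "run the query on engine B".

```python
import time


def time_call(fn, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` calls of fn.

    Hypothetical helper for illustration; not part of Daft.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, higher resolution than time.time()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def compare(baseline, candidate, repeats=5):
    """Time two zero-argument callables and report both timings and their ratio."""
    base = time_call(baseline, repeats)
    cand = time_call(candidate, repeats)
    ratio = cand / base if base > 0 else float("inf")
    return {"baseline_s": base, "candidate_s": cand, "ratio": ratio}
```

A ratio close to 1.0 would match the observation that both engines currently take about the same time.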

codecov bot commented Jun 27, 2024

Codecov Report

Attention: Patch coverage is 2.05279% with 334 lines in your changes missing coverage. Please review.

Project coverage is 63.31%. Comparing base (3aeba6f) to head (4c97fba).
Report is 33 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2437      +/-   ##
==========================================
+ Coverage   63.12%   63.31%   +0.19%     
==========================================
  Files         938      961      +23     
  Lines      105269   107395    +2126     
==========================================
+ Hits        66449    68002    +1553     
- Misses      38820    39393     +573     
Files                                                   Coverage Δ
src/daft-execution/src/stage/run.rs                     0.00% <ø> (ø)
src/daft-scheduler/src/scheduler.rs                     90.21% <0.00%> (+0.46%) ⬆️
daft/runners/pyrunner.py                                93.39% <77.77%> (-1.32%) ⬇️
src/daft-local-execution/src/sources/in_memory.rs       0.00% <0.00%> (ø)
src/daft-local-execution/src/lib.rs                     0.00% <0.00%> (ø)
...aft-local-execution/src/intermediate_ops/filter.rs   0.00% <0.00%> (ø)
...ft-local-execution/src/intermediate_ops/project.rs   0.00% <0.00%> (ø)
src/daft-local-execution/src/run.rs                     0.00% <0.00%> (ø)
src/daft-local-execution/src/sources/scan_task.rs       0.00% <0.00%> (ø)
src/daft-local-execution/src/sinks/aggregate.rs         0.00% <0.00%> (ø)
... and 3 more

... and 96 files with indirect coverage changes


let mut handles = FuturesUnordered::new();
let mut receivers = vec![];
while let Some(morsel) = source_stream.next().await {
Contributor Author

There should be a fixed number of parallel pipelines, with data pushed to them in round-robin fashion.
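
The idea in this comment can be sketched roughly as follows. This is an illustration in Python with `asyncio`, not the Rust code from this PR: `pipeline_worker`, `run_round_robin`, `dispatch`, and `num_pipelines` are hypothetical names, and each "pipeline" simply records the morsels it receives instead of running real operators.

```python
import asyncio

# Sentinel that tells a pipeline no more morsels are coming (illustrative only).
_DONE = object()


async def pipeline_worker(worker_id, inbox, results):
    """Drain one pipeline's queue until the sentinel arrives."""
    while True:
        morsel = await inbox.get()
        if morsel is _DONE:
            break
        # A real pipeline would run its operators here; we only record the routing.
        results.append((worker_id, morsel))


async def run_round_robin(morsels, num_pipelines=4):
    """Start a fixed number of pipelines and push morsels to them in turn."""
    # Bounded queues give simple backpressure: the source waits if a pipeline is busy.
    inboxes = [asyncio.Queue(maxsize=1) for _ in range(num_pipelines)]
    results = []
    workers = [
        asyncio.create_task(pipeline_worker(i, inbox, results))
        for i, inbox in enumerate(inboxes)
    ]
    for index, morsel in enumerate(morsels):
        # Round robin: morsel k goes to pipeline k mod num_pipelines.
        await inboxes[index % num_pipelines].put(morsel)
    for inbox in inboxes:
        await inbox.put(_DONE)
    await asyncio.gather(*workers)
    return results


def dispatch(morsels, num_pipelines=4):
    """Synchronous entry point for the sketch above."""
    return asyncio.run(run_round_robin(morsels, num_pipelines))
```

Unlike spawning one task per morsel, the number of concurrent pipelines here stays fixed regardless of how many morsels the source produces.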

Member

samster25 commented Jul 8, 2024

Follow-on PRs:

  • Bridge Exec for compute / IO
  • Tracing for streaming pipeline
  • Stateful intermediate operators
  • Fixed number of parallel pipelines
  • Streaming async IO
  • Checkpointing pipeline
  • Streaming probe table building
  • Streaming aggregations

@colin-ho colin-ho changed the title from "Simple Execution Model" to "[FEAT] New Local Execution Model" Jul 8, 2024
@github-actions github-actions bot added the enhancement New feature or request label Jul 8, 2024
@colin-ho colin-ho marked this pull request as ready for review July 8, 2024 20:41
@colin-ho colin-ho merged commit 0bd1d27 into main Jul 9, 2024
43 checks passed
@colin-ho colin-ho deleted the colin/execution branch July 9, 2024 02:30
Labels
enhancement New feature or request