Skip to content

Conversation

colin-ho
Copy link
Contributor

@colin-ho colin-ho commented Aug 14, 2025

Changes Made

This fixes the problem where users try to run flotilla in an async context i.e. jupyter.

I've been meaning to get rid of the 'local' flotilla runner anyway because it won't work well if users submit multiple daft jobs to the same cluster, as you will spin up multiple flotilla runners and multiple swordfish workers (per job) instead of reusing.

Related Issues

Checklist

  • Documented in API Docs (if applicable)
  • Documented in User Guide (if applicable)
  • If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
  • Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)

@github-actions github-actions bot added the fix label Aug 14, 2025
@colin-ho colin-ho marked this pull request as ready for review August 14, 2025 20:02
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Summary

This PR removes the dual-mode execution system in the Flotilla scheduler, consolidating all execution to use Ray actors exclusively. The change eliminates the LocalFlotillaRunner class and its associated FlotillaRunnerCore abstraction, forcing all Flotilla operations to run through the RemoteFlotillaRunner.

The modification spans multiple layers of the codebase:

  • Python layer (daft/runners/flotilla.py): Removes the LocalFlotillaRunner class (~80 lines) and simplifies the FlotillaRunner constructor to always create a RemoteFlotillaRunner actor
  • Rust layer (src/daft-distributed/src/python/mod.rs and progress_bar.rs): Removes the on_ray_actor/on_actor parameters from the DistributedPhysicalPlanRunner constructor and hardcodes the progress bar to use Ray's tqdm implementation
  • Type definitions (daft/daft/__init__.pyi): Updates the type stub to reflect the simplified constructor
  • Integration point (daft/runners/ray_runner.py): Removes the ray_client_mode parameter when instantiating the FlotillaRunner

The change also includes a minor typo fix, correcting FLOTILLA_RUNER_NAMESPACE to FLOTILLA_RUNNER_NAMESPACE. This architectural simplification aligns with Daft's distributed computing model, where Ray actors provide better resource management and isolation compared to direct asyncio execution.

Confidence score: 4/5

  • This PR appears safe to merge with good architectural benefits, addressing real async context issues
  • Score reflects the removal of complexity and fixing of known problems, though the broad scope of changes across multiple layers requires attention
  • Pay close attention to the Rust bindings in src/daft-distributed/src/python/mod.rs to ensure the parameter removal doesn't break existing functionality

5 files reviewed, no comments

Edit Code Review Bot Settings | Greptile

Copy link
Collaborator

@desmondcheongzx desmondcheongzx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@colin-ho colin-ho enabled auto-merge (squash) August 14, 2025 20:34
@colin-ho colin-ho merged commit 23549cd into main Aug 14, 2025
54 checks passed
@colin-ho colin-ho deleted the colin/flotilla-runner-actor branch August 14, 2025 21:57
huleilei pushed a commit to huleilei/Daft that referenced this pull request Aug 20, 2025
## Changes Made

This fixes the problem where users try to run flotilla in an async
context i.e. jupyter.

I've been meaning to get rid of the 'local' flotilla runner anyway
because it won't work well if users submit multiple daft jobs to the
same cluster, as you will spin up multiple flotilla runners and multiple
swordfish workers (per job) instead of reusing.

## Related Issues

<!-- Link to related GitHub issues, e.g., "Closes Eventual-Inc#123" -->

## Checklist

- [ ] Documented in API Docs (if applicable)
- [ ] Documented in User Guide (if applicable)
- [ ] If adding a new documentation page, doc is added to
`docs/mkdocs.yml` navigation
- [ ] Documentation builds and is formatted properly (tag @/ccmao1130
for docs review)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants