-
Notifications
You must be signed in to change notification settings - Fork 691
feat(dynamo-llm): Remove bring-your-own-engine #1216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@tanmayv25 for visibility |
528f7de to
006b9c9
Compare
006b9c9 to
197ed55
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR removes the legacy "bring-your-own-engine" Python support and associated dynamo-run flags, aligning the codebase with the standalone Python engine guide introduced in v0.2.1.
- Deletes the
dynamo-engine-pythoncrate and its Rust bindings - Removes
PythonStrvariants and thepythonfeature fromdynamo-run - Updates the CLI usage string to drop the old
dyn://<path>notation
Reviewed Changes
Copilot reviewed 10 out of 14 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| lib/engines/python/src/lib.rs | Removed entire legacy Python engine implementation |
| lib/engines/python/Cargo.toml | Deleted Python engine crate manifest |
| lib/bindings/python/rust/engine.rs | Moved and duplicated Python engine code into bindings |
| lib/bindings/python/Cargo.toml | Removed path dependency on Python engine crate |
| launch/dynamo-run/src/opt.rs | Removed PythonStr enum variant and parsing logic |
| launch/dynamo-run/src/main.rs | Updated USAGE constant to drop dyn://<path> |
| launch/dynamo-run/src/lib.rs | Deleted PYTHON_STR_SCHEME and related match arm |
| launch/dynamo-run/Cargo.toml | Removed optional python feature |
Files not reviewed (4)
- Earthfile: Language not supported
- container/Dockerfile.none: Language not supported
- container/Dockerfile.tensorrt_llm: Language not supported
- container/Dockerfile.vllm: Language not supported
Comments suppressed due to low confidence (2)
launch/dynamo-run/src/main.rs:33
- The
USAGEstring no longer mentionsdyn://<path>but still showsdyn. To avoid confusion, restore thedyn://<path>notation or clarify thatdynimpliesdyn://<path>.
const USAGE: &str = "USAGE: dynamo-run in=[http|text|dyn://<path>|batch:<folder>] out=ENGINE_LIST|dyn [--...];
lib/bindings/python/rust/engine.rs:112
- [nitpick] The
PythonServerStreamingEngineimplementation is duplicated here and was removed from thedynamo-engine-pythoncrate. Consider extracting this shared logic into a common module to avoid code duplication and ease maintenance.
pub struct PythonServerStreamingEngine {
It was removed from the docs in 0.2.1 and replaced with writing a [standalone Python engine](https://github.com/ai-dynamo/dynamo/blob/main/docs/guides/dynamo_run.md#writing-your-own-engine-in-python). Also remove the associated `dynamo-run` feature `python`. Releasing this in 0.3.0 will resolve #784 and #1109.
197ed55 to
57ae256
Compare
WalkthroughThis change removes support for the "python" engine and all related code, dependencies, and build features from the project. It eliminates the Python engine integration, corresponding Cargo features, and Docker build options, while also updating documentation and usage strings to reflect the removal. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant dynamo-run
participant Rust Engines
User->>dynamo-run: Run inference command
dynamo-run->>Rust Engines: Select engine (Python option removed)
Rust Engines-->>dynamo-run: Engine runs (Python engine unavailable)
dynamo-run-->>User: Return inference results
Assessment against linked issues
Poem
✨ Finishing Touches
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
container/Dockerfile.vllm (1)
371-377: Double-check the scoped-feature syntax for Cargo (crate/feature)
--features dynamo-llm/block-manageris valid only on Cargo ≥ 1.64 and when a workspace build is requested (which you do).
However, because you subsequently restrict the build to a subset of packages with-p …, Cargo occasionally ignores per-crate feature flags (see rust-lang/cargo#9779). If you notice theblock-managercode path not being compiled indynamo-llm, split the command:cargo build --release --locked -p dynamo-llm --features block-manager cargo build --release --locked \ -p dynamo-run -p llmctl -p file://$PWD/components/http -p metricsThis guarantees the feature ends up on the right crate without relying on Cargo’s global resolution.
lib/bindings/python/rust/engine.rs (2)
151-168: Potential GIL contention: always spawning a blocking thread is expensiveEvery call to
generate()spawns a new blocking task just to acquire the GIL. Under high QPS this can exhaust the blocking‐thread pool and starve other tasks.Consider:
-let stream = tokio::task::spawn_blocking(move || { +let stream = tokio::task::spawn_blocking(move || { Python::with_gil(|py| { … }) }) .await??;• Re-using
pyo3_asyncio::tokio::into_streamwithout an extraspawn_blockingwhen the current thread already holds the GIL.
• Or create a dedicatedtokio::runtime::Handlefor Python work with a bounded thread-count.This will reduce context-switch overhead and latency.
282-288: Per-itemspawn_blockingcan become the bottleneck
process_itemconverts every message viatokio::task::spawn_blocking, potentially queuing thousands of short-lived tasks. A lower-overhead alternative is to run the entire depythonize loop inside the existing blocking task that already holds the GIL, pushing deserialised objects through the channel.Example skeleton:
let (tx, rx) = mpsc::channel(128); tokio::task::spawn_blocking(move || Python::with_gil(|py| { for item in stream { // iterate in Python land match depythonize::<Resp>(&item)? { … } if tx.blocking_send(annotated).is_err() { break; } } }));Reduces two context switches per message and preserves back-pressure semantics.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
Cargo.lockis excluded by!**/*.locklib/bindings/python/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (12)
Earthfile(1 hunks)container/Dockerfile.none(1 hunks)container/Dockerfile.tensorrt_llm(1 hunks)container/Dockerfile.vllm(1 hunks)launch/dynamo-run/Cargo.toml(0 hunks)launch/dynamo-run/src/lib.rs(0 hunks)launch/dynamo-run/src/main.rs(1 hunks)launch/dynamo-run/src/opt.rs(0 hunks)lib/bindings/python/Cargo.toml(1 hunks)lib/bindings/python/rust/engine.rs(2 hunks)lib/engines/python/Cargo.toml(0 hunks)lib/engines/python/src/lib.rs(0 hunks)
💤 Files with no reviewable changes (5)
- launch/dynamo-run/src/lib.rs
- lib/engines/python/Cargo.toml
- launch/dynamo-run/src/opt.rs
- launch/dynamo-run/Cargo.toml
- lib/engines/python/src/lib.rs
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: pre-merge-rust (lib/runtime/examples)
- GitHub Check: pre-merge-rust (lib/bindings/python)
- GitHub Check: Build and Test - vllm
- GitHub Check: pre-merge-rust (.)
🔇 Additional comments (7)
launch/dynamo-run/src/main.rs (2)
29-29: LGTM: Example model updated.The model name change in the example usage is appropriate and keeps the documentation current.
33-33: LGTM: Usage string updated to reflect Python engine removal.The removal of
"dyn://<path>"syntax from the usage string is consistent with removing the bring-your-own-engine Python support.lib/bindings/python/Cargo.toml (1)
44-47: LGTM: Dependencies updated for new Rust-based Python engine.The addition of
anyhow,async-stream,async-trait, andasync-openaidependencies aligns with implementing a new Rust-native async engine to replace the external Python engine support.Earthfile (1)
115-115: LGTM: Python feature removed from build configuration.The removal of the
pythonfeature from the cargo build command is consistent with eliminating bring-your-own-engine Python support while maintaining other necessary features.container/Dockerfile.tensorrt_llm (1)
278-283:✅ Verification successful
Verify that mistralrs feature removal doesn't break TensorRT-LLM functionality.
The cargo build command now runs without any features, which removes both
pythonandmistralrsfeatures. While removingpythonaligns with the PR objectives, please confirm that themistralrsfeature is not required for TensorRT-LLM builds.
🏁 Script executed:
#!/bin/bash # Description: Check if mistralrs feature is required for TensorRT-LLM builds # Expected: Find references to mistralrs in TensorRT-LLM related code or configs # Search for mistralrs references in TensorRT-LLM context rg -i "mistralrs|mistral.*rs" --type dockerfile rg -i "mistralrs|mistral.*rs" container/ rg -i "tensorrt.*mistral|mistral.*tensorrt" # Check if mistralrs is conditionally compiled for tensorrt ast-grep --pattern 'cfg(feature = "mistralrs")'Length of output: 757
🏁 Script executed:
#!/bin/bash # Inspect Cargo.toml for mistralrs feature definitions and default features # 1. Find all occurrences of 'mistralrs' in TOML files rg -n "mistralrs" --glob "*.toml" # 2. Locate and display [features] sections in crates 'dynamo-run' and 'llmctl' rg -l 'name = "dynamo-run"' -R "*.toml" | xargs -I{} sh -c 'echo "== {} =="; sed -n "/^\[features\]/,/^\[/{p}" {}' rg -l 'name = "llmctl"' -R "*.toml" | xargs -I{} sh -c 'echo "== {} =="; sed -n "/^\[features\]/,/^\[/{p}" {}' # 3. Check for any use of default-features toggles rg -n "default-features" --glob "*.toml"Length of output: 2931
Mistralrs remains enabled via default features
The
cargo buildincontainer/Dockerfile.tensorrt_llmno longer passes an explicit--featuresflag, but it still picks up the crate’s default features. Inlaunch/dynamo-run/Cargo.toml, the default features list is set to["mistralrs"], so themistralrsengine is still compiled in without any additional flags. No further changes are needed here.container/Dockerfile.none (1)
49-55:❌ Incorrect review comment
Consider passing feature flags or
--workspaceto avoid an accidental partial build
cargo build --release --lockednow relies on the default feature-set that each crate enables.
If any binaries (dynamo-run,http,llmctl,metrics,mock_worker) still expect theblock-manageror other opt-in features, the build will silently omit them and the resulting binaries may be missing symbols at runtime.-RUN cargo build --release --locked && \ +RUN cargo build --release --locked --workspace --features dynamo-llm/block-manager && \Please verify with a clean build that:
- All five binaries still compile successfully.
- Integration tests exercising removed
pythonfunctionality are not compiled in.
🏁 Script executed:
#!/bin/bash # Display workspace section from root Cargo.toml echo "=== Workspace Section ===" awk 'BEGIN{in_ws=0} /^\[workspace\]/{in_ws=1} in_ws{print} /^\[/{if(in_ws && !/\[workspace\]/) exit}' Cargo.toml echo # Show feature definitions in root Cargo.toml echo "=== Root Features ===" grep -R "^features" -n Cargo.toml || echo "No features block in root Cargo.toml" echo # Find any 'block-manager' references (feature or code) echo "=== block-manager Occurrences ===" rg -n "block-manager" .Length of output: 2598
🏁 Script executed:
#!/bin/bash # Display first 80 lines of container/Dockerfile.none with line numbers sed -n '1,80p' -n container/Dockerfile.none | sed = | sed 'N; s/\n/ /'Length of output: 2123
Ignore feature flags for the “none” variant Dockerfile
The
container/Dockerfile.noneis explicitly the “no-features” build (as indicated by its.nonesuffix), meant to produce a minimal set of binaries without any optional components. Adding--workspaceor--features dynamo-llm/block-managerhere would defeat the purpose of this variant. The other Dockerfiles (.sglang,.vllm) already include the appropriate feature flags in their own build stages.Likely an incorrect or invalid review comment.
lib/bindings/python/rust/engine.rs (1)
204-257:⚠️ Potential issue
println!leaks to stdout and blocks – use structured logging insteadInside the hot loop you mix
tracing::trace!and a bareprintln!, which:
- Bypasses the configured tracing subscriber.
- Performs a blocking write to stdout for every Python exception.
Replace with
tracing::error!to integrate with the existing log pipeline.- println!(); - Python::with_gil(|py| e.display(py)); + tracing::error!("python exception: {}", e); + Python::with_gil(|py| e.display(py));Likely an incorrect or invalid review comment.
It was removed from the docs in 0.2.1 and replaced with writing a [standalone Python engine](https://github.com/ai-dynamo/dynamo/blob/main/docs/guides/dynamo_run.md#writing-your-own-engine-in-python). Also remove the associated `dynamo-run` feature `python`. Releasing this in 0.3.0 will resolve #784 and #1109.
It was removed from the docs in 0.2.1 and replaced with writing a standalone Python engine.
Also remove the associated
dynamo-runfeaturepython.Releasing this in 0.3.0 will resolve #784 and #1109.
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores