@grahamking grahamking commented Aug 12, 2025

For LoRA we want the worker to be able to call register_llm multiple times, adding models to its own endpoint. That means allowing an endpoint to serve multiple models.

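Changes:

  • Register in etcd using a UUID rather than the slugified endpoint name.
  • Remove the check that prevents multiple models on an endpoint. This reverts #1103.

For #2267.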

Summary by CodeRabbit

  • New Features

    • Automatic generation of unique model network names; models now receive a default name without extra configuration.
  • Refactor

    • Streamlined model registration flow to reduce external dependencies and simplify setup.
    • Removed duplicate-instance restriction, allowing multiple instances of the same model/component to run concurrently.

coderabbitai bot commented Aug 12, 2025

Walkthrough

Imports narrowed; LocalModel.card made public; uniqueness checks removed from LocalModel::attach and LocalModelBuilder::build; ModelNetworkName creation switched to ModelNetworkName::new() with Default implemented; previous etcd-derived constructors and loaders removed; Display unchanged.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Local model orchestration: lib/llm/src/local_model.rs | Removed ensure_unique calls and function; switched network name creation to ModelNetworkName::new(); made LocalModel.card public; trimmed imports (removed Component). |
| Network name generation: lib/llm/src/local_model/network_name.rs | Replaced etcd/endpoint-derived constructors with new(); added Default; removed From<&Instance>, from_local, from_entry, and load_entry; simplified imports; internal String and Display retained. |
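
For readers without the diff open, the shape described above might look roughly like the following. This is a minimal sketch, not the code from the PR: the `unique_suffix` helper is hypothetical and uses only the standard library as a stand-in for the UUID the PR registers with, and the `"{MODEL_ROOT_PATH}/{id}"` key layout is an assumption based on the review's note that MODEL_ROOT_PATH is "models".

```rust
use std::fmt;
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Root path for model keys; the review reports it is "models", no trailing slash.
const MODEL_ROOT_PATH: &str = "models";

/// Hypothetical std-only stand-in for a UUID: process id, clock, and a counter.
/// The counter alone guarantees uniqueness within one process.
fn unique_suffix() -> String {
    static COUNTER: AtomicU64 = AtomicU64::new(0);
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);
    format!("{:x}-{:x}-{:x}", std::process::id(), nanos, n)
}

/// Newtype around the key string; the inner String stays private.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ModelNetworkName(String);

impl ModelNetworkName {
    /// Builds a fresh name that does not depend on the endpoint (assumed layout).
    pub fn new() -> Self {
        ModelNetworkName(format!("{}/{}", MODEL_ROOT_PATH, unique_suffix()))
    }
}

impl Default for ModelNetworkName {
    /// Default delegates to new(), so both construction paths agree.
    fn default() -> Self {
        Self::new()
    }
}

impl fmt::Display for ModelNetworkName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

fn main() {
    let a = ModelNetworkName::new();
    let b = ModelNetworkName::default();
    // Two names created for the same endpoint no longer collide.
    assert_ne!(a, b);
    println!("{a}\n{b}");
}
```

Because the name no longer encodes the endpoint, two registrations from the same worker get distinct keys, which is what removing the uniqueness check relies on.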

Sequence Diagram(s)

sequenceDiagram
  participant Builder as LocalModelBuilder
  participant Name as ModelNetworkName
  participant Etcd as etcd::Client
  participant Model as LocalModel

  Builder->>Name: new()
  Name-->>Builder: ModelNetworkName
  Builder->>Etcd: register(model_name)
  Builder-->>Model: build()

  Model->>Etcd: attach(model_name)
  Note over Builder,Model: No per-component uniqueness check

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I thump my paws—new names take flight,
No duplicate checks in sight.
Cards peek out, the fields now share,
Etcd hears a simpler prayer.
Hop, hop—UUIDs bloom anew,
A lighter warren for models to run through. 🐇✨


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
lib/llm/src/local_model/network_name.rs (1)

10-12: UUID-based network name aligns with multi-model endpoints; consider ergonomic trait impls to avoid to_string clones

The change meets the PR goal. To reduce allocations at call sites (e.g., kv_create), consider implementing AsRef<str> and/or Deref<Target = str> so callers can pass &network_name without allocating a new String.

Add these impls outside the shown range:

impl AsRef<str> for ModelNetworkName {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

impl std::ops::Deref for ModelNetworkName {
    type Target = str;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
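
A usage sketch may help show what these impls buy at a call site. The `key_len` and `key_len_generic` functions below are hypothetical stand-ins for something like kv_create that wants a string key; they are not part of the repository, and the struct is redeclared only so the example compiles on its own.

```rust
/// Redeclared here so the example is self-contained.
struct ModelNetworkName(String);

impl AsRef<str> for ModelNetworkName {
    fn as_ref(&self) -> &str {
        &self.0
    }
}

impl std::ops::Deref for ModelNetworkName {
    type Target = str;
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

/// Hypothetical call site that takes a borrowed key.
fn key_len(key: &str) -> usize {
    key.len()
}

/// Hypothetical generic call site that accepts anything viewable as a str.
fn key_len_generic<K: AsRef<str>>(key: K) -> usize {
    key.as_ref().len()
}

fn main() {
    let name = ModelNetworkName("models/example-id".to_string());

    // Deref coercion turns &ModelNetworkName into &str with no new String.
    assert_eq!(key_len(&name), 17);

    // AsRef<str> covers generic APIs, again without allocating.
    assert_eq!(key_len_generic(&name), 17);

    // str methods become available directly through Deref.
    assert!(name.starts_with("models/"));
}
```

Without these impls, each of those calls would need `&name.to_string()`, which allocates a temporary String per call.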
lib/llm/src/local_model.rs (1)

239-239: Avoid making LocalModel.card public; prefer accessor/mutator to preserve invariants

Making card public widens the API and bypasses invariants (e.g., callers can mutate after registration). You already expose card() for read access. If mutation is needed, consider pub(crate) or a dedicated mutable accessor.

Apply within this range:

-    pub card: ModelDeploymentCard, // TEMP pub
+    card: ModelDeploymentCard,

Then add a constrained mutator elsewhere:

impl LocalModel {
    pub(crate) fn card_mut(&mut self) -> &mut ModelDeploymentCard {
        &mut self.card
    }
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7e4eec2 and ee40b6e.

📒 Files selected for processing (2)
  • lib/llm/src/local_model.rs (3 hunks)
  • lib/llm/src/local_model/network_name.rs (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
lib/llm/src/local_model.rs (2)
lib/runtime/src/component.rs (3)
  • component (347-349)
  • component (490-496)
  • new (481-487)
lib/llm/src/local_model/network_name.rs (1)
  • new (10-12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (4)
lib/llm/src/local_model/network_name.rs (2)

15-19: Default delegating to new() is appropriate

Using Default -> new() keeps construction consistent and simple. No issues.


4-4: Confirmed MODEL_ROOT_PATH has no trailing slash

MODEL_ROOT_PATH is defined as "models" in lib/llm/src/discovery.rs (line 14), so concatenations won’t produce double slashes.

lib/llm/src/local_model.rs (2)

13-14: Import narrowing LGTM

Dropping Component and importing only Endpoint fits the current usage and reduces surface area.


320-320: Confirm lease semantics and key-shape assumptions; verify card publishing collision risk

  • kv_create(..., None) unwraps to self.primary_lease() (the client’s primary lease), so keys remain ephemeral as before.
  • No remaining uses of ModelNetworkName::from_* or other slug-based constructors, and all consumers call kv_get_prefix(MODEL_ROOT_PATH) and parse ModelEntry from the value—no code assumes a slug-derived key suffix.
  • Could not locate any card_store.publish(..., None, slug, …) call sites in the repo; please manually confirm that writing cards under a slug-only key for multi-model endpoints is intended. If not, include the variant ID in the key (e.g. "{slug}:{variant_id}") or publish variant metadata alongside each card.
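
To make the collision concern in the last bullet concrete, here is a small sketch that uses a HashMap as a stand-in for the card store. The real store's API is not shown in this thread, so `card_key`, the slug, and the variant IDs are all hypothetical; only the "{slug}:{variant_id}" key shape comes from the suggestion above.

```rust
use std::collections::HashMap;

/// Hypothetical helper building the "{slug}:{variant_id}" key suggested above.
fn card_key(slug: &str, variant_id: &str) -> String {
    format!("{slug}:{variant_id}")
}

fn main() {
    // Slug-only keys: the second card silently replaces the first.
    let mut by_slug: HashMap<String, &str> = HashMap::new();
    by_slug.insert("base-model".to_string(), "card for adapter A");
    by_slug.insert("base-model".to_string(), "card for adapter B");
    assert_eq!(by_slug.len(), 1);
    assert_eq!(by_slug["base-model"], "card for adapter B");

    // Slug plus variant ID: both cards survive under distinct keys.
    let mut by_variant: HashMap<String, &str> = HashMap::new();
    by_variant.insert(card_key("base-model", "adapter-a"), "card for adapter A");
    by_variant.insert(card_key("base-model", "adapter-b"), "card for adapter B");
    assert_eq!(by_variant.len(), 2);
}
```

Whether the overwrite in the first half matters depends on whether every model on an endpoint is meant to share one card, which is the question the bullet asks the author to confirm.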

For LoRA we want the worker to be able to call `register_llm` multiple
times, adding models to its own endpoint. That means allowing an
endpoint to serve multiple models.

Changes:
- Register in etcd using a UUID rather than the slugified endpoint name.
- Remove the check that prevents multiple models on an endpoint. This
  reverts #1103.

For #2267.
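
The effect of the two changes can be illustrated with an in-memory sketch. Everything here is an assumption made for illustration: `FakeKv` stands in for the etcd keyspace, a counter stands in for the UUID, the `ModelEntry` fields are simplified, and the endpoint strings are invented. The point is only that discovery reads the stored value, so the key suffix can be opaque and several models can share an endpoint.

```rust
use std::collections::BTreeMap;

const MODEL_ROOT_PATH: &str = "models";

/// Hypothetical, simplified value stored under each model key.
#[derive(Debug, Clone, PartialEq)]
struct ModelEntry {
    name: String,
    endpoint: String,
}

/// In-memory stand-in for the etcd keyspace (not the real client).
#[derive(Default)]
struct FakeKv {
    data: BTreeMap<String, ModelEntry>,
    next_id: u64,
}

impl FakeKv {
    /// Registers a model under an opaque unique key; the counter stands in for a UUID.
    fn register(&mut self, name: &str, endpoint: &str) -> String {
        self.next_id += 1;
        let key = format!("{}/{:08x}", MODEL_ROOT_PATH, self.next_id);
        let entry = ModelEntry {
            name: name.to_string(),
            endpoint: endpoint.to_string(),
        };
        self.data.insert(key.clone(), entry);
        key
    }

    /// Lists models on an endpoint by reading values, never by parsing the key suffix.
    fn models_on(&self, endpoint: &str) -> Vec<String> {
        let prefix = format!("{}/", MODEL_ROOT_PATH);
        self.data
            .iter()
            .filter(|(k, v)| k.starts_with(&prefix) && v.endpoint == endpoint)
            .map(|(_, v)| v.name.clone())
            .collect()
    }
}

fn main() {
    let mut kv = FakeKv::default();
    // The same worker endpoint registers a base model and a LoRA adapter.
    kv.register("base-model", "ns.worker.generate");
    kv.register("base-model-lora-a", "ns.worker.generate");
    kv.register("other-model", "ns.other.generate");

    assert_eq!(
        kv.models_on("ns.worker.generate"),
        vec!["base-model", "base-model-lora-a"]
    );
}
```

With slug-derived keys, the second registration on "ns.worker.generate" would have mapped to the same key as the first, which is why the earlier uniqueness check existed.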
@grahamking grahamking merged commit 72ec5f5 into main Aug 13, 2025
12 of 13 checks passed
@grahamking grahamking deleted the gk-shared-endpoint branch August 13, 2025 14:48
hhzhang16 pushed a commit that referenced this pull request Aug 27, 2025