chore(discovery): Watch/publish ModelDeploymentCard instead of ModelEntry #3350
Walkthrough
ModelEntry and MODEL_ROOT_PATH were removed. Discovery/watcher switched to ModelDeploymentCard and derives EndpointId from etcd keys. ModelDeploymentCard gained name() and checksum caching; move_from_nats became public. ModelManager and model registration APIs now carry a per-card checksum. Entrypoints, routers, tests, and runtime utilities use model_card::ROOT_PATH ("mdc/").
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Etcd as Etcd KV ("mdc/*")
participant Watcher as ModelWatcher
participant Card as ModelDeploymentCard
participant NATS as NATS
participant Manager as ModelManager / Engines
Etcd->>Watcher: PUT(key="mdc/{endpoint}/...", value=card_json)
Note right of Watcher: Extract EndpointId from key string
Watcher->>Watcher: etcd_key_extract(key) --> (EndpointId, name)
Watcher->>Card: Deserialize JSON -> ModelDeploymentCard
Watcher->>Card: await card.move_from_nats(nats_client) -- fetch artifacts (async)
Card-->>Watcher: populate cache_dir, checksum cached
Watcher->>Manager: save_model_card(key, card.clone())
Manager->>Manager: store checksum and wire engines using card.name()
Manager-->>Watcher: registration ack
Watcher-->>Etcd: complete handling
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Pre-merge checks
❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
lib/llm/src/model_card.rs (1)
298-355: Compile-time bug: wrong argument type to TempDir::with_prefix.
with_prefix expects &str; passing String won’t compile. Also prefer using the Slug’s AsRef<str> impl.
Apply:
- let bucket_name = self.slug();
- let target_dir = tempfile::TempDir::with_prefix(bucket_name.to_string())?;
+ let bucket_name = self.slug();
+ let target_dir = tempfile::TempDir::with_prefix(bucket_name.as_ref())?;
The rest of the function looks sound; keeping the Arc<TempDir> in cache_dir ensures the directory’s lifetime. Consider a tiny accessor (e.g., cache_path()) later if callers need the path.
lib/llm/src/discovery/watcher.rs (2)
107-120: Log strings still say “model entry”.
We now deserialize a ModelDeploymentCard; update logs to avoid confusion.
- tracing::error!(%err, value, "Invalid JSON in model entry")
+ tracing::error!(%err, value, "Invalid JSON in model card")
And similarly for the UTF‑8 branch.
444-469: Bug: all_cards likely misses versioned keys.
Cards are written under keys like v1/mdc/...; querying prefix "mdc" won’t match those. This can cause false “last instance” detections and premature engine removal.
Apply:
- let kvs = etcd_client.kv_get_prefix(model_card::ROOT_PATH).await?;
- let mut cards = Vec::with_capacity(kvs.len());
+ let v1_prefix = format!("v1/{}/", model_card::ROOT_PATH);
+ let raw_prefix = format!("{}/", model_card::ROOT_PATH);
+ let mut kvs = etcd_client.kv_get_prefix(&v1_prefix).await?;
+ // Also accept unversioned keys for compatibility.
+ kvs.extend(etcd_client.kv_get_prefix(&raw_prefix).await?);
+ let mut cards = Vec::with_capacity(kvs.len());
Also update log wording in this block from “model entry” to “model card”.
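To illustrate the prefix concern raised in the all_cards comment, here is a minimal, self-contained sketch. It does not touch etcd: the keys are plain strings, ROOT_PATH is a local stand-in for model_card::ROOT_PATH, and is_card_key and card_keys are hypothetical helpers introduced only for this example.

```rust
/// Local stand-in for `model_card::ROOT_PATH`; the value is an assumption.
const ROOT_PATH: &str = "mdc";

/// Returns true if `key` sits under the unversioned root or the `v1/` versioned root.
fn is_card_key(key: &str) -> bool {
    let raw_prefix = format!("{ROOT_PATH}/");
    let v1_prefix = format!("v1/{ROOT_PATH}/");
    key.starts_with(&raw_prefix) || key.starts_with(&v1_prefix)
}

/// Keeps only the keys that would hold model cards under either prefix.
fn card_keys<'a>(keys: &[&'a str]) -> Vec<&'a str> {
    keys.iter().copied().filter(|k| is_card_key(k)).collect()
}

fn main() {
    // Hypothetical keys, shaped like the ones discussed in the review.
    let keys = [
        "mdc/ns/comp/ep/1",
        "v1/mdc/ns/comp/ep/2",
        "v1/instances/ns/comp/ep/2",
    ];

    // A single unversioned prefix misses the `v1/` key.
    let naive: Vec<&str> = keys
        .iter()
        .copied()
        .filter(|k| k.starts_with(ROOT_PATH))
        .collect();
    println!("single prefix: {naive:?}");
    println!("both prefixes: {:?}", card_keys(&keys));
}
```

Whether the real store adds a v1/ prefix transparently on reads is not visible from this thread, so treat the sketch as an illustration of the string matching only.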
🧹 Nitpick comments (5)
lib/runtime/src/utils/worker_monitor.rs (1)
96-100: Consider using ModelDeploymentCard for type safety.
The code uses raw JSON parsing (serde_json::Value) instead of deserializing to ModelDeploymentCard like other files in this PR (kv_router.rs line 252). This loses compile-time type safety for the runtime_config field access.
Is there a specific reason to avoid using ModelDeploymentCard here? If not, consider using the typed approach for better maintainability:
- |card: serde_json::Value| {
- card.get("runtime_config")
- .and_then(|rc| rc.get("total_kv_blocks"))
- .and_then(|t_kv| t_kv.as_u64())
+ |card: ModelDeploymentCard| {
+ card.runtime_config.total_kv_blocks
  },
If ModelDeploymentCard is not accessible from the runtime crate or if there are other constraints preventing this approach, please clarify so we can document the rationale.
lib/llm/src/model_card.rs (1)
414-421: Behavior change: load_from_store no longer localizes artifacts.
Previously this populated cache_dir by pulling from NATS; now it returns an in‑store card. Ensure every callsite invokes card.move_from_nats(...) before using tokenizer/config files (the watcher does).
If any non-watcher path still calls load_from_store, I can scan and patch those to fetch from NATS. Want a follow-up PR?
lib/llm/src/discovery/watcher.rs (3)
195-199: Potential namespace mixing on delete.
cards_for_model(model_name) counts across all namespaces; engines are keyed by name() only. Verify intended semantics: should removal be gated by instances in the same namespace, or truly global?
If you want namespace scoping, parse the namespace from key and filter:
- let active_instances = self.cards_for_model(&model_name).await?;
+ let ns = etcd_key_to_endpoint_id(key)?.namespace;
+ let active_instances = self.cards_for_model_in_namespace(&model_name, &ns).await?;
I can draft cards_for_model_in_namespace if desired.
471-478: Optional: filter by namespace here if delete semantics are per-namespace.
If you adopt namespace scoping, add a variant that filters on parsed EndpointId.namespace.
I can implement cards_for_model_in_namespace() and its call-site changes if you confirm desired behavior.
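For reference, the namespace-scoped variant suggested in the two comments above could look roughly like the sketch below. It is synchronous and works on an in-memory slice instead of etcd; EndpointId and Card are reduced stand-ins with assumed field names, and entry is a hypothetical helper used only to build sample data.

```rust
/// Reduced stand-in for the real EndpointId; only the namespace is modelled.
#[derive(Debug)]
struct EndpointId {
    namespace: String,
}

/// Reduced stand-in for ModelDeploymentCard; the field name is an assumption.
#[derive(Debug)]
struct Card {
    display_name: String,
}

/// Hypothetical helper that builds one (endpoint, card) pair for the examples.
fn entry(namespace: &str, model: &str) -> (EndpointId, Card) {
    (
        EndpointId { namespace: namespace.to_string() },
        Card { display_name: model.to_string() },
    )
}

/// Returns the cards for `model_name` whose endpoint lives in `namespace`.
fn cards_for_model_in_namespace<'a>(
    cards: &'a [(EndpointId, Card)],
    model_name: &str,
    namespace: &str,
) -> Vec<&'a Card> {
    cards
        .iter()
        .filter(|(id, card)| id.namespace == namespace && card.display_name == model_name)
        .map(|(_, card)| card)
        .collect()
}

fn main() {
    let cards = vec![
        entry("prod", "llama"),
        entry("dev", "llama"),
        entry("prod", "qwen"),
    ];
    // Only the "prod" instance of "llama" counts toward the last-instance check.
    let active = cards_for_model_in_namespace(&cards, "llama", "prod");
    println!("{} active instance(s): {:?}", active.len(), active);
}
```

With this shape, deleting the last "llama" card in dev would not be blocked by a "llama" instance still running in prod, which is the semantic question the review asks the author to confirm.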
514-538: Tests cover both versioned/unversioned prefixes.
Consider adding a case without an instance-id segment (exact 4 parts) to assert acceptance, and a negative case with wrong root. I can add these if wanted.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- lib/llm/src/discovery.rs (0 hunks)
- lib/llm/src/discovery/model_entry.rs (0 hunks)
- lib/llm/src/discovery/watcher.rs (19 hunks)
- lib/llm/src/entrypoint/input/common.rs (2 hunks)
- lib/llm/src/entrypoint/input/grpc.rs (2 hunks)
- lib/llm/src/entrypoint/input/http.rs (2 hunks)
- lib/llm/src/kv_router.rs (2 hunks)
- lib/llm/src/local_model.rs (1 hunks)
- lib/llm/src/local_model/network_name.rs (0 hunks)
- lib/llm/src/model_card.rs (4 hunks)
- lib/llm/tests/http_metrics.rs (4 hunks)
- lib/runtime/src/utils/typed_prefix_watcher.rs (1 hunks)
- lib/runtime/src/utils/worker_monitor.rs (1 hunks)
💤 Files with no reviewable changes (3)
- lib/llm/src/local_model/network_name.rs
- lib/llm/src/discovery/model_entry.rs
- lib/llm/src/discovery.rs
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-02T16:46:54.015Z
Learnt from: GuanLuo
PR: ai-dynamo/dynamo#2714
File: lib/llm/src/discovery/model_entry.rs:38-42
Timestamp: 2025-09-02T16:46:54.015Z
Learning: In lib/llm/src/discovery/model_entry.rs, GuanLuo prefers not to add serde defaults for model_type and model_input fields to keep the specification explicit and avoid user errors, relying on atomic deployment strategy to avoid backward compatibility issues.
Applied to files:
lib/runtime/src/utils/worker_monitor.rs
🧬 Code graph analysis (4)
lib/runtime/src/utils/worker_monitor.rs (3)
- lib/runtime/src/utils/typed_prefix_watcher.rs (1): lease_id (211-213)
- lib/llm/src/local_model.rs (1): card (331-333)
- lib/llm/src/discovery/watcher.rs (2): serde_json (107-107), serde_json (452-452)
lib/llm/src/entrypoint/input/common.rs (2)
- lib/runtime/src/transports/etcd.rs (1): etcd_client (131-133)
- lib/runtime/src/distributed.rs (1): etcd_client (269-271)
lib/llm/src/kv_router.rs (3)
- lib/bindings/python/src/dynamo/_core.pyi (2): ModelDeploymentCard (445-450), lease_id (138-142)
- lib/runtime/src/utils/typed_prefix_watcher.rs (1): lease_id (211-213)
- lib/llm/src/local_model.rs (1): card (331-333)
lib/llm/src/local_model.rs (3)
- lib/runtime/src/transports/etcd.rs (1): lease_id (136-138)
- lib/runtime/src/component.rs (1): endpoint (236-243)
- lib/runtime/src/storage/key_value_store.rs (1): from_raw (36-38)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: clippy (.)
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: tests (.)
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (12)
lib/llm/src/entrypoint/input/http.rs (1)
7-7: LGTM! Clean migration to model_card::ROOT_PATH.
The changes correctly update the import and reference the etcd root path from the new model_card module instead of the removed discovery::MODEL_ROOT_PATH. The logic remains unchanged.
Also applies to: 13-13, 78-78
lib/llm/src/entrypoint/input/grpc.rs (1)
7-7: LGTM! Consistent migration to model_card::ROOT_PATH.
The changes mirror those in http.rs, correctly updating the import and using the new path constant from the model_card module.
Also applies to: 12-12, 50-50
lib/runtime/src/utils/typed_prefix_watcher.rs (1)
70-76: LGTM! Documentation updated to reflect ModelDeploymentCard migration.
The example code correctly demonstrates the new API using ModelDeploymentCard instead of ModelEntry and the updated etcd prefix "mdc/".
lib/llm/src/kv_router.rs (1)
47-47: LGTM! Clean migration to ModelDeploymentCard.
The changes correctly:
- Import ModelDeploymentCard from the model_card module
- Update the etcd prefix to model_card::ROOT_PATH
- Update the value extractor closure to work with ModelDeploymentCard instead of ModelEntry
The extraction logic remains functionally identical, accessing the runtime_config field.
Also applies to: 250-253
lib/llm/src/model_card.rs (1)
192-196: Accessor name() is a clean addition.
Simple, correct alias to display_name; helps callsites drop field coupling.
lib/llm/src/discovery/watcher.rs (7)
16-16: Import changes look correct.
EndpointId and Annotated are the right additions for the refactor.
24-24: MDC-centric import is correct.
Switch to model_card::{self, ModelDeploymentCard} is aligned with the PR goal.
40-40: ModelManager path change is fine.
150-168: PUT handling flow looks correct.
Key → EndpointId → optional namespace filter → handle_put path is clean.
259-266: Good: localize artifacts before wiring pipelines.
Calling card.move_from_nats(self.drt.nats_client()).await? up front prevents downstream file I/O surprises.
274-282: Early exit on additional endpoints is sound.
Avoids rebuilding engines when a new instance appears for an existing model.
396-397: API assumption: Backend::from_mdc(card) takes &ModelDeploymentCard.
You’re passing &mut ModelDeploymentCard; coercion to &T is fine if the signature is &ModelDeploymentCard. Please confirm the signature.
If it consumes by value, clone: Backend::from_mdc(&card.clone()).
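The coercion mentioned in the last comment can be checked in isolation. The sketch below assumes a function taking a shared reference, as the comment supposes; from_card is a hypothetical counterpart of Backend::from_mdc (whose real signature is not shown in this thread), and Card is a reduced stand-in for ModelDeploymentCard.

```rust
/// Reduced stand-in for ModelDeploymentCard; the field name is an assumption.
#[derive(Debug, Clone)]
struct Card {
    display_name: String,
}

/// Hypothetical counterpart of `Backend::from_mdc`, assumed to take `&Card`.
fn from_card(card: &Card) -> String {
    format!("backend for {}", card.display_name)
}

/// Holds an exclusive borrow, mutates through it, then lends it out as shared.
fn build(card: &mut Card) -> String {
    card.display_name.push_str("-v2");
    // `&mut Card` is implicitly reborrowed as `&Card` at this call.
    from_card(card)
}

fn main() {
    let mut card = Card { display_name: "llama".to_string() };
    println!("{}", build(&mut card));
}
```

No clone is needed for the shared-reference case; a clone only becomes relevant if the callee takes ownership of the card.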
5e26291 to 826944f
826944f to 7449e7b
…ntry Those two always overlapped, now there's only one. Allows us to remove ModelEntry and ModelNetworkName. The next and hopefully final step will be to attach the card to an etcd lease for auto-cleanup. Signed-off-by: Graham King <[email protected]>
Signed-off-by: Graham King <[email protected]>
This cleans up the ModelDeploymentCard on shutdown Signed-off-by: Graham King <[email protected]>
If we are already serving a model and a new worker arrives with a different model card, claiming to be for the same model, we now don't know which is the correct model. Log an error. Ideally we would stop using that instance, but that happens in a different crate. It's not clear how we can communicate with it, see TODO. Signed-off-by: Graham King <[email protected]>
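As a rough illustration of the check this commit message describes, the sketch below remembers the first checksum seen per model and flags later workers whose card checksum differs. Registry and check_or_insert are hypothetical names for this example, not the ModelManager API, and the checksums are placeholder strings.

```rust
use std::collections::HashMap;

/// Hypothetical registry that remembers one checksum per model name.
#[derive(Default)]
struct Registry {
    checksums: HashMap<String, String>,
}

impl Registry {
    /// Records the first checksum seen for `model`.
    /// Returns false when a later worker presents a different checksum.
    fn check_or_insert(&mut self, model: &str, checksum: &str) -> bool {
        if let Some(known) = self.checksums.get(model) {
            return known.as_str() == checksum;
        }
        self.checksums.insert(model.to_string(), checksum.to_string());
        true
    }
}

fn main() {
    let mut registry = Registry::default();
    // Placeholder (model, checksum) pairs standing in for arriving workers.
    for (model, checksum) in [("llama", "abc"), ("llama", "abc"), ("llama", "def")] {
        if !registry.check_or_insert(model, checksum) {
            // The commit logs an error here; taking the instance out of service is left as a TODO.
            eprintln!("checksum mismatch for model {model}: got {checksum}");
        }
    }
}
```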
7449e7b to 163ff3d
@coderabbitai review please
@grahamking: I'll review the changes in this PR for you.
✅ Actions performed
Review triggered.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
lib/llm/src/entrypoint/input/http.rs (1)
244-246: Keep the etcd watch handle alive; dropping it may cancel the watch
dissolve() returns a watch handle; binding it to _watcher and dropping it can stop events. Keep a guard in scope to ensure the watch remains active.
Apply this diff:
- let (_prefix, _watcher, receiver) = models_watcher.dissolve();
+ let (_prefix, watcher, receiver) = models_watcher.dissolve();
+ // Keep the watch handle alive for the lifetime of this task
+ let _watcher_guard = watcher;
lib/llm/src/discovery/model_manager.rs (2)
109-112: Bug: has_model_any ignores embeddings/tensor models.
This causes handle_put to rebuild/register existing embeddings/tensor models, leading to ModelAlreadyExists errors and noisy logs.
Apply:
  pub fn has_model_any(&self, model: &str) -> bool {
  self.chat_completion_engines.read().contains(model)
- || self.completion_engines.read().contains(model)
+ || self.completion_engines.read().contains(model)
+ || self.embeddings_engines.read().contains(model)
+ || self.tensor_engines.read().contains(model)
  }
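A standalone sketch of the suggested check follows. It assumes each engine collection can be reduced to a set of model names and uses std::sync::RwLock (hence the unwrap calls), whereas the real ModelManager stores engines behind its own lock type; the field names are taken from the diff above, everything else is simplified.

```rust
use std::collections::HashSet;
use std::sync::RwLock;

/// Simplified manager: each engine map is reduced to a set of model names.
#[derive(Default)]
struct Manager {
    chat_completion_engines: RwLock<HashSet<String>>,
    completion_engines: RwLock<HashSet<String>>,
    embeddings_engines: RwLock<HashSet<String>>,
    tensor_engines: RwLock<HashSet<String>>,
}

impl Manager {
    /// True if `model` is registered under any of the four engine kinds.
    fn has_model_any(&self, model: &str) -> bool {
        [
            &self.chat_completion_engines,
            &self.completion_engines,
            &self.embeddings_engines,
            &self.tensor_engines,
        ]
        .iter()
        .any(|engines| engines.read().unwrap().contains(model))
    }
}

fn main() {
    let manager = Manager::default();
    manager
        .embeddings_engines
        .write()
        .unwrap()
        .insert("embed-model".to_string());
    // An embeddings-only model is found once all four sets are consulted.
    println!("known: {}", manager.has_model_any("embed-model"));
}
```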
289-304: Move/borrow errors with kv_router_config; used after move and moved twice.
- unwrap_or_default() moves the Option; it’s used again later.
- The Option is also passed into two constructors; needs cloning.
Apply:
- etcd_client
- .kv_create(
- &router_key,
- serde_json::to_vec_pretty(&kv_router_config.unwrap_or_default())?,
- None, // use primary lease
- )
- .await?;
-
- let selector = Box::new(DefaultWorkerSelector::new(kv_router_config));
+ let cfg = kv_router_config.clone().unwrap_or_default();
+ etcd_client
+ .kv_create(
+ &router_key,
+ serde_json::to_vec_pretty(&cfg)?,
+ None, // use primary lease
+ )
+ .await?;
+
+ let selector = Box::new(DefaultWorkerSelector::new(kv_router_config.clone()));
  let chooser = KvRouter::new(
  component.clone(),
  kv_cache_block_size,
  Some(selector),
  kv_router_config,
  router_uuid.to_string(),
  )
  .await?;
lib/llm/src/discovery/watcher.rs (1)
337-346: Cannot move self.kv_router_config out of &self.
Passing the field by value moves it; clone the Option instead.
Apply:
- self.kv_router_config,
+ self.kv_router_config.clone(),
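The clone-before-use pattern from the two kv_router_config comments can be reproduced with a few lines of plain Rust. In this sketch KvRouterConfig is a stand-in with an invented field, and persist and make_selector are hypothetical placeholders for the etcd write and the selector/router constructors.

```rust
/// Stand-in for the real config type; the `weight` field is invented for this example.
#[derive(Debug, Clone, Default, PartialEq)]
struct KvRouterConfig {
    weight: u32,
}

struct Watcher {
    kv_router_config: Option<KvRouterConfig>,
}

/// Hypothetical placeholder for serializing the config into etcd.
fn persist(cfg: &KvRouterConfig) -> String {
    format!("{cfg:?}")
}

/// Hypothetical placeholder for a constructor that takes the Option by value.
fn make_selector(cfg: Option<KvRouterConfig>) -> Option<KvRouterConfig> {
    cfg
}

impl Watcher {
    fn wire(&self) -> (String, Option<KvRouterConfig>, Option<KvRouterConfig>) {
        // Calling `self.kv_router_config.unwrap_or_default()` directly would try to move
        // the field out of `&self`, which the compiler rejects; clone first.
        let cfg = self.kv_router_config.clone().unwrap_or_default();
        let stored = persist(&cfg);
        // Each by-value consumer gets its own clone of the Option.
        let selector = make_selector(self.kv_router_config.clone());
        let chooser = make_selector(self.kv_router_config.clone());
        (stored, selector, chooser)
    }
}

fn main() {
    let watcher = Watcher { kv_router_config: Some(KvRouterConfig { weight: 3 }) };
    let (stored, selector, chooser) = watcher.wire();
    println!("{stored} {selector:?} {chooser:?}");
}
```

If the config type is cheap to clone this is the least invasive fix; wrapping it in an Arc would be an alternative when it grows.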
♻️ Duplicate comments (2)
lib/runtime/src/utils/worker_monitor.rs (1)
92-103: Use a shared constant for the model-card prefix.
The hardcoded "mdc/" literal is inconsistent with other files in this PR (http.rs, grpc.rs, kv_router.rs, local_model.rs) that use model_card::ROOT_PATH. The in-code comments explain that WorkerMonitor is in the wrong crate (should be in dynamo-llm, not dynamo-runtime), preventing the use of ModelDeploymentCard. However, you can still improve consistency by defining a shared constant for the prefix, either in a common location or by exporting ROOT_PATH from a module accessible to both crates.
Consider defining a shared constant for the "mdc/" prefix or exporting model_card::ROOT_PATH from a common location accessible to both dynamo-runtime and dynamo-llm. This ensures consistency across the codebase and simplifies future updates to the prefix.
lib/llm/tests/http_metrics.rs (1)
502-571: Lease-based cleanup note already tracked in prior review
This block assumes lease-based cleanup; earlier feedback flagged the lease isn’t yet attached in LocalModel::attach. Acknowledging it’s slated for the next PR; no action here.
🧹 Nitpick comments (4)
lib/llm/tests/http_metrics.rs (1)
350-357: Keep the etcd watch handle alive in the test watcher
_watcher is dropped immediately; this may stop the stream. Retain it to keep the watch active.
- let (_prefix, _watcher, receiver) = models_watcher.dissolve();
+ let (_prefix, watcher, receiver) = models_watcher.dissolve();
+ let _watcher_guard = watcher;
lib/llm/src/model_card.rs (2)
46-53: GGUF variant returns a default checksum; consider a more discriminative fallback
Using a default value for GGUF means GGUF-only cards won’t influence mdcsum. At minimum, include stable identity (e.g., file path string) to reduce collisions until a content hash is available.
Example change:
- ModelInfoType::GGUF(_) => Checksum::default().to_string(),
+ ModelInfoType::GGUF(p) => {
+ // Fallback: include path text to distinguish different GGUFs
+ blake3::hash(p.display().to_string().as_bytes()).to_string()
+ }
Apply similar treatment to TokenizerKind::GGUF and PromptFormatterArtifact::GGUF for consistency.
247-288: Reduce mdcsum log verbosity
The “TEMP” debug log will fire on first call per process and can be noisy. Drop to trace or remove.
- let hash = blake3::hash(&bytes_to_hash).to_string();
- tracing::debug!("mdcsum: {hash} of {} bytes", bytes_to_hash.len()); // TEMP
- hash
+ let hash = blake3::hash(&bytes_to_hash).to_string();
+ tracing::trace!("mdcsum: {hash} of {} bytes", bytes_to_hash.len());
+ hash
lib/llm/src/discovery/model_manager.rs (1)
395-397: Avoid cloning in checksum() return type.
Return a borrowed &str instead of allocating a new String.
Apply:
-pub fn checksum(&self, model: &str) -> Option<String> {
- self.checksums.get(model).map(|s| s.to_string())
-}
+pub fn checksum(&self, model: &str) -> Option<&str> {
+ self.checksums.get(model).map(|s| s.as_str())
+}
Call sites (e.g., is_valid_checksum) continue to work.
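The borrowed-return suggestion compiles as shown only if the checksums live in a map that is borrowed directly from self. The sketch below makes that assumption explicit with a plain HashMap; Engines is a reduced stand-in, and is_valid_checksum here is a guess at the call site's shape, not its actual body.

```rust
use std::collections::HashMap;

/// Reduced stand-in: assumes checksums sit in a plain HashMap owned by the struct.
struct Engines {
    checksums: HashMap<String, String>,
}

impl Engines {
    /// Borrowed lookup; the returned &str lives as long as the borrow of `self`.
    fn checksum(&self, model: &str) -> Option<&str> {
        self.checksums.get(model).map(|s| s.as_str())
    }

    /// Assumed shape of a caller: compares the stored checksum with a candidate.
    fn is_valid_checksum(&self, model: &str, candidate: &str) -> bool {
        self.checksum(model) == Some(candidate)
    }
}

fn main() {
    let mut checksums = HashMap::new();
    checksums.insert("llama".to_string(), "abc123".to_string());
    let engines = Engines { checksums };
    println!("{:?}", engines.checksum("llama"));
    println!("{}", engines.is_valid_checksum("llama", "abc123"));
}
```

If the map were instead guarded by a lock, the &str could not outlive the lock guard, and returning an owned String (as the original code does) would be the simpler choice.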
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
- lib/bindings/python/rust/http.rs (1 hunks)
- lib/llm/src/common/checked_file.rs (2 hunks)
- lib/llm/src/discovery.rs (0 hunks)
- lib/llm/src/discovery/model_entry.rs (0 hunks)
- lib/llm/src/discovery/model_manager.rs (8 hunks)
- lib/llm/src/discovery/watcher.rs (16 hunks)
- lib/llm/src/entrypoint/input/common.rs (2 hunks)
- lib/llm/src/entrypoint/input/grpc.rs (6 hunks)
- lib/llm/src/entrypoint/input/http.rs (7 hunks)
- lib/llm/src/kv_router.rs (2 hunks)
- lib/llm/src/local_model.rs (1 hunks)
- lib/llm/src/local_model/network_name.rs (0 hunks)
- lib/llm/src/model_card.rs (12 hunks)
- lib/llm/src/model_type.rs (1 hunks)
- lib/llm/src/preprocessor.rs (1 hunks)
- lib/llm/tests/http-service.rs (6 hunks)
- lib/llm/tests/http_metrics.rs (10 hunks)
- lib/llm/tests/kserve_service.rs (4 hunks)
- lib/runtime/src/storage/key_value_store/etcd.rs (2 hunks)
- lib/runtime/src/utils/typed_prefix_watcher.rs (1 hunks)
- lib/runtime/src/utils/worker_monitor.rs (1 hunks)
💤 Files with no reviewable changes (3)
- lib/llm/src/discovery.rs
- lib/llm/src/discovery/model_entry.rs
- lib/llm/src/local_model/network_name.rs
✅ Files skipped from review due to trivial changes (1)
- lib/runtime/src/utils/typed_prefix_watcher.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- lib/llm/src/kv_router.rs
- lib/llm/src/entrypoint/input/common.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: trtllm (amd64)
- GitHub Check: trtllm (arm64)
- GitHub Check: vllm (arm64)
- GitHub Check: vllm (amd64)
- GitHub Check: sglang
- GitHub Check: Mirror Repository to GitLab
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: tests (.)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: Build and Test - dynamo
🔇 Additional comments (20)
lib/llm/src/preprocessor.rs (1)
127-127: LGTM!
The explicit to_string() conversion aligns with the String type of the mdcsum field defined at line 104 and used throughout the struct. The change ensures consistent type handling for the per-model checksum.
lib/llm/src/model_type.rs (1)
77-94: LGTM!
The new units() method correctly decomposes the bitflag into its component ModelType units, mirroring the structure of the existing as_vec() method. This provides a typed alternative for consumers that need to iterate over enabled capabilities.
lib/runtime/src/storage/key_value_store/etcd.rs (2)
152-152: LGTM!
Attaching the primary lease to the PutOptions ensures that newly created keys are tied to the lease lifecycle, enabling automatic cleanup when the process stops. This aligns with the PR objective of ensuring etcd is fully cleaned up when a process terminates.
220-225: LGTM!
The update method now attaches the lease and includes with_prev_key() to enable returning the previous key from the server. This ensures consistent lease association during updates and maintains the lifecycle management introduced in the create method.
lib/llm/src/common/checked_file.rs (2)
36-36: LGTM!
Adding the Copy trait to CryptographicHashMethods is safe and enables implicit bitwise-copy semantics for the enum, which simplifies checksum propagation throughout the codebase.
262-269: LGTM!
The Default implementation provides sensible initialization (empty string for hash, BLAKE3 for algorithm), which supports the per-model checksum handling introduced across the codebase.
lib/llm/src/local_model.rs (1)
421-427: LGTM with noted follow-up.
The code derives a lease-aware key but passes None for the lease parameter in publish. While this means the MDC record is not yet TTL-scoped, grahamking has acknowledged this in a previous review and confirmed that lease attachment will be addressed in a follow-up PR. Given the substantial scope of this PR (ModelEntry → ModelDeploymentCard migration), the incremental approach is reasonable.
lib/llm/src/entrypoint/input/grpc.rs (4)
50-50: LGTM!
The switch from MODEL_ROOT_PATH to model_card::ROOT_PATH aligns with the card-centric approach introduced across the codebase in this PR.
66-124: LGTM!
The code correctly introduces per-model checksum handling in the StaticRemote branch by deriving the checksum via card.mdcsum() and propagating it through both add_chat_completions_model and add_completions_model calls. This aligns with the broader checksum propagation pattern introduced across the codebase.
132-134: LGTM!
The StaticFull branch correctly derives the checksum via model.card().mdcsum() and propagates it through both model registration calls.
144-160: LGTM!
The StaticCore branch correctly derives the checksum via model.card().mdcsum() and propagates it through both model registration calls, maintaining consistency with the other static branches.
lib/bindings/python/rust/http.rs (2)
33-44: LGTM!
The updated add_completions_model method correctly accepts the checksum parameter and propagates it to the model manager. Note that this is a public API change that will require Python callers to provide the checksum when registering models.
46-57: LGTM!
The updated add_chat_completions_model method correctly accepts the checksum parameter and propagates it to the model manager, maintaining consistency with the add_completions_model changes.
lib/llm/src/entrypoint/input/http.rs (1)
78-81: Verify ROOT_PATH for prefix watching (“mdc” vs “mdc/”)
Ensure the etcd keys are written under the same prefix form you’re watching. If keys are stored under “mdc/…”, watching “mdc” could be broader than intended; watching “mdc/” is safer to avoid collisions.
lib/llm/tests/http-service.rs (1)
588-599: LGTM: mdcsum-based registrations are consistent
Creating a card per model and passing card.mdcsum() into add_*_model aligns with the new API.
lib/llm/src/model_card.rs (2)
37-37: Confirm intended ROOT_PATH shape
Constant is mdc (no trailing slash). If keys are stored as mdc/<key>, consider whether ROOT_PATH should be mdc/ for clarity and precise prefix watching.
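The trailing-slash point in the two ROOT_PATH comments comes down to plain string prefix matching, which the sketch below demonstrates. The keys are invented for the example (in particular mdc_backup is hypothetical), and matching is a local helper, not an etcd call.

```rust
/// Returns the keys that a prefix watch on `prefix` would cover (string match only).
fn matching<'a>(keys: &[&'a str], prefix: &str) -> Vec<&'a str> {
    keys.iter().copied().filter(|k| k.starts_with(prefix)).collect()
}

fn main() {
    // Invented keys: one real-looking card key and one unrelated sibling root.
    let keys = ["mdc/ns/comp/ep", "mdc_backup/ns/comp/ep"];

    // Without the trailing slash, the unrelated sibling is swept in as well.
    println!("prefix \"mdc\":  {:?}", matching(&keys, "mdc"));
    println!("prefix \"mdc/\": {:?}", matching(&keys, "mdc/"));
}
```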
232-235: LGTM: Public name accessor
name() returning display_name improves API clarity.
lib/llm/src/discovery/model_manager.rs (2)
339-342: Storing per-model checksums alongside engines looks good.
Single structure for engines+checksums reduces inconsistency; removal path clears both.
365-373: add() correctly persists checksum with engine.
Error on duplicates, then insert engine and checksum; OK.
lib/llm/src/discovery/watcher.rs (1)
548-574: Key extractor looks solid; doc and errors are clear.
Docstring matches “mdc/…” and errors include offending key. Tests cover with/without “v1/”.
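For readers without the diff at hand, a key extractor of this kind might look like the sketch below. The key layout ([v1/]mdc/{namespace}/{component}/{endpoint}[/{instance}]) is an assumption pieced together from the review comments, and parse_card_key and this reduced EndpointId are hypothetical; the function in watcher.rs may differ in name, return type, and validation.

```rust
/// Reduced stand-in for the real EndpointId; field names are assumptions.
#[derive(Debug, PartialEq)]
struct EndpointId {
    namespace: String,
    component: String,
    name: String,
}

/// Parses an assumed key layout: `[v1/]mdc/{namespace}/{component}/{endpoint}[/{instance}]`.
fn parse_card_key(key: &str) -> Result<EndpointId, String> {
    // Accept both versioned and unversioned keys.
    let rest = key.strip_prefix("v1/").unwrap_or(key);
    let parts: Vec<&str> = rest.split('/').collect();
    // Need the root plus three non-empty segments; an instance segment is optional.
    if parts.len() < 4 || parts[0] != "mdc" || parts[1..4].iter().any(|p| p.is_empty()) {
        return Err(format!("invalid model card key: {key}"));
    }
    Ok(EndpointId {
        namespace: parts[1].to_string(),
        component: parts[2].to_string(),
        name: parts[3].to_string(),
    })
}

fn main() {
    for key in ["v1/mdc/ns/comp/ep/1234", "mdc/ns/comp/ep", "models/ns/comp/ep"] {
        println!("{key} -> {:?}", parse_card_key(key));
    }
}
```

Including the offending key in the error string mirrors what the review praises about the real extractor.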
Signed-off-by: Graham King <[email protected]>
Signed-off-by: Graham King <[email protected]>
LGTM with a comment, but someone else must approve.
…ntry (#3350) Signed-off-by: Graham King <[email protected]> Signed-off-by: Piotr Tarasiewicz <[email protected]>
Since #3350 the MDC is attached to an etcd lease, so it cleans up on shutdown. We don't need to manually clear namespace any more. If there are any remaining non-lease keys (I'm not aware of any, and I have looked), we will treat that as a bug. Signed-off-by: Graham King <[email protected]>
…ntry (#3350) Signed-off-by: Graham King <[email protected]>
Those two always overlapped, now there's only one. Allows us to remove ModelEntry and ModelNetworkName.
We also attach the lease in etcd's KeyValueStore impl, so etcd is now fully cleaned up when a process stops.