Skip to content

Conversation

@grahamking
Copy link
Contributor

@grahamking grahamking commented Sep 8, 2025

This ensures frontend and worker are using exactly the same files.

What is old is new again.

The ModelDeploymentCard looks like this now:

{
  "display_name": "Qwen3-0.6B",
  "slug": "qwen3-0_6b",
  "model_info": {
    "hf_config_json": {
      "path": "nats://0.0.0.0:4222/qwen3-0_6b/config.json",
      "checksum": "blake3:d873e9bb2bd54ebf6c4a60732887a57c33881cca4b0d92ec5107e9cc4f023660"
    }
  },
  "tokenizer": {
    "hf_tokenizer_json": {
      "path": "nats://0.0.0.0:4222/qwen3-0_6b/tokenizer.json",
      "checksum": "blake3:b0cb923fc505fdf0a53f0287654fa26577d3f333d4134350da0a97664b228739"
    }
  },
  "prompt_formatter": {
    "hf_tokenizer_config_json": {
      "path": "nats://0.0.0.0:4222/qwen3-0_6b/tokenizer_config.json",
      "checksum": "blake3:4422061cdcb205f75bc448b09f4017f8f2f92f172aeb2cf0248f4ce2f0f360e6"
    }
  },
  "gen_config": {
    "hf_generation_config_json": {
      "path": "nats://0.0.0.0:4222/qwen3-0_6b/generation_config.json",
      "checksum": "blake3:18b891ec9627c101a441aec842f39c05d9f13df63b4cfd4226594941d977a9a6"
    }
  },
  "last_published": null,
  "context_length": 40960,
  "kv_cache_block_size": 16,
  "migration_limit": 0,
  "runtime_config": {
    "total_kv_blocks": 46111,
    "max_num_seqs": 1024,
    "max_num_batched_tokens": 8192,
    "tool_call_parser": null,
    "reasoning_parser": null
  }
}

Summary by CodeRabbit

  • New Features
    • Added a unified file reference that supports both local paths and URLs, with built‑in checksum verification for integrity.
  • Improvements
    • Faster hashing for large files and more efficient I/O, improving performance during model asset handling.
    • Smarter artifact transfers: local files are uploaded to remote storage and referenced by URL; downloads verify integrity before use.
    • Clearer, more actionable error messages when required local files are missing or invalid.
  • Chores
    • Updated cryptographic library configuration to enable performance features.

@grahamking grahamking requested a review from a team as a code owner September 8, 2025 19:57
@copy-pr-bot
Copy link

copy-pr-bot bot commented Sep 8, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 8, 2025

Walkthrough

Introduces a new CheckedFile abstraction with Blake3 checksums and serde support, updates model_card and template processing to use it, adjusts NATS object store APIs to borrow Url, and enables blake3 mmap/rayon features. Adds a new module, updates public enums and function signatures, and removes a license header.

Changes

Cohort / File(s) Summary
Dependencies: blake3 features
lib/llm/Cargo.toml
Change blake3 from "1" to { version = "1.8", features = ["mmap","rayon"] }.
Common module exports
lib/llm/src/common.rs
Remove license header; add pub mod checked_file;.
CheckedFile abstraction
lib/llm/src/common/checked_file.rs
New module: CheckedFile, Checksum, CryptographicHashMethods. From-disk creation with Blake3 hashing; move between local path and URL; serde serialize/deserialize; checksum comparison; display impl; internal mmap/rayon hashing strategy; unit tests.
Model card refactor to CheckedFile
lib/llm/src/model_card.rs
Replace String paths with CheckedFile in multiple enums (ModelInfoType, TokenizerKind, PromptFormatterArtifact, GenerationConfig). Update repo/file loading to build/use CheckedFile. Adjust NATS upload/download macros to operate on path()/url(). Change HFConfig::from_json_file to accept P: AsRef<Path>. Import updates.
Template preprocessing updates
lib/llm/src/preprocessor/prompt/template.rs
Switch to CheckedFile for tokenizer config and chat template; require local path via path(); improved error messages; unchanged public API.
NATS transport API tweaks
lib/runtime/src/transports/nats.rs
Change object_store_upload and object_store_download to take &Url instead of owning Url; internal call adjustments.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor C as Caller
  participant MC as ModelCard/Template Logic
  participant CF as CheckedFile
  participant RT as NATS Runtime
  Note over CF: New abstraction: local Path or remote Url<br/>with Blake3 checksum

  C->>MC: Prepare artifact for upload
  MC->>CF: path()
  alt Has local path
    MC->>RT: object_store_upload(filepath, &url)
    RT-->>MC: Ok
    MC->>CF: move_to_url(url)
  else No local path (already URL)
    MC-->>C: Skip upload
  end
Loading
sequenceDiagram
  autonumber
  actor C as Caller
  participant MC as ModelCard/Template Logic
  participant CF as CheckedFile
  participant RT as NATS Runtime

  C->>MC: Ensure local file for processing
  MC->>CF: url()
  alt Has remote Url
    MC->>RT: object_store_download(&url, filepath)
    RT-->>MC: Ok
    MC->>CF: move_to_disk(filepath)
    MC->>CF: checksum_matches(filepath)
    alt Checksum ok
      MC-->>C: Proceed
    else Checksum mismatch
      MC-->>C: Error
    end
  else Has local path
    MC-->>C: Use existing path
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Poem

A rabbit with checksums hops by the stream,
BLAKE3 sparkles—mmap makes it gleam.
Paths turn to URLs with a whisk and twirl,
NATS carries bytes in a cloudward swirl.
Templates unfold, tokens align—
All files now checked, and feeling fine. 🐇✨

Tip

👮 Agentic pre-merge checks are now available in preview!

Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.

  - Custom agentic checks – Define your own rules using CodeRabbit’s advanced agentic capabilities to enforce organization-specific policies and workflows. For example, you can instruct CodeRabbit’s agent to verify that API documentation is updated whenever API schema files are modified in a PR. Note: Upto 5 custom checks are currently allowed during the preview period. Pricing for this feature will be announced in a few weeks.
  - Built-in checks – Quickly apply ready-made checks to enforce title conventions, require pull request descriptions that follow templates, validate linked issues for compliance, and more.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/runtime/src/transports/nats.rs (1)

419-432: url_to_bucket_and_key drops nested path segments (breaks keys with slashes).

Only the first two segments are used; additional segments are ignored, corrupting keys like nats://h/bucket/dir1/dir2/file. Join the remainder into the key.

Apply:

-pub fn url_to_bucket_and_key(url: &Url) -> anyhow::Result<(String, String)> {
+pub fn url_to_bucket_and_key(url: &Url) -> anyhow::Result<(String, String)> {
     let Some(mut path_segments) = url.path_segments() else {
         anyhow::bail!("No path in NATS URL: {url}");
     };
     let Some(bucket) = path_segments.next() else {
         anyhow::bail!("No bucket in NATS URL: {url}");
     };
-    let Some(key) = path_segments.next() else {
-        anyhow::bail!("No key in NATS URL: {url}");
-    };
-    Ok((bucket.to_string(), key.to_string()))
+    let key = {
+        let first = path_segments.next().ok_or_else(|| anyhow::anyhow!("No key in NATS URL: {url}"))?;
+        let mut k = first.to_string();
+        for seg in path_segments {
+            k.push('/');
+            k.push_str(seg);
+        }
+        k
+    };
+    Ok((bucket.to_string(), key))
 }
lib/llm/src/model_card.rs (1)

204-207: mdcsum includes volatile fields and unwraps; compute a stable hash and avoid panic

As written, the checksum changes whenever last_published changes and can panic on serialization failure. If the goal is “same files” parity, hash only the CheckedFile checksums (and variant tags) and exclude volatile fields.

-    pub fn mdcsum(&self) -> String {
-        let json = self.to_json().unwrap();
-        format!("{}", blake3::hash(json.as_bytes()))
-    }
+    pub fn mdcsum(&self) -> String {
+        // Stable, file-oriented digest: ignore volatile fields like `last_published`.
+        let mut hasher = blake3::Hasher::new();
+        if let Some(ModelInfoType::HfConfigJson(f)) = &self.model_info {
+            hasher.update(b"model_info:hf_config_json:");
+            hasher.update(f.checksum_hex().as_bytes());
+        }
+        if let Some(GenerationConfig::HfGenerationConfigJson(f)) = &self.gen_config {
+            hasher.update(b"gen_config:hf_generation_config_json:");
+            hasher.update(f.checksum_hex().as_bytes());
+        }
+        if let Some(TokenizerKind::HfTokenizerJson(f)) = &self.tokenizer {
+            hasher.update(b"tokenizer:hf_tokenizer_json:");
+            hasher.update(f.checksum_hex().as_bytes());
+        }
+        if let Some(PromptFormatterArtifact::HfTokenizerConfigJson(f)) = &self.prompt_formatter {
+            hasher.update(b"prompt_formatter:hf_tokenizer_config_json:");
+            hasher.update(f.checksum_hex().as_bytes());
+        }
+        if let Some(PromptFormatterArtifact::HfChatTemplate(f)) = &self.chat_template_file {
+            hasher.update(b"chat_template:hf_chat_template:");
+            hasher.update(f.checksum_hex().as_bytes());
+        }
+        hasher.finalize().to_string()
+    }

If checksum_hex() doesn’t exist on CheckedFile, we should add a public getter for a canonical hex digest. I can follow up with that change.

🧹 Nitpick comments (11)
lib/llm/Cargo.toml (1)

97-97: Consider gating rayon behind a crate feature to trim build size when not needed.

If hashing large files isn’t on all build paths, expose a cargo feature (e.g., fast-hash) and move blake3’s "rayon" under it.

lib/runtime/src/transports/nats.rs (1)

228-259: Make data upload/download APIs borrow Url too for consistency.

object_store_upload_data and object_store_download_data still take Url by value. Switch to &Url to align with the new API and avoid moves.

-    pub async fn object_store_upload_data<T>(&self, data: &T, nats_url: Url) -> anyhow::Result<()>
+    pub async fn object_store_upload_data<T>(&self, data: &T, nats_url: &Url) -> anyhow::Result<()>
...
-        let (bucket_name, key) = url_to_bucket_and_key(&nats_url)?;
+        let (bucket_name, key) = url_to_bucket_and_key(nats_url)?;
-    pub async fn object_store_download_data<T>(&self, nats_url: Url) -> anyhow::Result<T>
+    pub async fn object_store_download_data<T>(&self, nats_url: &Url) -> anyhow::Result<T>
...
-        let (bucket_name, key) = url_to_bucket_and_key(&nats_url)?;
+        let (bucket_name, key) = url_to_bucket_and_key(nats_url)?;

Also applies to: 254-279

lib/llm/src/preprocessor/prompt/template.rs (1)

41-57: Same restriction for chat template — confirm and consider preserving newlines.

  • HfChatTemplate now requires a local path; confirm that’s intended across all call sites.
  • Replacing '\n' may alter valid Jinja. Consider keeping original content unless you observed template issues.
lib/llm/src/common/checked_file.rs (3)

117-124: Typo in serialized struct name.

serialize_struct is labeled "Checksum" for CheckedFile; use "CheckedFile".

-        let mut cf = serializer.serialize_struct("Checksum", 2)?;
+        let mut cf = serializer.serialize_struct("CheckedFile", 2)?;

137-147: URL vs local-path detection can misclassify Windows-like strings.

Url::parse succeeds on any “scheme:…” string. Consider classifying as URL only for known schemes (e.g., nats|http|https|s3|gs|file) or when url.has_host()/scheme() matches expected.

Example:

-        match Url::parse(&temp.path) {
-            Ok(url) => CheckedFile { path: Either::Right(url), checksum: temp.checksum },
+        match Url::parse(&temp.path) {
+            Ok(url) if matches!(url.scheme(), "nats" | "http" | "https" | "s3" | "gs" | "file") => {
+                CheckedFile { path: Either::Right(url), checksum: temp.checksum }
+            }
             Err(_) => CheckedFile {
                 path: Either::Left(PathBuf::from(temp.path)),
                 checksum: temp.checksum,
             },
         }

169-176: Threshold heuristic: consider using blake3’s own mmap fallback logic.

blake3::Hasher::update_mmap_rayon already decides when to mmap vs. fall back. Calling it directly (for sufficiently large files) or using update_reader for small ones can simplify and avoid arbitrary 10 MB thresholds.

Refs: blake3 Hasher docs. (docs.rs)

lib/llm/src/model_card.rs (5)

224-231: Clarify invariant and error when tokenizer is URL-backed

tokenizer_hf() bails if the tokenizer is a URL. That’s fine, but the error is generic. Make it actionable by including the URL and suggesting move_from_nats() first.

-                let p = checked_file.path().ok_or_else(|| {
-                    anyhow::anyhow!("URL tokenizer cannot be loaded by HF tokenizer")
-                })?;
+                let p = checked_file.path().ok_or_else(|| {
+                    let u = checked_file.url();
+                    anyhow::anyhow!(
+                        "Tokenizer is URL-backed ({:?}); call move_from_nats() before tokenizer_hf()",
+                        u
+                    )
+                })?;
```<!-- review_comment_end -->

---

`312-326`: **Clean up downloaded file on checksum mismatch to avoid leaving bad artifacts**

If checksum validation fails, remove the just-downloaded file before bailing.


```diff
-                    if !src_file.checksum_matches(&target) {
-                        anyhow::bail!(
+                    if !src_file.checksum_matches(&target) {
+                        let _ = tokio::fs::remove_file(&target).await;
+                        anyhow::bail!(
                             "Invalid {} in NATS for {}, checksum does not match.",
                             $filename,
                             self.display_name
                         );
                     }
```<!-- review_comment_end -->

---

`570-577`: **ModelInfoType::get_model_info: enforce local path, but consider hinting remediation**

Good to require a local path here. Consider hinting the caller to invoke move_from_nats() when hitting the error to reduce churn.
<!-- review_comment_end -->

---

`626-631`: **Avoid temporary PathBufs in parent().unwrap_or(...), and dedupe generation_config path creation**

Minor cleanup/readability: prefer Path::new("") over allocating a temporary PathBuf, and compute generation_config.json path once.


```diff
-        let file_path = file.as_ref();
+        let file_path = file.as_ref();
+        let gen_cfg_path = file_path
+            .parent()
+            .unwrap_or_else(|| std::path::Path::new(""))
+            .join("generation_config.json");
@@
-            let bos_token_id = crate::file_json_field::<TokenIdType>(
-                &Path::join(
-                    file_path.parent().unwrap_or(&PathBuf::from("")),
-                    "generation_config.json",
-                ),
-                "bos_token_id",
-            )
+            let bos_token_id = crate::file_json_field::<TokenIdType>(&gen_cfg_path, "bos_token_id")
@@
-                crate::file_json_field(
-                    &Path::join(
-                        file_path.parent().unwrap_or(&PathBuf::from("")),
-                        "generation_config.json",
-                    ),
-                    "eos_token_id",
-                )
+                crate::file_json_field(&gen_cfg_path, "eos_token_id")
@@
-                        path = %file_path.display(),
+                        path = %file_path.display(),
```<!-- review_comment_end -->



Also applies to: 644-647, 693-696, 684-685

---

`878-881`: **Test nits: pass Path instead of String to from_json_file**

The new signature accepts AsRef<Path>. Avoiding String allocations is cleaner.


```diff
-        let config = HFConfig::from_json_file(config_file.display().to_string())?;
+        let config = HFConfig::from_json_file(&config_file)?;

Apply the same change to the Llama-4 test.

Also applies to: 887-890

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f93a7ca and e231383.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • lib/bindings/python/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • lib/llm/Cargo.toml (1 hunks)
  • lib/llm/src/common.rs (1 hunks)
  • lib/llm/src/common/checked_file.rs (1 hunks)
  • lib/llm/src/model_card.rs (16 hunks)
  • lib/llm/src/preprocessor/prompt/template.rs (1 hunks)
  • lib/runtime/src/transports/nats.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-13T22:07:24.843Z
Learnt from: kthui
PR: ai-dynamo/dynamo#1424
File: lib/runtime/src/pipeline/network/egress/push_router.rs:204-209
Timestamp: 2025-06-13T22:07:24.843Z
Learning: The codebase uses async-nats version 0.40, not the older nats crate. Error handling should use async_nats::error::Error variants, not nats::Error variants.

Applied to files:

  • lib/runtime/src/transports/nats.rs
🧬 Code graph analysis (2)
lib/llm/src/preprocessor/prompt/template.rs (1)
lib/llm/src/local_model.rs (1)
  • display_name (340-342)
lib/llm/src/model_card.rs (2)
lib/llm/src/local_model.rs (2)
  • path (335-337)
  • display_name (340-342)
lib/runtime/src/transports/nats.rs (2)
  • object_store_upload (176-191)
  • object_store_download (194-212)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build and Test - dynamo
  • GitHub Check: pre-merge-rust (.)
  • GitHub Check: pre-merge-rust (lib/runtime/examples)
  • GitHub Check: pre-merge-rust (lib/bindings/python)
🔇 Additional comments (10)
lib/llm/Cargo.toml (1)

97-97: Blake3 features align with code path using update_mmap_rayon — good.

Enabling "mmap" and "rayon" is correct for Hasher::update_mmap_rayon used in this PR. Docs confirm those methods are gated by these features.

Refs: blake3 Hasher docs and crate features. (docs.rs)

lib/runtime/src/transports/nats.rs (2)

175-191: Borrowing Url avoids unnecessary clones — good change.

Signature change to take &Url for uploads is an improvement and keeps ownership at the callsite.


193-212: Symmetric improvement on download — good.

Matching the upload API, borrowing &Url here is consistent and reduces moves.

lib/llm/src/common.rs (1)

4-4: Module exposure LGTM.

Publicly exporting checked_file is appropriate given downstream usage.

lib/llm/src/preprocessor/prompt/template.rs (1)

26-33: New restriction: tokenizer_config must be a local path — confirm intent.

This bails on URLs. If remote/MDC URLs were previously valid, this is a breaking change; confirm that workers will always materialize these locally before formatting.

lib/llm/src/common/checked_file.rs (1)

112-125: Url crate serde feature is already enabled
Cargo.toml includes url = { version = "2.5", features = ["serde"] }, so CheckedFile::serialize will compile as is.

lib/llm/src/model_card.rs (4)

270-275: NATS transfers for generation_config: looks good

Macro usage and variant selection are consistent for upload/download of generation_config.json.

Also applies to: 329-334


505-507: Custom template ingest: good move to CheckedFile

from_disk() here ensures the checksum is recorded at ingest time.


804-810: Repo loaders migrated to CheckedFile: solid

from_repo() helpers now precompute checksums at ingest. Error contexts are clear and specific.

Also applies to: 812-818, 820-836, 838-844


39-44: ModelInfoType and related enums need backward-compatibility for legacy string payloads

lib/llm/src/model_card.rs definitions for ModelInfoType (39–44), TokenizerKind (46–53), PromptFormatterArtifact (66–72), and GenerationConfig (83–89) now wrap CheckedFile instead of PathBuf, so existing serialized strings/paths will fail to deserialize. Implement either:

  • custom Deserialize for these enums to accept both legacy string/URL and CheckedFile (using CheckedFile::from_disk/from_url)
  • a one-time migration to rewrite persisted MDCs to the new shape
⛔ Skipped due to learnings
Learnt from: GuanLuo
PR: ai-dynamo/dynamo#2714
File: lib/llm/src/discovery/model_entry.rs:38-42
Timestamp: 2025-09-02T16:46:54.015Z
Learning: In lib/llm/src/discovery/model_entry.rs, GuanLuo prefers not to add serde defaults for model_type and model_input fields to keep the specification explicit and avoid user errors, relying on atomic deployment strategy to avoid backward compatibility issues.

@grahamking grahamking enabled auto-merge (squash) September 8, 2025 22:11
This ensures frontend and worker are using exactly the same files.

Signed-off-by: Graham King <[email protected]>
Signed-off-by: Graham King <[email protected]>
Signed-off-by: Graham King <[email protected]>
@grahamking
Copy link
Contributor Author

/ok to test

@copy-pr-bot
Copy link

copy-pr-bot bot commented Sep 9, 2025

/ok to test

@grahamking, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@grahamking
Copy link
Contributor Author

/ok to test f5f39c9

@grahamking grahamking merged commit 6f14e94 into main Sep 9, 2025
12 of 13 checks passed
@grahamking grahamking deleted the gk-mdc-checksum branch September 9, 2025 16:12
indrajit96 pushed a commit that referenced this pull request Sep 9, 2025
Signed-off-by: Graham King <[email protected]>
Signed-off-by: Indrajit Bhosale <[email protected]>
indrajit96 pushed a commit that referenced this pull request Sep 9, 2025
tedzhouhk pushed a commit that referenced this pull request Sep 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants