Skip to content

Conversation

@keivenchang
Copy link
Contributor

@keivenchang keivenchang commented Sep 30, 2025

Overview:

  1. Fix bug in etcd storage bucket delete method that caused test keys to accumulate indefinitely.
  2. Also fix integration test expectations for NATS service metrics.

Details:

  • Fixed EtcdBucket::delete() to include bucket prefix via make_key() - was silently failing due to missing prefix
  • Fixed EtcdBucket::get() to use make_key() for consistency with other methods
  • Updated metrics integration test to allow valid range (0-10) for nats_service::TOTAL_REQUESTS instead of requiring exactly 0
  • Bug impact: Keys accumulated in etcd with each test run:
test_concurrent_bucket/concurrent_test_key_72e3120a-3fa4-4d3f-95c3-fede907fab77
test_concurrent_bucket/concurrent_test_key_8c07b44e-c5b6-495d-bcdf-05b0e02f0a8b
test_concurrent_bucket/concurrent_test_key_a9c60d0e-d388-4bff-a7b9-14bdc2f5bec5
(and more with each subsequent test run)
  • Verification: With bug, keys accumulated (2 → 3 → 4...); with fix, etcd is clean after tests

Where should the reviewer start?

lib/runtime/src/storage/key_value_store/etcd.rs

/coderabbit profile chill

Summary by CodeRabbit

  • Bug Fixes
    • Improved consistency of key-value retrieval and deletion, reducing risks of missing or incorrect entries.
  • Refactor
    • Standardized key handling in the storage backend and enhanced trace logging for better observability.
  • Tests
    • Aligned NATS-related metrics expectations with actual service statistics to reduce flaky results.

@keivenchang keivenchang self-assigned this Sep 30, 2025
@keivenchang keivenchang requested a review from a team as a code owner September 30, 2025 23:29
@github-actions github-actions bot added the fix label Sep 30, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 30, 2025

Walkthrough

Tests in metrics.rs adjust expected NATS TOTAL_REQUESTS range. Etcd key-value store get/delete now construct keys via a shared make_key helper; delete adds tracing and uses the constructed key for kv_delete.

Changes

Cohort / File(s) Summary
Metrics test expectation update
lib/runtime/src/metrics.rs
Adjusts NATS TOTAL_REQUESTS expected range from exactly 0.0 to up to 10.0; updates intended semantics noting NATS stats may differ from work handler counts.
Etcd key construction and logging
lib/runtime/src/storage/key_value_store/etcd.rs
Replaces manual key formatting with make_key(...) in get and delete; adds tracing log on delete; updates kv_delete to use the constructed key.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Caller
  participant S as EtcdKeyValueStore
  participant H as make_key(...)
  participant E as Etcd Client

  C->>S: delete(bucket, key)
  S->>H: build namespaced key
  H-->>S: constructed_key
  S-->>S: trace!("deleting", constructed_key)
  S->>E: kv_delete(constructed_key)
  E-->>S: result
  S-->>C: result
Loading

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

I twitch my whiskers at keys made right,
A tidy trail through Etcd’s night.
Metrics hop from zero’s pen,
To ten small carrots—count again!
Logs go thump, requests take flight—
Ship it, nibble, squeak delight! 🥕

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description follows the Overview, Details, and “Where should the reviewer start?” headings defined in the template but omits the required “Related Issues” section, leaving out the action keyword and issue reference that the repository’s description standard mandates. Please add a “Related Issues” section that uses one of the action keywords (Closes, Fixes, Resolves, or Relates to) and includes the corresponding issue number(s) to align with the repository’s pull request description template.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly highlights the primary changes—correcting the Etcd bucket delete prefix and adjusting the metrics test range—and clearly relates to the main changes in the pull request.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f95ba89 and aa6e28d.

📒 Files selected for processing (2)
  • lib/runtime/src/metrics.rs (1 hunks)
  • lib/runtime/src/storage/key_value_store/etcd.rs (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: trtllm
  • GitHub Check: vllm
  • GitHub Check: sglang
  • GitHub Check: clippy (launch/dynamo-run)
  • GitHub Check: tests (launch/dynamo-run)
  • GitHub Check: clippy (lib/bindings/python)
  • GitHub Check: clippy (.)
  • GitHub Check: clippy (lib/runtime/examples)
  • GitHub Check: tests (.)
  • GitHub Check: tests (lib/runtime/examples)
  • GitHub Check: tests (lib/bindings/python)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (3)
lib/runtime/src/metrics.rs (1)

1587-1589: LGTM! Test range adjustment addresses non-deterministic metrics.

The change from requiring exactly 0.0 to accepting 0.0–10.0 for TOTAL_REQUESTS is appropriate. The updated comment correctly notes that NATS service stats may differ from work handler counts due to timing and internal NATS behavior.

lib/runtime/src/storage/key_value_store/etcd.rs (2)

73-73: Good consistency improvement.

Aligning get() to use make_key() ensures all key operations consistently include the bucket prefix, matching the pattern already used in create(), update(), watch(), and entries().


239-241: Clean up orphaned test keys in etcd

Previous test runs created keys under the raw bucket prefix (test_concurrent_bucket/...) before slugification. Remove any stale entries with etcdctl:

# List keys under the old prefix
etcdctl get test_concurrent_bucket --prefix --keys-only
# Delete them
etcdctl del test_concurrent_bucket --prefix

This ensures no leftover test keys remain before running the updated integration tests.

- Fix EtcdBucket::delete() to include bucket prefix using make_key()
- Fix EtcdBucket::get() to use make_key() for consistency
- Fix metrics test nats_service::TOTAL_REQUESTS expected range from 0-0 to 0-10

The delete() method was missing the bucket prefix, causing silent failures
in test cleanup and accumulation of leftover keys in etcd. All bucket
methods now consistently use make_key() for proper key construction.

Signed-off-by: Keiven Chang <[email protected]>
@keivenchang keivenchang force-pushed the keivenchang/fix-integration-tests-and-nats-deregistration branch from aa6e28d to 3edd2f7 Compare October 1, 2025 03:40
@keivenchang keivenchang merged commit 0f63a05 into main Oct 1, 2025
25 of 26 checks passed
@keivenchang keivenchang deleted the keivenchang/fix-integration-tests-and-nats-deregistration branch October 1, 2025 18:57
nv-tusharma pushed a commit that referenced this pull request Oct 20, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants