Skip to content

Conversation

@yczhang-nv
Copy link
Contributor

@yczhang-nv yczhang-nv commented Oct 10, 2025

Description

This PR introduces a comprehensive load testing suite for MCP servers in NeMo Agent toolkit. The utility simulates concurrent users making tool calls to MCP servers and generates detailed performance reports including latency statistics, memory usage, and per-tool performance metrics.

By Submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features

    • Added MCP server load-testing utilities with CLI and Python APIs for running YAML-configured tests.
    • Supports concurrent user simulation, warmup, transport selection, weighted tool calls, and memory monitoring.
    • Produces detailed summaries: success rate, RPS, latency percentiles (min/mean/median/p95/p99), per-tool stats, error aggregation, and CSV/text reports saved to an output directory.
  • Documentation

    • Added comprehensive README and example YAML config with quick start, configuration schema, usage examples (CLI and Python), and report details.

Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
@yczhang-nv yczhang-nv self-assigned this Oct 10, 2025
@yczhang-nv yczhang-nv added the feature request New feature or request label Oct 10, 2025
@yczhang-nv yczhang-nv requested a review from a team as a code owner October 10, 2025 18:29
@yczhang-nv yczhang-nv added the non-breaking Non-breaking change label Oct 10, 2025
@coderabbitai
Copy link

coderabbitai bot commented Oct 10, 2025

Walkthrough

Adds a new MCP load-testing toolkit: YAML config loader, core MCPLoadTest runner with server lifecycle, client invocations, concurrency and memory sampling, report generation (CSV/text), a CLI entrypoint, public API exports, and documentation/example config. Outputs a structured summary and files to an output directory.

Changes

Cohort / File(s) Summary of changes
Docs
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md
New README describing requirements, configuration schema, usage (CLI/Python), example commands, and outputs (CSV/summary).
Public API
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
New module re-exporting MCPLoadTest, run_load_test, and run_load_test_from_yaml and defining __all__.
CLI
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
New CLI entrypoint: parse_args, print_summary, main; logging, argument parsing, runs YAML-configured tests and prints formatted summary.
Config loader
.../load_test_utils/config_loader.py
New YAML loader and validator: load_config_from_yaml builds LoadTestConfig and ToolCallConfig; validate_config enforces required fields, types, paths, ranges, and transport choices.
Example config
.../load_test_utils/configs/config.yml
Adds example YAML with server settings, load_test params, output dir, and weighted tool_calls.
Core load tester
.../load_test_utils/load_tester.py
New load-testing implementation: dataclasses (ToolCallConfig, LoadTestConfig, ToolCallResult, MemorySample), MCPLoadTest with server start/stop, transport-aware client context, weighted tool selection, per-user simulation, memory monitoring, result collection, report saving; plus run_load_test and run_load_test_from_yaml helpers.
Reporting
.../load_test_utils/report_generator.py
New generate_summary_report computing overall counts, latency stats (min/max/mean/median/p95/p99/stdev), per-tool stats, error aggregation, and memory statistics; returns nested summary dict.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant CLI as CLI (cli.py)
  participant Loader as Config Loader
  participant Tester as MCPLoadTest
  participant Server as MCP Server
  participant Client as MCP Client
  participant Reporter as Report Generator

  User->>CLI: mcp-load-test --config config.yml
  CLI->>Loader: load_config_from_yaml(config.yml)
  Loader-->>CLI: LoadTestConfig
  CLI->>Tester: run_load_test_from_yaml(...)
  Tester->>Server: start server (host, port, transport)
  Tester->>Client: create client session (transport)
  par Warmup
    Tester->>Client: invoke weighted tool calls
    Client-->>Tester: ToolCallResult
  end
  par Main test window
    loop concurrent users
      Tester->>Client: tool call
      Client-->>Tester: ToolCallResult
    end
    par Memory monitor
      Tester->>Tester: sample memory (rss/vms/percent)
    end
  end
  Tester->>Server: stop server
  Tester->>Reporter: generate_summary_report(results, mem_samples)
  Reporter-->>Tester: summary dict
  Tester->>Tester: write CSV & summary.txt
  Tester-->>CLI: summary dict
  CLI-->>User: print summary
Loading
sequenceDiagram
  autonumber
  participant CLI
  participant FS as Filesystem
  participant Loader
  CLI->>FS: read config.yml
  alt config exists & valid
    CLI->>Loader: parse + validate
    Loader-->>CLI: LoadTestConfig
  else missing/invalid
    Loader-->>CLI: raise error
    CLI-->>User: log/exit
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title “Implement load test scripts for MCP server” uses the imperative mood, concisely and accurately describes the main change of adding load testing utilities for the MCP server, and falls well within the 72-character limit.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2797ead and 417990f.

📒 Files selected for processing (1)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

🧹 Nitpick comments (3)
src/nat/front_ends/mcp/load_test_utils/cli.py (1)

41-43: Add explicit return type hints for CLI functions.

Conform to project typing standards.

Apply this diff:

-def parse_args():
+def parse_args() -> argparse.Namespace:
@@
-def list_configs():
+def list_configs() -> None:
@@
-def print_summary(results: dict):
+from typing import Any
+
+
+def print_summary(results: dict[str, Any]) -> None:
@@
-def main():
+def main() -> int:

Also applies to: 93-101, 114-121, 180-186

src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)

81-83: Harden YAML loading (encoding + YAML error handling).

Specify UTF‑8 and wrap YAML parse errors with a clear message.

Apply this diff:

-    with open(config_path) as f:
-        config_data = yaml.safe_load(f)
+    try:
+        with open(config_path, encoding="utf-8") as f:
+            config_data = yaml.safe_load(f)
+    except yaml.YAMLError as e:
+        logger.error("Failed to parse YAML config: %s", e)
+        raise ValueError(f"Invalid YAML in config file: {config_path}") from e

As per coding guidelines

src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)

392-433: Optimize nearest memory-sample lookup to avoid O(n^2).

Use bisect on sorted timestamps; overall O(n log n).

Apply this diff:

         with open(file_path, "w", newline="") as f:
             writer = csv.writer(f)
@@
-            for result in self.results:
-                # Find the closest memory sample for this result
-                memory_rss = ""
-                memory_vms = ""
-                memory_percent = ""
-
-                if self.memory_samples:
-                    # Find the closest memory sample by timestamp
-                    closest_sample = min(
-                        self.memory_samples,
-                        key=lambda sample: abs(sample.timestamp - result.timestamp),
-                    )
-                    memory_rss = f"{closest_sample.rss_mb:.2f}"
-                    memory_vms = f"{closest_sample.vms_mb:.2f}"
-                    memory_percent = f"{closest_sample.percent:.2f}"
+            # Pre-sort memory samples once and build timestamp array
+            samples_sorted = sorted(self.memory_samples, key=lambda s: s.timestamp) if self.memory_samples else []
+            sample_times = [s.timestamp for s in samples_sorted]
+
+            import bisect
+
+            for result in self.results:
+                memory_rss = ""
+                memory_vms = ""
+                memory_percent = ""
+
+                if samples_sorted:
+                    pos = bisect.bisect_left(sample_times, result.timestamp)
+                    candidates = []
+                    if pos > 0:
+                        candidates.append(samples_sorted[pos - 1])
+                    if pos < len(samples_sorted):
+                        candidates.append(samples_sorted[pos])
+                    closest_sample = min(candidates, key=lambda s: abs(s.timestamp - result.timestamp))
+                    memory_rss = f"{closest_sample.rss_mb:.2f}"
+                    memory_vms = f"{closest_sample.vms_mb:.2f}"
+                    memory_percent = f"{closest_sample.percent:.2f}"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8c2fd89 and 6ab7baa.

📒 Files selected for processing (7)
  • src/nat/front_ends/mcp/load_test_utils/README.md (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/__init__.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/cli.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/README.md
  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/README.md
  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/front_ends/mcp/load_test_utils/__init__.py
  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/README.md
  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml
  • src/nat/front_ends/mcp/load_test_utils/report_generator.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/README.@(md|ipynb)

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings

Files:

  • src/nat/front_ends/mcp/load_test_utils/README.md
**/*.{yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

In workflow/config YAML, set llms.._type: nat_test_llm to stub responses.

Files:

  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml
**/configs/**

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Configuration files consumed by code must be stored next to that code in a configs/ folder

Files:

  • src/nat/front_ends/mcp/load_test_utils/configs/config.yml
🧬 Code graph analysis (5)
src/nat/front_ends/mcp/load_test_utils/__init__.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
  • MCPLoadTest (106-502)
  • run_load_test (505-575)
  • run_load_test_from_yaml (578-601)
src/nat/front_ends/mcp/load_test_utils/cli.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
  • run_load_test_from_yaml (578-601)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
  • TextContent (101-105)
src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
  • generate_summary_report (27-171)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
  • load_config_from_yaml (28-135)
src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
  • LoadTestConfig (53-81)
  • MemorySample (97-103)
  • ToolCallResult (85-93)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
  • LoadTestConfig (53-81)
  • ToolCallConfig (39-49)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/cli.py

222-222: Consider moving this statement to an else block

(TRY300)


228-228: Do not catch blind exception: Exception

(BLE001)


229-229: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

src/nat/front_ends/mcp/load_test_utils/load_tester.py

152-152: subprocess call: check for execution of untrusted input

(S603)


168-168: Consider moving this statement to an else block

(TRY300)


171-171: Avoid specifying long messages outside the exception class

(TRY003)


189-189: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


242-242: Do not catch blind exception: Exception

(BLE001)


269-269: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


552-552: Avoid specifying long messages outside the exception class

(TRY003)

src/nat/front_ends/mcp/load_test_utils/config_loader.py

77-77: Avoid specifying long messages outside the exception class

(TRY003)


85-85: Avoid specifying long messages outside the exception class

(TRY003)


90-90: Avoid specifying long messages outside the exception class

(TRY003)


111-111: Avoid specifying long messages outside the exception class

(TRY003)


116-116: Avoid specifying long messages outside the exception class

(TRY003)


148-148: Avoid specifying long messages outside the exception class

(TRY003)


151-151: Avoid specifying long messages outside the exception class

(TRY003)


154-154: Avoid specifying long messages outside the exception class

(TRY003)


157-157: Avoid specifying long messages outside the exception class

(TRY003)


160-160: Avoid specifying long messages outside the exception class

(TRY003)


163-163: Avoid specifying long messages outside the exception class

(TRY003)


166-166: Avoid specifying long messages outside the exception class

(TRY003)


169-169: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: CI Pipeline / Check
🔇 Additional comments (2)
src/nat/front_ends/mcp/load_test_utils/__init__.py (1)

15-24: Public API exports look good.

Header, docstring, and re-exports are tidy and correct.

src/nat/front_ends/mcp/load_test_utils/configs/config.yml (1)

16-61: Sample config structure is consistent with the loader.

Values and comments align with config_loader expectations.

yczhang-nv and others added 2 commits October 10, 2025 11:49
Signed-off-by: Yuchen Zhang <[email protected]>
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

* **Documentation**
  * Clarified Model Context Protocol (MCP) Authentication docs with renamed titles, standardized terminology, and grammar improvements.
  * Expanded token storage guidance: clearer backend options (in-memory default, S3, MySQL, Redis, custom), encryption notes, persistence behavior, multi-user isolation, plus a new note on in-memory use for dev/testing.
  * Simplified headings and steps for initial auth, token retrieval/refresh, UI launch, and multi-user mode.
  * Improved security and troubleshooting wording.
  * Polished MCP client guide for readability and consistency.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah)

URL: NVIDIA#977

Signed-off-by: Yuchen Zhang <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
src/nat/front_ends/mcp/load_test_utils/cli.py (2)

195-199: Use logger.exception() instead of conditional exc_info.

When catching without re-raising, use logger.exception() to unconditionally capture the full stack trace.

As per coding guidelines

Apply this diff:

-    except Exception as e:
-        logger.error("Load test failed: %s", e, exc_info=args.verbose)
+    except Exception:
+        logger.exception("Load test failed")
         if not args.verbose:
             logger.info("Use --verbose flag for detailed error information")
         return 1

195-199: Use logger.exception() to always capture stack traces.

When catching without re-raising, use logger.exception() to ensure full trace is logged. Keep the conditional info message about --verbose.

Apply this diff:

-    except Exception as e:
-        logger.error("Load test failed: %s", e, exc_info=args.verbose)
+    except Exception:
+        logger.exception("Load test failed")
         if not args.verbose:
             logger.info("Use --verbose flag for detailed error information")
         return 1

As per coding guidelines

src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)

164-169: Subprocess PIPE can deadlock - pipes are never consumed.

The server process uses subprocess.PIPE for both stdout and stderr, but nothing consumes these pipes. If the server produces sufficient output, the pipes will fill and the server will block, potentially causing a deadlock.

Apply this diff to inherit parent stdio (or redirect to DEVNULL if you want to suppress output):

 self.server_process = subprocess.Popen(
     cmd,
-    stdout=subprocess.PIPE,
-    stderr=subprocess.PIPE,
+    stdout=None,  # inherit parent stdout
+    stderr=None,  # inherit parent stderr
     text=True,
 )

Alternatively, redirect to DEVNULL to suppress output:

 self.server_process = subprocess.Popen(
     cmd,
-    stdout=subprocess.PIPE,
-    stderr=subprocess.PIPE,
+    stdout=subprocess.DEVNULL,
+    stderr=subprocess.DEVNULL,
     text=True,
 )

164-169: Avoid PIPE for server stdio (deadlock risk).

Unconsumed PIPE buffers can fill and block the server process. Use subprocess.DEVNULL or inherit parent stdio.

Apply this diff:

 self.server_process = subprocess.Popen(
     cmd,
-    stdout=subprocess.PIPE,
-    stderr=subprocess.PIPE,
+    stdout=subprocess.DEVNULL,
+    stderr=subprocess.DEVNULL,
     text=True,
 )

Alternatively, set both to None to inherit parent stdio for debugging.

🧹 Nitpick comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)

146-162: Normalize config_file before subprocess invocation. Although validate_config ensures the file exists, call self.config.config_file = Path(self.config.config_file).resolve() (or similar) in _start_server to canonicalize the path prior to building the cmd list.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ce6c31 and a8fcaa9.

📒 Files selected for processing (3)
  • src/nat/front_ends/mcp/load_test_utils/README.md (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/cli.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • src/nat/front_ends/mcp/load_test_utils/README.md
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/front_ends/mcp/load_test_utils/cli.py
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
🧬 Code graph analysis (2)
src/nat/front_ends/mcp/load_test_utils/cli.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
  • run_load_test_from_yaml (590-613)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
  • TextContent (101-105)
src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
  • generate_summary_report (27-171)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (2)
  • load_config_from_yaml (28-135)
  • validate_config (138-171)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/cli.py

189-189: Consider moving this statement to an else block

(TRY300)


195-195: Do not catch blind exception: Exception

(BLE001)


196-196: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

src/nat/front_ends/mcp/load_test_utils/load_tester.py

164-164: subprocess call: check for execution of untrusted input

(S603)


180-180: Consider moving this statement to an else block

(TRY300)


183-183: Avoid specifying long messages outside the exception class

(TRY003)


201-201: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


254-254: Do not catch blind exception: Exception

(BLE001)


281-281: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


564-564: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (19)
src/nat/front_ends/mcp/load_test_utils/cli.py (4)

69-75: Previous review comment about --list-configs appears outdated.

The previous review suggested making --config_file optional to support --list-configs, but no such option exists in this CLI implementation. The current design correctly requires --config_file as the only configuration method.


69-75: Past comment about --list-configs is no longer applicable.

The previous review flagged that --config_file being required would break a --list-configs option, but no such option exists in the current implementation. The code correctly requires --config_file for its intended purpose.


87-151: LGTM - summary formatting is clear and comprehensive.

The function correctly handles all summary sections including optional memory and per-tool statistics, with appropriate formatting and error handling.


87-151: LGTM!

The summary formatting is well-structured, handles optional sections gracefully, and provides clear, actionable output for users.

src/nat/front_ends/mcp/load_test_utils/load_tester.py (15)

134-144: LGTM! Transport-specific client selection correctly implemented.

The _client_ctx helper properly selects between streamablehttp_client and sse_client based on the configured transport type, addressing the previous review's concern about respecting the transport configuration.


254-264: Exception handling is appropriate for tool call isolation.

Catching all exceptions here ensures that failures in individual tool calls are captured as ToolCallResult objects with error details, allowing the load test to continue. This broad exception handling is acceptable for this use case.


198-210: Weighted random selection correctly implemented.

The weighted random selection using cumulative probabilities is mathematically correct and will properly respect the relative weights assigned to each tool call.


134-144: Well done - transport selection correctly implemented.

This method properly addresses the previous review concern by selecting the appropriate MCP client (streamablehttp_client vs sse_client) based on the configured transport type. The lazy import of sse_client is a good practice for optional dependencies.


517-613: LGTM - public API functions are well-designed.

Both run_load_test() and run_load_test_from_yaml() provide clear, well-documented entry points for load testing. The parameter validation, example usage, and integration with config loading are all correctly implemented.


240-240: No change needed: isError is valid
isError appears in the MCP Python SDK client and tests; this attribute name is correct.


134-144: LGTM! Transport-aware client selection implemented.

The _client_ctx() helper correctly selects the appropriate MCP client based on the configured transport type, addressing the earlier concern about SSE support.


186-196: LGTM! Graceful shutdown with kill fallback.

The server termination logic properly attempts graceful shutdown before forceful kill.


198-210: LGTM! Weighted tool selection is correct.

The weighted random selection logic correctly distributes tool calls according to configured weights. Using random.uniform() is appropriate for load testing (crypto-secure random not needed).


212-264: LGTM! Tool invocation with proper error handling.

The method correctly:

  • Uses transport-aware client via _client_ctx()
  • Measures latency for both success and failure cases
  • Records all errors in ToolCallResult for analysis

Catching broad Exception is appropriate here to capture all tool execution failures.


266-283: LGTM! User simulation logic is sound.

The simulation correctly executes weighted tool calls with realistic delays until the test duration expires.


285-315: LGTM! Memory monitoring implementation is correct.

The periodic memory sampling with proper error handling provides valuable resource usage insights during load tests.


317-378: LGTM! Test orchestration is well-structured.

The run method properly:

  • Sequences warmup and actual test phases
  • Runs concurrent user simulations and memory monitoring in parallel
  • Ensures server cleanup via finally block
  • Generates comprehensive reports

380-514: LGTM! Report generation is comprehensive.

The report persistence methods correctly save:

  • Detailed CSV with per-request metrics and closest memory samples
  • Human-readable summary with configuration, metrics, and errors

Both formats provide valuable insights for load test analysis.


517-613: LGTM! Public API entry points are well-designed.

Both entry points provide clean interfaces:

  • run_load_test(): Direct programmatic usage with parameters
  • run_load_test_from_yaml(): YAML-driven configuration for CLI and automation

The example docstrings clearly demonstrate usage patterns.

Signed-off-by: Yuchen Zhang <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)

164-169: Address the unresolved subprocess.PIPE deadlock risk.

The previous review correctly identified that using subprocess.PIPE for both stdout and stderr can cause the long-running server process to block when pipe buffers fill.

The recommended solutions from the previous review remain valid:

Solution 1 (simplest): Inherit parent stdio

 self.server_process = subprocess.Popen(
     cmd,
-    stdout=subprocess.PIPE,
-    stderr=subprocess.PIPE,
+    stdout=None,
+    stderr=None,
     text=True,
 )

Solution 2: Redirect to log files in output_dir

log_file = self.output_dir / "server.log"
with open(log_file, "w") as f:
    self.server_process = subprocess.Popen(
        cmd,
        stdout=f,
        stderr=subprocess.STDOUT,
        text=True,
    )

Solution 3: Redirect to DEVNULL

 self.server_process = subprocess.Popen(
     cmd,
-    stdout=subprocess.PIPE,
-    stderr=subprocess.PIPE,
+    stdout=subprocess.DEVNULL,
+    stderr=subprocess.DEVNULL,
     text=True,
 )

Redirecting to log files (Solution 2) would be most helpful for debugging test failures.

🧹 Nitpick comments (1)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)

81-82: Consider explicit YAML parsing error handling.

While yaml.safe_load() is secure, it can raise yaml.YAMLError for malformed YAML. Consider wrapping this in a try-except to provide a clearer error message to users.

Apply this diff:

-    with open(config_path) as f:
-        config_data = yaml.safe_load(f)
+    try:
+        with open(config_path) as f:
+            config_data = yaml.safe_load(f)
+    except yaml.YAMLError as e:
+        raise ValueError(f"Invalid YAML in config file: {e}") from e
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8fcaa9 and 8b5c860.

📒 Files selected for processing (2)
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
🧬 Code graph analysis (2)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
  • TextContent (101-105)
src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
  • generate_summary_report (27-171)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (2)
  • load_config_from_yaml (28-135)
  • validate_config (138-171)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
  • LoadTestConfig (53-81)
  • ToolCallConfig (39-49)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/load_tester.py

164-164: subprocess call: check for execution of untrusted input

(S603)


180-180: Consider moving this statement to an else block

(TRY300)


183-183: Avoid specifying long messages outside the exception class

(TRY003)


201-201: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


254-254: Do not catch blind exception: Exception

(BLE001)


281-281: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


564-564: Avoid specifying long messages outside the exception class

(TRY003)

src/nat/front_ends/mcp/load_test_utils/config_loader.py

77-77: Avoid specifying long messages outside the exception class

(TRY003)


85-85: Avoid specifying long messages outside the exception class

(TRY003)


90-90: Avoid specifying long messages outside the exception class

(TRY003)


111-111: Avoid specifying long messages outside the exception class

(TRY003)


116-116: Avoid specifying long messages outside the exception class

(TRY003)


148-148: Avoid specifying long messages outside the exception class

(TRY003)


151-151: Avoid specifying long messages outside the exception class

(TRY003)


154-154: Avoid specifying long messages outside the exception class

(TRY003)


157-157: Avoid specifying long messages outside the exception class

(TRY003)


160-160: Avoid specifying long messages outside the exception class

(TRY003)


163-163: Avoid specifying long messages outside the exception class

(TRY003)


166-166: Avoid specifying long messages outside the exception class

(TRY003)


169-169: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)

134-144: LGTM! Correctly addresses transport selection.

The lazy import of sse_client and conditional client selection properly supports both streamable-http and sse transports as indicated in the past review.

Signed-off-by: Yuchen Zhang <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)

164-169: Remove unused PIPEs to avoid deadlocks.

nat mcp serve is launched with both stdout and stderr piped, but nothing ever consumes those streams. As soon as the server emits enough output to fill the OS pipe buffers, the child blocks on write and the entire load test hangs. Please inherit stdio, redirect to files, or otherwise drain the pipes before running this at scale.

         self.server_process = subprocess.Popen(
             cmd,
-            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
+            stdout=None,
+            stderr=None,
             text=True,
         )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b5c860 and 0b2b653.

📒 Files selected for processing (2)
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py (1 hunks)
  • src/nat/front_ends/mcp/load_test_utils/load_tester.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All importable Python code must live under src/ (or packages//src/)

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/nat/**/*

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Changes in src/nat should prioritize backward compatibility

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py

⚙️ CodeRabbit configuration file

This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • src/nat/front_ends/mcp/load_test_utils/load_tester.py
  • src/nat/front_ends/mcp/load_test_utils/config_loader.py
🧬 Code graph analysis (2)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
  • TextContent (101-105)
src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
  • generate_summary_report (27-171)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (2)
  • load_config_from_yaml (28-102)
  • validate_config (105-138)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
  • LoadTestConfig (53-81)
  • ToolCallConfig (39-49)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/load_tester.py

164-164: subprocess call: check for execution of untrusted input

(S603)


180-180: Consider moving this statement to an else block

(TRY300)


183-183: Avoid specifying long messages outside the exception class

(TRY003)


201-201: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


254-254: Do not catch blind exception: Exception

(BLE001)


281-281: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


549-549: Avoid specifying long messages outside the exception class

(TRY003)

src/nat/front_ends/mcp/load_test_utils/config_loader.py

44-44: Avoid specifying long messages outside the exception class

(TRY003)


52-52: Avoid specifying long messages outside the exception class

(TRY003)


57-57: Avoid specifying long messages outside the exception class

(TRY003)


78-78: Avoid specifying long messages outside the exception class

(TRY003)


83-83: Avoid specifying long messages outside the exception class

(TRY003)


115-115: Avoid specifying long messages outside the exception class

(TRY003)


118-118: Avoid specifying long messages outside the exception class

(TRY003)


121-121: Avoid specifying long messages outside the exception class

(TRY003)


124-124: Avoid specifying long messages outside the exception class

(TRY003)


127-127: Avoid specifying long messages outside the exception class

(TRY003)


130-130: Avoid specifying long messages outside the exception class

(TRY003)


133-133: Avoid specifying long messages outside the exception class

(TRY003)


136-136: Avoid specifying long messages outside the exception class

(TRY003)

Signed-off-by: Yuchen Zhang <[email protected]>
@yczhang-nv yczhang-nv changed the title Implement load test suite to MCP server Implement load test scripts for MCP server Oct 10, 2025
Signed-off-by: Yuchen Zhang <[email protected]>
@AnuradhaKaruppiah
Copy link
Contributor

Thx for adding this @yczhang-nv. I am not sure if this needs to be a part of the core package/CLI. We can keep it in the test package and evolve it from there and based on how things play put we can promote it to UI. Also for simulating concurrent users we can use the eval sub-system if possible.

Signed-off-by: Yuchen Zhang <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (4)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1)

17-57: Validate YAML root is a mapping before .get.

yaml.safe_load can return a list/scalar, so the subsequent .get calls will raise an AttributeError instead of a useful ValueError. Please assert the loaded object is a mapping first.

+from collections.abc import Mapping
@@
-    if not config_data:
-        raise ValueError("Config file is empty")
+    if not config_data:
+        raise ValueError("Config file is empty")
+
+    if not isinstance(config_data, Mapping):
+        raise ValueError("Config file must contain a top-level mapping")
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (1)

149-168: Avoid piping server stdio without readers.

stdout=subprocess.PIPE and stderr=subprocess.PIPE will block the MCP server once the buffers fill because nothing consumes those pipes. Let the child inherit stdio (or redirect explicitly) so long-running tests can’t deadlock.

-        self.server_process = subprocess.Popen(
-            cmd,
-            stdout=subprocess.PIPE,
-            stderr=subprocess.PIPE,
-            text=True,
-        )
+        self.server_process = subprocess.Popen(
+            cmd,
+            stdout=None,
+            stderr=None,
+            text=True,
+        )
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py (1)

157-161: Always log full tracebacks on failure.

Per the project logging rules, when catching without re-raising we must call logger.exception() so the stack trace is preserved regardless of verbosity flags.

-    except Exception as e:
-        logger.error("Load test failed: %s", e, exc_info=args.verbose)
+    except Exception:
+        logger.exception("Load test failed")
         if not args.verbose:
             logger.info("Use --verbose flag for detailed error information")
         return 1

As per coding guidelines

packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (1)

44-52: Inconsistent return structure for empty results still present.

This issue was previously flagged: when results is empty, the function returns a flat dictionary, but when results exist, it returns a nested structure with keys like test_configuration, summary, latency_statistics, etc. Callers expecting the nested structure will encounter errors.

Apply the diff from the previous review to return a consistent nested structure:

     if not results:
         return {
-            "total_requests": 0,
-            "successful_requests": 0,
-            "failed_requests": 0,
-            "success_rate": 0.0,
-            "test_duration_seconds": test_duration,
-            "requests_per_second": 0.0,
+            "test_configuration": {
+                "config_file": config.config_file,
+                "num_concurrent_users": config.num_concurrent_users,
+                "duration_seconds": config.duration_seconds,
+                "server_url": f"http://{config.server_host}:{config.server_port}",
+                "transport": config.transport,
+                "warmup_seconds": config.warmup_seconds,
+                "tool_calls_configured": len(config.tool_calls),
+                "output_dir": config.output_dir or "load_test_results",
+            },
+            "summary": {
+                "total_requests": 0,
+                "successful_requests": 0,
+                "failed_requests": 0,
+                "success_rate": 0.0,
+                "test_duration_seconds": round(test_duration, 2),
+                "requests_per_second": 0.0,
+                "avg_concurrent_rps": 0.0,
+            },
+            "latency_statistics": {
+                "min_ms": 0.0,
+                "max_ms": 0.0,
+                "mean_ms": 0.0,
+                "median_ms": 0.0,
+                "p95_ms": 0.0,
+                "p99_ms": 0.0,
+                "stdev_ms": 0.0,
+            },
+            "per_tool_statistics": {},
+            "errors": {},
         }
🧹 Nitpick comments (2)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (2)

115-125: Redundant check for rss_values.

The check if rss_values: on line 115 is redundant because rss_values is derived directly from memory_samples on line 111, and the outer check on line 110 already ensures memory_samples is truthy.

Consider removing the inner check:

         if rss_values:
-            memory_stats = {
+        memory_stats = {
                 "samples_count": len(memory_samples),
                 "rss_mean_mb": statistics.mean(rss_values),
                 "rss_max_mb": max(rss_values),
                 "rss_min_mb": min(rss_values),
                 "vms_mean_mb": statistics.mean(vms_values),
                 "vms_max_mb": max(vms_values),
                 "percent_mean": statistics.mean(percent_values),
                 "percent_max": max(percent_values),
-            }
+        }

169-185: Consider using statistics.quantiles for percentile calculations.

The custom percentile implementation uses a simplified nearest-rank method that rounds down (int(len(sorted_data) * percentile)), which can be slightly inaccurate. For example, with 100 elements at p95, it returns the 96th element rather than the 95th.

Python's statistics.quantiles (available since 3.8) provides more accurate percentile calculations:

 def _percentile(data: list[float], percentile: float) -> float:
-    """Calculate percentile value.
+    """Calculate percentile value using quantiles.
 
     Args:
         data: List of values
         percentile: Percentile to calculate
 
     Returns:
         Percentile value
     """
     if not data:
         return 0.0
 
-    sorted_data = sorted(data)
-    index = int(len(sorted_data) * percentile)
-    index = min(index, len(sorted_data) - 1)
-    return sorted_data[index]
+    # statistics.quantiles returns n-1 cut points, so for percentiles we use n=100
+    # and index by the percentile value
+    quantile_index = int(percentile * 100) - 1
+    quantiles = statistics.quantiles(data, n=100, method='inclusive')
+    return quantiles[quantile_index] if quantile_index < len(quantiles) else quantiles[-1]

Alternatively, for a simpler fix that maintains the current approach:

-    index = int(len(sorted_data) * percentile)
+    import math
+    index = int(math.ceil(len(sorted_data) * percentile)) - 1
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 186ffbc and 2797ead.

📒 Files selected for processing (7)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md (1 hunks)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py (1 hunks)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py (1 hunks)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1 hunks)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml (1 hunks)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (1 hunks)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{py,yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).

**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
packages/*/src/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Importable Python code inside packages must live under packages//src/

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
{src/**/*.py,packages/*/src/**/*.py}

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
**/*

⚙️ CodeRabbit configuration file

**/*: # Code Review Instructions

  • Ensure the code follows best practices and coding standards. - For Python code, follow
    PEP 20 and
    PEP 8 for style guidelines.
  • Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
    Example:
    def my_function(param1: int, param2: str) -> bool:
        pass
  • For Python exception handling, ensure proper stack trace preservation:
    • When re-raising exceptions: use bare raise statements to maintain the original stack trace,
      and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
    • When catching and logging exceptions without re-raising: always use logger.exception()
      to capture the full stack trace information.

Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any

words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be
spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.

Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,

and should contain an Apache License 2.0 header comment at the top of each file.

  • Confirm that copyright years are up-to date whenever a file is changed.

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
packages/**/*

⚙️ CodeRabbit configuration file

packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain a pyproject.toml file. - The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting
with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a two
digit version (ex: ~=1.0).

  • Not all packages contain Python code, if they do they should also contain their own set of tests, in a
    tests/ directory at the same level as the pyproject.toml file.

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
**/*.{yaml,yml}

📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)

In workflow/config YAML, set llms.._type: nat_test_llm to stub responses.

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
**/configs/**

📄 CodeRabbit inference engine (.cursor/rules/general.mdc)

Configuration files consumed by code must be stored next to that code in a configs/ folder

Files:

  • packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
🧬 Code graph analysis (5)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py (1)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (3)
  • MCPLoadTest (106-490)
  • run_load_test (493-544)
  • run_load_test_from_yaml (547-563)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (3)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (3)
  • LoadTestConfig (53-81)
  • MemorySample (97-103)
  • ToolCallResult (85-93)
packages/nvidia_nat_mcp/src/nat/plugins/mcp/client_base.py (1)
  • transport (184-185)
src/nat/profiler/calc/calc_runner.py (1)
  • output_dir (200-201)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py (1)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (1)
  • run_load_test_from_yaml (547-563)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (2)
  • LoadTestConfig (53-81)
  • ToolCallConfig (39-49)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
  • TextContent (101-105)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (1)
  • generate_summary_report (27-166)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1)
  • load_config_from_yaml (28-97)
🪛 Ruff (0.13.3)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py

151-151: Consider moving this statement to an else block

(TRY300)


157-157: Do not catch blind exception: Exception

(BLE001)


158-158: Use logging.exception instead of logging.error

Replace with exception

(TRY400)

packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py

44-44: Avoid specifying long messages outside the exception class

(TRY003)


52-52: Avoid specifying long messages outside the exception class

(TRY003)


56-56: Avoid specifying long messages outside the exception class

(TRY003)


73-73: Avoid specifying long messages outside the exception class

(TRY003)


78-78: Avoid specifying long messages outside the exception class

(TRY003)


110-110: Avoid specifying long messages outside the exception class

(TRY003)


113-113: Avoid specifying long messages outside the exception class

(TRY003)


116-116: Avoid specifying long messages outside the exception class

(TRY003)


119-119: Avoid specifying long messages outside the exception class

(TRY003)


122-122: Avoid specifying long messages outside the exception class

(TRY003)


125-125: Avoid specifying long messages outside the exception class

(TRY003)


128-128: Avoid specifying long messages outside the exception class

(TRY003)


131-131: Avoid specifying long messages outside the exception class

(TRY003)

packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py

163-163: subprocess call: check for execution of untrusted input

(S603)


178-178: Consider moving this statement to an else block

(TRY300)


181-181: Avoid specifying long messages outside the exception class

(TRY003)


199-199: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


250-250: Do not catch blind exception: Exception

(BLE001)


276-276: Standard pseudo-random generators are not suitable for cryptographic purposes

(S311)


521-521: Avoid specifying long messages outside the exception class

(TRY003)


```bash
# Basic usage
python packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not install as a script command under nvidia-nat-test package ?

That way it's transparent/easy for any user to run.

psutil could also be added as a dependency for the the test package.

logger = logging.getLogger(__name__)


def load_config_from_yaml(config_path: str | Path) -> LoadTestConfig:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is susceptible to loading an invalid/malformed configuration and using defaults.

Instead, we should have concrete configuration BaseModel types and simply validate via Pydantic model instantiation.

)


def validate_config(config: LoadTestConfig) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these could be specified as constraints on Pydantic fields

@yczhang-nv yczhang-nv marked this pull request as draft October 16, 2025 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature request New feature or request non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants