- 
                Notifications
    You must be signed in to change notification settings 
- Fork 406
Implement load test scripts for MCP server #978
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Implement load test scripts for MCP server #978
Conversation
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
| WalkthroughAdds a new MCP load-testing toolkit: YAML config loader, core MCPLoadTest runner with server lifecycle, client invocations, concurrency and memory sampling, report generation (CSV/text), a CLI entrypoint, public API exports, and documentation/example config. Outputs a structured summary and files to an output directory. Changes
 Sequence Diagram(s)sequenceDiagram
  autonumber
  actor User
  participant CLI as CLI (cli.py)
  participant Loader as Config Loader
  participant Tester as MCPLoadTest
  participant Server as MCP Server
  participant Client as MCP Client
  participant Reporter as Report Generator
  User->>CLI: mcp-load-test --config config.yml
  CLI->>Loader: load_config_from_yaml(config.yml)
  Loader-->>CLI: LoadTestConfig
  CLI->>Tester: run_load_test_from_yaml(...)
  Tester->>Server: start server (host, port, transport)
  Tester->>Client: create client session (transport)
  par Warmup
    Tester->>Client: invoke weighted tool calls
    Client-->>Tester: ToolCallResult
  end
  par Main test window
    loop concurrent users
      Tester->>Client: tool call
      Client-->>Tester: ToolCallResult
    end
    par Memory monitor
      Tester->>Tester: sample memory (rss/vms/percent)
    end
  end
  Tester->>Server: stop server
  Tester->>Reporter: generate_summary_report(results, mem_samples)
  Reporter-->>Tester: summary dict
  Tester->>Tester: write CSV & summary.txt
  Tester-->>CLI: summary dict
  CLI-->>User: print summary
sequenceDiagram
  autonumber
  participant CLI
  participant FS as Filesystem
  participant Loader
  CLI->>FS: read config.yml
  alt config exists & valid
    CLI->>Loader: parse + validate
    Loader-->>CLI: LoadTestConfig
  else missing/invalid
    Loader-->>CLI: raise error
    CLI-->>User: log/exit
  end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
 ✨ Finishing touches🧪 Generate unit tests (beta)
 📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
 🚧 Files skipped from review as they are similar to previous changes (1)
 ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
🧹 Nitpick comments (3)
src/nat/front_ends/mcp/load_test_utils/cli.py (1)
41-43: Add explicit return type hints for CLI functions.Conform to project typing standards.
Apply this diff:
-def parse_args(): +def parse_args() -> argparse.Namespace: @@ -def list_configs(): +def list_configs() -> None: @@ -def print_summary(results: dict): +from typing import Any + + +def print_summary(results: dict[str, Any]) -> None: @@ -def main(): +def main() -> int:Also applies to: 93-101, 114-121, 180-186
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
81-83: Harden YAML loading (encoding + YAML error handling).Specify UTF‑8 and wrap YAML parse errors with a clear message.
Apply this diff:
- with open(config_path) as f: - config_data = yaml.safe_load(f) + try: + with open(config_path, encoding="utf-8") as f: + config_data = yaml.safe_load(f) + except yaml.YAMLError as e: + logger.error("Failed to parse YAML config: %s", e) + raise ValueError(f"Invalid YAML in config file: {config_path}") from eAs per coding guidelines
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
392-433: Optimize nearest memory-sample lookup to avoid O(n^2).Use bisect on sorted timestamps; overall O(n log n).
Apply this diff:
with open(file_path, "w", newline="") as f: writer = csv.writer(f) @@ - for result in self.results: - # Find the closest memory sample for this result - memory_rss = "" - memory_vms = "" - memory_percent = "" - - if self.memory_samples: - # Find the closest memory sample by timestamp - closest_sample = min( - self.memory_samples, - key=lambda sample: abs(sample.timestamp - result.timestamp), - ) - memory_rss = f"{closest_sample.rss_mb:.2f}" - memory_vms = f"{closest_sample.vms_mb:.2f}" - memory_percent = f"{closest_sample.percent:.2f}" + # Pre-sort memory samples once and build timestamp array + samples_sorted = sorted(self.memory_samples, key=lambda s: s.timestamp) if self.memory_samples else [] + sample_times = [s.timestamp for s in samples_sorted] + + import bisect + + for result in self.results: + memory_rss = "" + memory_vms = "" + memory_percent = "" + + if samples_sorted: + pos = bisect.bisect_left(sample_times, result.timestamp) + candidates = [] + if pos > 0: + candidates.append(samples_sorted[pos - 1]) + if pos < len(samples_sorted): + candidates.append(samples_sorted[pos]) + closest_sample = min(candidates, key=lambda s: abs(s.timestamp - result.timestamp)) + memory_rss = f"{closest_sample.rss_mb:.2f}" + memory_vms = f"{closest_sample.vms_mb:.2f}" + memory_percent = f"{closest_sample.percent:.2f}"
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- src/nat/front_ends/mcp/load_test_utils/README.md(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/__init__.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/cli.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/config_loader.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/load_tester.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/report_generator.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All importable Python code must live under src/ (or packages//src/)
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Changes in src/nat should prioritize backward compatibility
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/README.md
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/README.md
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
{src/**/*.py,packages/*/src/**/*.py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
- src/nat/front_ends/mcp/load_test_utils/__init__.py
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/README.md
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml
- src/nat/front_ends/mcp/load_test_utils/report_generator.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/README.@(md|ipynb)
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings
Files:
- src/nat/front_ends/mcp/load_test_utils/README.md
**/*.{yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
In workflow/config YAML, set llms.._type: nat_test_llm to stub responses.
Files:
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml
**/configs/**
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Configuration files consumed by code must be stored next to that code in a configs/ folder
Files:
- src/nat/front_ends/mcp/load_test_utils/configs/config.yml
🧬 Code graph analysis (5)
src/nat/front_ends/mcp/load_test_utils/__init__.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
MCPLoadTest(106-502)
run_load_test(505-575)
run_load_test_from_yaml(578-601)
src/nat/front_ends/mcp/load_test_utils/cli.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
run_load_test_from_yaml(578-601)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
TextContent(101-105)src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
generate_summary_report(27-171)src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
load_config_from_yaml(28-135)
src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
LoadTestConfig(53-81)
MemorySample(97-103)
ToolCallResult(85-93)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
LoadTestConfig(53-81)
ToolCallConfig(39-49)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/cli.py
222-222: Consider moving this statement to an else block
(TRY300)
228-228: Do not catch blind exception: Exception
(BLE001)
229-229: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
src/nat/front_ends/mcp/load_test_utils/load_tester.py
152-152: subprocess call: check for execution of untrusted input
(S603)
168-168: Consider moving this statement to an else block
(TRY300)
171-171: Avoid specifying long messages outside the exception class
(TRY003)
189-189: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
242-242: Do not catch blind exception: Exception
(BLE001)
269-269: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
552-552: Avoid specifying long messages outside the exception class
(TRY003)
src/nat/front_ends/mcp/load_test_utils/config_loader.py
77-77: Avoid specifying long messages outside the exception class
(TRY003)
85-85: Avoid specifying long messages outside the exception class
(TRY003)
90-90: Avoid specifying long messages outside the exception class
(TRY003)
111-111: Avoid specifying long messages outside the exception class
(TRY003)
116-116: Avoid specifying long messages outside the exception class
(TRY003)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
151-151: Avoid specifying long messages outside the exception class
(TRY003)
154-154: Avoid specifying long messages outside the exception class
(TRY003)
157-157: Avoid specifying long messages outside the exception class
(TRY003)
160-160: Avoid specifying long messages outside the exception class
(TRY003)
163-163: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
169-169: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CI Pipeline / Check
🔇 Additional comments (2)
src/nat/front_ends/mcp/load_test_utils/__init__.py (1)
15-24: Public API exports look good.Header, docstring, and re-exports are tidy and correct.
src/nat/front_ends/mcp/load_test_utils/configs/config.yml (1)
16-61: Sample config structure is consistent with the loader.Values and comments align with config_loader expectations.
        
          
                packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
          
            Show resolved
            Hide resolved
        
              
          
                packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
          
            Show resolved
            Hide resolved
        
      Signed-off-by: Yuchen Zhang <[email protected]>
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md). - We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license. - Any contribution which contains commits that are not Signed-Off will not be accepted. - When the PR is ready for review, new or existing tests cover these changes. - When the PR is ready for review, the documentation is up to date with these changes. * **Documentation** * Clarified Model Context Protocol (MCP) Authentication docs with renamed titles, standardized terminology, and grammar improvements. * Expanded token storage guidance: clearer backend options (in-memory default, S3, MySQL, Redis, custom), encryption notes, persistence behavior, multi-user isolation, plus a new note on in-memory use for dev/testing. * Simplified headings and steps for initial auth, token retrieval/refresh, UI launch, and multi-user mode. * Improved security and troubleshooting wording. * Polished MCP client guide for readability and consistency. Authors: - David Gardner (https://github.com/dagardner-nv) Approvers: - Anuradha Karuppiah (https://github.com/AnuradhaKaruppiah) URL: NVIDIA#977 Signed-off-by: Yuchen Zhang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
src/nat/front_ends/mcp/load_test_utils/cli.py (2)
195-199: Uselogger.exception()instead of conditionalexc_info.When catching without re-raising, use
logger.exception()to unconditionally capture the full stack trace.As per coding guidelines
Apply this diff:
- except Exception as e: - logger.error("Load test failed: %s", e, exc_info=args.verbose) + except Exception: + logger.exception("Load test failed") if not args.verbose: logger.info("Use --verbose flag for detailed error information") return 1
195-199: Uselogger.exception()to always capture stack traces.When catching without re-raising, use
logger.exception()to ensure full trace is logged. Keep the conditional info message about--verbose.Apply this diff:
- except Exception as e: - logger.error("Load test failed: %s", e, exc_info=args.verbose) + except Exception: + logger.exception("Load test failed") if not args.verbose: logger.info("Use --verbose flag for detailed error information") return 1As per coding guidelines
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
164-169: Subprocess PIPE can deadlock - pipes are never consumed.The server process uses
subprocess.PIPEfor both stdout and stderr, but nothing consumes these pipes. If the server produces sufficient output, the pipes will fill and the server will block, potentially causing a deadlock.Apply this diff to inherit parent stdio (or redirect to DEVNULL if you want to suppress output):
self.server_process = subprocess.Popen( cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, + stdout=None, # inherit parent stdout + stderr=None, # inherit parent stderr text=True, )Alternatively, redirect to DEVNULL to suppress output:
self.server_process = subprocess.Popen( cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, text=True, )
164-169: Avoid PIPE for server stdio (deadlock risk).Unconsumed PIPE buffers can fill and block the server process. Use
subprocess.DEVNULLor inherit parent stdio.Apply this diff:
self.server_process = subprocess.Popen( cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, text=True, )Alternatively, set both to
Noneto inherit parent stdio for debugging.
🧹 Nitpick comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
146-162: Normalizeconfig_filebefore subprocess invocation. Althoughvalidate_configensures the file exists, callself.config.config_file = Path(self.config.config_file).resolve()(or similar) in_start_serverto canonicalize the path prior to building thecmdlist.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/nat/front_ends/mcp/load_test_utils/README.md(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/cli.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/load_tester.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- src/nat/front_ends/mcp/load_test_utils/README.md
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All importable Python code must live under src/ (or packages//src/)
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Changes in src/nat should prioritize backward compatibility
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
{src/**/*.py,packages/*/src/**/*.py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
- src/nat/front_ends/mcp/load_test_utils/cli.py
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
🧬 Code graph analysis (2)
src/nat/front_ends/mcp/load_test_utils/cli.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
run_load_test_from_yaml(590-613)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
TextContent(101-105)src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
generate_summary_report(27-171)src/nat/front_ends/mcp/load_test_utils/config_loader.py (2)
load_config_from_yaml(28-135)
validate_config(138-171)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/cli.py
189-189: Consider moving this statement to an else block
(TRY300)
195-195: Do not catch blind exception: Exception
(BLE001)
196-196: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
src/nat/front_ends/mcp/load_test_utils/load_tester.py
164-164: subprocess call: check for execution of untrusted input
(S603)
180-180: Consider moving this statement to an else block
(TRY300)
183-183: Avoid specifying long messages outside the exception class
(TRY003)
201-201: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
254-254: Do not catch blind exception: Exception
(BLE001)
281-281: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
564-564: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (19)
src/nat/front_ends/mcp/load_test_utils/cli.py (4)
69-75: Previous review comment about--list-configsappears outdated.The previous review suggested making
--config_fileoptional to support--list-configs, but no such option exists in this CLI implementation. The current design correctly requires--config_fileas the only configuration method.
69-75: Past comment about--list-configsis no longer applicable.The previous review flagged that
--config_filebeing required would break a--list-configsoption, but no such option exists in the current implementation. The code correctly requires--config_filefor its intended purpose.
87-151: LGTM - summary formatting is clear and comprehensive.The function correctly handles all summary sections including optional memory and per-tool statistics, with appropriate formatting and error handling.
87-151: LGTM!The summary formatting is well-structured, handles optional sections gracefully, and provides clear, actionable output for users.
src/nat/front_ends/mcp/load_test_utils/load_tester.py (15)
134-144: LGTM! Transport-specific client selection correctly implemented.The
_client_ctxhelper properly selects betweenstreamablehttp_clientandsse_clientbased on the configured transport type, addressing the previous review's concern about respecting the transport configuration.
254-264: Exception handling is appropriate for tool call isolation.Catching all exceptions here ensures that failures in individual tool calls are captured as
ToolCallResultobjects with error details, allowing the load test to continue. This broad exception handling is acceptable for this use case.
198-210: Weighted random selection correctly implemented.The weighted random selection using cumulative probabilities is mathematically correct and will properly respect the relative weights assigned to each tool call.
134-144: Well done - transport selection correctly implemented.This method properly addresses the previous review concern by selecting the appropriate MCP client (
streamablehttp_clientvssse_client) based on the configured transport type. The lazy import ofsse_clientis a good practice for optional dependencies.
517-613: LGTM - public API functions are well-designed.Both
run_load_test()andrun_load_test_from_yaml()provide clear, well-documented entry points for load testing. The parameter validation, example usage, and integration with config loading are all correctly implemented.
240-240: No change needed:isErroris valid
isErrorappears in the MCP Python SDK client and tests; this attribute name is correct.
134-144: LGTM! Transport-aware client selection implemented.The
_client_ctx()helper correctly selects the appropriate MCP client based on the configured transport type, addressing the earlier concern about SSE support.
186-196: LGTM! Graceful shutdown with kill fallback.The server termination logic properly attempts graceful shutdown before forceful kill.
198-210: LGTM! Weighted tool selection is correct.The weighted random selection logic correctly distributes tool calls according to configured weights. Using
random.uniform()is appropriate for load testing (crypto-secure random not needed).
212-264: LGTM! Tool invocation with proper error handling.The method correctly:
- Uses transport-aware client via
_client_ctx()- Measures latency for both success and failure cases
- Records all errors in
ToolCallResultfor analysisCatching broad
Exceptionis appropriate here to capture all tool execution failures.
266-283: LGTM! User simulation logic is sound.The simulation correctly executes weighted tool calls with realistic delays until the test duration expires.
285-315: LGTM! Memory monitoring implementation is correct.The periodic memory sampling with proper error handling provides valuable resource usage insights during load tests.
317-378: LGTM! Test orchestration is well-structured.The run method properly:
- Sequences warmup and actual test phases
- Runs concurrent user simulations and memory monitoring in parallel
- Ensures server cleanup via
finallyblock- Generates comprehensive reports
380-514: LGTM! Report generation is comprehensive.The report persistence methods correctly save:
- Detailed CSV with per-request metrics and closest memory samples
- Human-readable summary with configuration, metrics, and errors
Both formats provide valuable insights for load test analysis.
517-613: LGTM! Public API entry points are well-designed.Both entry points provide clean interfaces:
run_load_test(): Direct programmatic usage with parameters
run_load_test_from_yaml(): YAML-driven configuration for CLI and automationThe example docstrings clearly demonstrate usage patterns.
        
          
                packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
          
            Show resolved
            Hide resolved
        
      Signed-off-by: Yuchen Zhang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
164-169: Address the unresolved subprocess.PIPE deadlock risk.The previous review correctly identified that using
subprocess.PIPEfor bothstdoutandstderrcan cause the long-running server process to block when pipe buffers fill.The recommended solutions from the previous review remain valid:
Solution 1 (simplest): Inherit parent stdio
self.server_process = subprocess.Popen( cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, + stdout=None, + stderr=None, text=True, )Solution 2: Redirect to log files in output_dir
log_file = self.output_dir / "server.log" with open(log_file, "w") as f: self.server_process = subprocess.Popen( cmd, stdout=f, stderr=subprocess.STDOUT, text=True, )Solution 3: Redirect to DEVNULL
self.server_process = subprocess.Popen( cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, + stdout=subprocess.DEVNULL, + stderr=subprocess.DEVNULL, text=True, )Redirecting to log files (Solution 2) would be most helpful for debugging test failures.
🧹 Nitpick comments (1)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
81-82: Consider explicit YAML parsing error handling.While
yaml.safe_load()is secure, it can raiseyaml.YAMLErrorfor malformed YAML. Consider wrapping this in a try-except to provide a clearer error message to users.Apply this diff:
- with open(config_path) as f: - config_data = yaml.safe_load(f) + try: + with open(config_path) as f: + config_data = yaml.safe_load(f) + except yaml.YAMLError as e: + raise ValueError(f"Invalid YAML in config file: {e}") from e
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/nat/front_ends/mcp/load_test_utils/config_loader.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/load_tester.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All importable Python code must live under src/ (or packages//src/)
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Changes in src/nat should prioritize backward compatibility
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
{src/**/*.py,packages/*/src/**/*.py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
🧬 Code graph analysis (2)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
TextContent(101-105)src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
generate_summary_report(27-171)src/nat/front_ends/mcp/load_test_utils/config_loader.py (2)
load_config_from_yaml(28-135)
validate_config(138-171)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
LoadTestConfig(53-81)
ToolCallConfig(39-49)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/load_tester.py
164-164: subprocess call: check for execution of untrusted input
(S603)
180-180: Consider moving this statement to an else block
(TRY300)
183-183: Avoid specifying long messages outside the exception class
(TRY003)
201-201: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
254-254: Do not catch blind exception: Exception
(BLE001)
281-281: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
564-564: Avoid specifying long messages outside the exception class
(TRY003)
src/nat/front_ends/mcp/load_test_utils/config_loader.py
77-77: Avoid specifying long messages outside the exception class
(TRY003)
85-85: Avoid specifying long messages outside the exception class
(TRY003)
90-90: Avoid specifying long messages outside the exception class
(TRY003)
111-111: Avoid specifying long messages outside the exception class
(TRY003)
116-116: Avoid specifying long messages outside the exception class
(TRY003)
148-148: Avoid specifying long messages outside the exception class
(TRY003)
151-151: Avoid specifying long messages outside the exception class
(TRY003)
154-154: Avoid specifying long messages outside the exception class
(TRY003)
157-157: Avoid specifying long messages outside the exception class
(TRY003)
160-160: Avoid specifying long messages outside the exception class
(TRY003)
163-163: Avoid specifying long messages outside the exception class
(TRY003)
166-166: Avoid specifying long messages outside the exception class
(TRY003)
169-169: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
134-144: LGTM! Correctly addresses transport selection.The lazy import of
sse_clientand conditional client selection properly supports bothstreamable-httpandssetransports as indicated in the past review.
Signed-off-by: Yuchen Zhang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (1)
164-169: Remove unused PIPEs to avoid deadlocks.
nat mcp serveis launched with bothstdoutandstderrpiped, but nothing ever consumes those streams. As soon as the server emits enough output to fill the OS pipe buffers, the child blocks on write and the entire load test hangs. Please inherit stdio, redirect to files, or otherwise drain the pipes before running this at scale.self.server_process = subprocess.Popen( cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, + stdout=None, + stderr=None, text=True, )
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/nat/front_ends/mcp/load_test_utils/config_loader.py(1 hunks)
- src/nat/front_ends/mcp/load_test_utils/load_tester.py(1 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All importable Python code must live under src/ (or packages//src/)
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
src/nat/**/*
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Changes in src/nat should prioritize backward compatibility
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
⚙️ CodeRabbit configuration file
This directory contains the core functionality of the toolkit. Changes should prioritize backward compatibility.
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
{src/**/*.py,packages/*/src/**/*.py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
- src/nat/front_ends/mcp/load_test_utils/load_tester.py
- src/nat/front_ends/mcp/load_test_utils/config_loader.py
🧬 Code graph analysis (2)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
TextContent(101-105)src/nat/front_ends/mcp/load_test_utils/report_generator.py (1)
generate_summary_report(27-171)src/nat/front_ends/mcp/load_test_utils/config_loader.py (2)
load_config_from_yaml(28-102)
validate_config(105-138)
src/nat/front_ends/mcp/load_test_utils/config_loader.py (1)
src/nat/front_ends/mcp/load_test_utils/load_tester.py (2)
LoadTestConfig(53-81)
ToolCallConfig(39-49)
🪛 Ruff (0.13.3)
src/nat/front_ends/mcp/load_test_utils/load_tester.py
164-164: subprocess call: check for execution of untrusted input
(S603)
180-180: Consider moving this statement to an else block
(TRY300)
183-183: Avoid specifying long messages outside the exception class
(TRY003)
201-201: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
254-254: Do not catch blind exception: Exception
(BLE001)
281-281: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
549-549: Avoid specifying long messages outside the exception class
(TRY003)
src/nat/front_ends/mcp/load_test_utils/config_loader.py
44-44: Avoid specifying long messages outside the exception class
(TRY003)
52-52: Avoid specifying long messages outside the exception class
(TRY003)
57-57: Avoid specifying long messages outside the exception class
(TRY003)
78-78: Avoid specifying long messages outside the exception class
(TRY003)
83-83: Avoid specifying long messages outside the exception class
(TRY003)
115-115: Avoid specifying long messages outside the exception class
(TRY003)
118-118: Avoid specifying long messages outside the exception class
(TRY003)
121-121: Avoid specifying long messages outside the exception class
(TRY003)
124-124: Avoid specifying long messages outside the exception class
(TRY003)
127-127: Avoid specifying long messages outside the exception class
(TRY003)
130-130: Avoid specifying long messages outside the exception class
(TRY003)
133-133: Avoid specifying long messages outside the exception class
(TRY003)
136-136: Avoid specifying long messages outside the exception class
(TRY003)
        
          
                packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
          
            Show resolved
            Hide resolved
        
      Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
| Thx for adding this @yczhang-nv. I am not sure if this needs to be a part of the core package/CLI. We can keep it in the test package and evolve it from there and based on how things play put we can promote it to UI. Also for simulating concurrent users we can use the eval sub-system if possible. | 
Signed-off-by: Yuchen Zhang <[email protected]>
Signed-off-by: Yuchen Zhang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (4)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1)
17-57: Validate YAML root is a mapping before.get.
yaml.safe_loadcan return a list/scalar, so the subsequent.getcalls will raise anAttributeErrorinstead of a usefulValueError. Please assert the loaded object is a mapping first.+from collections.abc import Mapping @@ - if not config_data: - raise ValueError("Config file is empty") + if not config_data: + raise ValueError("Config file is empty") + + if not isinstance(config_data, Mapping): + raise ValueError("Config file must contain a top-level mapping")packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (1)
149-168: Avoid piping server stdio without readers.
stdout=subprocess.PIPEandstderr=subprocess.PIPEwill block the MCP server once the buffers fill because nothing consumes those pipes. Let the child inherit stdio (or redirect explicitly) so long-running tests can’t deadlock.- self.server_process = subprocess.Popen( - cmd, - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, - text=True, - ) + self.server_process = subprocess.Popen( + cmd, + stdout=None, + stderr=None, + text=True, + )packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py (1)
157-161: Always log full tracebacks on failure.Per the project logging rules, when catching without re-raising we must call
logger.exception()so the stack trace is preserved regardless of verbosity flags.- except Exception as e: - logger.error("Load test failed: %s", e, exc_info=args.verbose) + except Exception: + logger.exception("Load test failed") if not args.verbose: logger.info("Use --verbose flag for detailed error information") return 1As per coding guidelines
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (1)
44-52: Inconsistent return structure for empty results still present.This issue was previously flagged: when
resultsis empty, the function returns a flat dictionary, but when results exist, it returns a nested structure with keys liketest_configuration,summary,latency_statistics, etc. Callers expecting the nested structure will encounter errors.Apply the diff from the previous review to return a consistent nested structure:
if not results: return { - "total_requests": 0, - "successful_requests": 0, - "failed_requests": 0, - "success_rate": 0.0, - "test_duration_seconds": test_duration, - "requests_per_second": 0.0, + "test_configuration": { + "config_file": config.config_file, + "num_concurrent_users": config.num_concurrent_users, + "duration_seconds": config.duration_seconds, + "server_url": f"http://{config.server_host}:{config.server_port}", + "transport": config.transport, + "warmup_seconds": config.warmup_seconds, + "tool_calls_configured": len(config.tool_calls), + "output_dir": config.output_dir or "load_test_results", + }, + "summary": { + "total_requests": 0, + "successful_requests": 0, + "failed_requests": 0, + "success_rate": 0.0, + "test_duration_seconds": round(test_duration, 2), + "requests_per_second": 0.0, + "avg_concurrent_rps": 0.0, + }, + "latency_statistics": { + "min_ms": 0.0, + "max_ms": 0.0, + "mean_ms": 0.0, + "median_ms": 0.0, + "p95_ms": 0.0, + "p99_ms": 0.0, + "stdev_ms": 0.0, + }, + "per_tool_statistics": {}, + "errors": {}, }
🧹 Nitpick comments (2)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (2)
115-125: Redundant check forrss_values.The check
if rss_values:on line 115 is redundant becauserss_valuesis derived directly frommemory_sampleson line 111, and the outer check on line 110 already ensuresmemory_samplesis truthy.Consider removing the inner check:
if rss_values: - memory_stats = { + memory_stats = { "samples_count": len(memory_samples), "rss_mean_mb": statistics.mean(rss_values), "rss_max_mb": max(rss_values), "rss_min_mb": min(rss_values), "vms_mean_mb": statistics.mean(vms_values), "vms_max_mb": max(vms_values), "percent_mean": statistics.mean(percent_values), "percent_max": max(percent_values), - } + }
169-185: Consider usingstatistics.quantilesfor percentile calculations.The custom percentile implementation uses a simplified nearest-rank method that rounds down (
int(len(sorted_data) * percentile)), which can be slightly inaccurate. For example, with 100 elements at p95, it returns the 96th element rather than the 95th.Python's
statistics.quantiles(available since 3.8) provides more accurate percentile calculations:def _percentile(data: list[float], percentile: float) -> float: - """Calculate percentile value. + """Calculate percentile value using quantiles. Args: data: List of values percentile: Percentile to calculate Returns: Percentile value """ if not data: return 0.0 - sorted_data = sorted(data) - index = int(len(sorted_data) * percentile) - index = min(index, len(sorted_data) - 1) - return sorted_data[index] + # statistics.quantiles returns n-1 cut points, so for percentiles we use n=100 + # and index by the percentile value + quantile_index = int(percentile * 100) - 1 + quantiles = statistics.quantiles(data, n=100, method='inclusive') + return quantiles[quantile_index] if quantile_index < len(quantiles) else quantiles[-1]Alternatively, for a simpler fix that maintains the current approach:
- index = int(len(sorted_data) * percentile) + import math + index = int(math.ceil(len(sorted_data) * percentile)) - 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md(1 hunks)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py(1 hunks)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py(1 hunks)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py(1 hunks)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml(1 hunks)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py(1 hunks)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/README.md
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
packages/*/src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Importable Python code inside packages must live under packages//src/
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
{src/**/*.py,packages/*/src/**/*.py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards. - For Python code, follow
PEP 20 and
PEP 8 for style guidelines.- Check for security vulnerabilities and potential issues. - Python methods should use type hints for all parameters and return values.
Example:def my_function(param1: int, param2: str) -> bool: pass- For Python exception handling, ensure proper stack trace preservation:
- When re-raising exceptions: use bare
raisestatements to maintain the original stack trace,
and uselogger.error()(notlogger.exception()) to avoid duplicate stack trace output.- When catching and logging exceptions without re-raising: always use
logger.exception()
to capture the full stack trace information.Documentation Review Instructions - Verify that documentation and comments are clear and comprehensive. - Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum". - Verify that the documentation doesn't contain any offensive or outdated terms. - Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any
words listed in the
ci/vale/styles/config/vocabularies/nat/reject.txtfile, words that might appear to be
spelling mistakes but are listed in theci/vale/styles/config/vocabularies/nat/accept.txtfile are OK.Misc. - All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0,
and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
packages/**/*
⚙️ CodeRabbit configuration file
packages/**/*: - This directory contains optional plugin packages for the toolkit, each should contain apyproject.tomlfile. - Thepyproject.tomlfile should declare a dependency onnvidia-nator another package with a name starting
withnvidia-nat-. This dependency should be declared using~=<version>, and the version should be a two
digit version (ex:~=1.0).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a
tests/directory at the same level as thepyproject.tomlfile.
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
**/*.{yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
In workflow/config YAML, set llms.._type: nat_test_llm to stub responses.
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
**/configs/**
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Configuration files consumed by code must be stored next to that code in a configs/ folder
Files:
- packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/configs/config.yml
🧬 Code graph analysis (5)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/__init__.py (1)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (3)
MCPLoadTest(106-490)
run_load_test(493-544)
run_load_test_from_yaml(547-563)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (3)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (3)
LoadTestConfig(53-81)
MemorySample(97-103)
ToolCallResult(85-93)packages/nvidia_nat_mcp/src/nat/plugins/mcp/client_base.py (1)
transport(184-185)src/nat/profiler/calc/calc_runner.py (1)
output_dir(200-201)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py (1)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (1)
run_load_test_from_yaml(547-563)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (2)
LoadTestConfig(53-81)
ToolCallConfig(39-49)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py (3)
src/nat/data_models/api_server.py (1)
TextContent(101-105)packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/report_generator.py (1)
generate_summary_report(27-166)packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py (1)
load_config_from_yaml(28-97)
🪛 Ruff (0.13.3)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py
151-151: Consider moving this statement to an else block
(TRY300)
157-157: Do not catch blind exception: Exception
(BLE001)
158-158: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
44-44: Avoid specifying long messages outside the exception class
(TRY003)
52-52: Avoid specifying long messages outside the exception class
(TRY003)
56-56: Avoid specifying long messages outside the exception class
(TRY003)
73-73: Avoid specifying long messages outside the exception class
(TRY003)
78-78: Avoid specifying long messages outside the exception class
(TRY003)
110-110: Avoid specifying long messages outside the exception class
(TRY003)
113-113: Avoid specifying long messages outside the exception class
(TRY003)
116-116: Avoid specifying long messages outside the exception class
(TRY003)
119-119: Avoid specifying long messages outside the exception class
(TRY003)
122-122: Avoid specifying long messages outside the exception class
(TRY003)
125-125: Avoid specifying long messages outside the exception class
(TRY003)
128-128: Avoid specifying long messages outside the exception class
(TRY003)
131-131: Avoid specifying long messages outside the exception class
(TRY003)
packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/load_tester.py
163-163: subprocess call: check for execution of untrusted input
(S603)
178-178: Consider moving this statement to an else block
(TRY300)
181-181: Avoid specifying long messages outside the exception class
(TRY003)
199-199: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
250-250: Do not catch blind exception: Exception
(BLE001)
276-276: Standard pseudo-random generators are not suitable for cryptographic purposes
(S311)
521-521: Avoid specifying long messages outside the exception class
(TRY003)
        
          
                packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/config_loader.py
          
            Show resolved
            Hide resolved
        
      |  | ||
| ```bash | ||
| # Basic usage | ||
| python packages/nvidia_nat_test/src/nat/test/mcp/load_test_utils/cli.py \ | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not install as a script command under nvidia-nat-test package ?
That way it's transparent/easy for any user to run.
psutil could also be added as a dependency for the the test package.
| logger = logging.getLogger(__name__) | ||
|  | ||
|  | ||
| def load_config_from_yaml(config_path: str | Path) -> LoadTestConfig: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is susceptible to loading an invalid/malformed configuration and using defaults.
Instead, we should have concrete configuration BaseModel types and simply validate via Pydantic model instantiation.
| ) | ||
|  | ||
|  | ||
| def validate_config(config: LoadTestConfig) -> None: | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All of these could be specified as constraints on Pydantic fields
Description
This PR introduces a comprehensive load testing suite for MCP servers in NeMo Agent toolkit. The utility simulates concurrent users making tool calls to MCP servers and generates detailed performance reports including latency statistics, memory usage, and per-tool performance metrics.
By Submitting this PR I confirm:
Summary by CodeRabbit
New Features
Documentation