276 changes: 276 additions & 0 deletions src/nat/front_ends/mcp/load_test_utils/README.md
@@ -0,0 +1,276 @@
<!--
SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# MCP Server Load Testing

This utility load tests MCP servers for NVIDIA NeMo Agent toolkit. It simulates concurrent users making tool calls and generates detailed performance reports.

## Requirements

Before running load tests, ensure you have the following:

- NeMo Agent toolkit with MCP support installed through `nvidia-nat[mcp]`
- Valid NeMo Agent toolkit workflow configuration with MCP-compatible tools
- Python 3.10 or higher
- `psutil` package for memory monitoring

### Installing psutil

The `psutil` package is required for monitoring server memory usage during load tests. Install it with `uv`:

```bash
uv pip install psutil
```

If you have already installed NeMo Agent toolkit with all dependencies, `psutil` may already be available. Verify the installation:

```bash
python -c "import psutil; print(f'psutil {psutil.__version__} installed')"
```

## Quick Start

Run a load test from the project root:

```bash
python src/nat/front_ends/mcp/load_test_utils/cli.py \
  --config_file=src/nat/front_ends/mcp/load_test_utils/configs/config.yml
```

List available configurations:

```bash
python src/nat/front_ends/mcp/load_test_utils/cli.py --list-configs
```

Get help:

```bash
python src/nat/front_ends/mcp/load_test_utils/cli.py --help
```

## Configuration

Configure load test options using YAML files stored in the `configs/` directory.

### Example Configuration

```yaml
# Path to NeMo Agent toolkit workflow configuration file
config_file: "examples/getting_started/simple_calculator/configs/config.yml"

# Server configuration
server:
  host: "localhost"
  port: 9901
  transport: "streamable-http"  # Options: "streamable-http" or "sse"

# Load test parameters
load_test:
  num_concurrent_users: 10
  duration_seconds: 30
  warmup_seconds: 5

# Output configuration
output:
  directory: "load_test_results"

# Tool calls to execute during load testing
tool_calls:
  - tool_name: "calculator_multiply"
    args:
      text: "2 * 3"
    weight: 2.0  # Called twice as often as weight 1.0 tools

  - tool_name: "calculator_divide"
    args:
      text: "10 / 2"
    weight: 1.0
```

### Configuration Parameters

#### Required Parameters

**`config_file`** (string)
: Path to the NeMo Agent toolkit workflow configuration file.

#### Server Configuration

Configure the MCP server settings in the `server` section:

**`host`** (string, default: `"localhost"`)
: Host address where the MCP server will run.

**`port`** (integer, default: `9901`)
: Port number for the MCP server.

**`transport`** (string, default: `"streamable-http"`)
: Transport protocol type. Options: `"streamable-http"` or `"sse"`.

#### Load Test Parameters

Configure load test behavior in the `load_test` section:

**`num_concurrent_users`** (integer, default: `10`)
: Number of concurrent users to simulate.

**`duration_seconds`** (integer, default: `60`)
: Duration of the load test in seconds.

**`warmup_seconds`** (integer, default: `5`)
: Warmup period before measurements begin, in seconds.
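The interaction between these three parameters can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the load tester's actual implementation: `fake_tool_call`, `simulated_user`, and `run_simulation` are hypothetical names, the tool call is replaced by a short sleep, and the warmup is assumed to run before the measured duration rather than inside it. Fractional seconds are used here only to keep the run short.

```python
import asyncio
import time


async def fake_tool_call():
    """Hypothetical stand-in for a real MCP tool call; sleeps to simulate work."""
    await asyncio.sleep(0.01)


async def simulated_user(measure_from, stop_at, latencies_ms):
    """Call the tool in a loop until the deadline, keeping only post-warmup latencies."""
    while time.perf_counter() < stop_at:
        started = time.perf_counter()
        await fake_tool_call()
        finished = time.perf_counter()
        if started >= measure_from:  # Requests started during warmup are discarded.
            latencies_ms.append((finished - started) * 1000.0)


async def run_simulation(num_concurrent_users, duration_seconds, warmup_seconds):
    """Run all simulated users concurrently and return the measured latencies."""
    now = time.perf_counter()
    measure_from = now + warmup_seconds  # Assumption: warmup precedes the measured window.
    stop_at = measure_from + duration_seconds
    latencies_ms = []
    users = [
        simulated_user(measure_from, stop_at, latencies_ms)
        for _ in range(num_concurrent_users)
    ]
    await asyncio.gather(*users)
    return latencies_ms


latencies_ms = asyncio.run(
    run_simulation(num_concurrent_users=5, duration_seconds=0.3, warmup_seconds=0.1)
)
print(f"{len(latencies_ms)} measured requests")
```

Each simulated user issues requests back to back, so raising `num_concurrent_users` increases the number of in-flight requests rather than the per-user request rate.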

#### Output Configuration

Configure report output in the `output` section:

**`directory`** (string, default: `"load_test_results"`)
: Directory where test reports will be saved.
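As an illustration of how the documented defaults combine with a partially filled configuration, the following sketch models the `server`, `load_test`, and `output` sections as dataclasses. The class names and the `build_section` helper are hypothetical and are not part of the load test utilities; only the field names and default values come from the parameter descriptions above.

```python
from dataclasses import dataclass, fields


@dataclass
class ServerSettings:
    host: str = "localhost"
    port: int = 9901
    transport: str = "streamable-http"

    def __post_init__(self):
        # Only the two documented transport options are accepted.
        if self.transport not in ("streamable-http", "sse"):
            raise ValueError(f"Unsupported transport: {self.transport!r}")


@dataclass
class LoadTestSettings:
    num_concurrent_users: int = 10
    duration_seconds: int = 60
    warmup_seconds: int = 5


@dataclass
class OutputSettings:
    directory: str = "load_test_results"


def build_section(cls, raw):
    """Build a settings object from a dict, rejecting keys the section does not document."""
    raw = raw or {}
    known = {f.name for f in fields(cls)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"Unknown keys for {cls.__name__}: {sorted(unknown)}")
    return cls(**raw)


# A dict shaped like a parsed YAML file that overrides only two values.
parsed = {"server": {"port": 9902}, "load_test": {"duration_seconds": 30}}

server = build_section(ServerSettings, parsed.get("server"))
load_test = build_section(LoadTestSettings, parsed.get("load_test"))
output = build_section(OutputSettings, parsed.get("output"))
print(server, load_test, output)
```

Omitted keys fall back to the documented defaults, so a configuration file only needs to list the values that differ from them.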

#### Tool Calls

Define tool calls to execute in the `tool_calls` list. Each tool call includes:

**`tool_name`** (string, required)
: Name of the MCP tool to call.

**`args`** (dictionary, optional)
: Arguments to pass to the tool.

**`weight`** (float, default: `1.0`)
: Relative call frequency. Tools with higher weights are called more frequently. A tool with weight 2.0 is called twice as often as a tool with weight 1.0.
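One common way to realize relative weights is weighted random selection. The sketch below assumes that approach; the README does not state how the load tester actually picks tools, and `pick_tool_call` is a hypothetical helper. It uses the standard library's `random.choices` and checks that a weight of 2.0 is selected roughly twice as often as a weight of 1.0.

```python
import random
from collections import Counter

# Entries mirroring the `tool_calls` list from the example configuration.
tool_calls = [
    {"tool_name": "calculator_multiply", "args": {"text": "2 * 3"}, "weight": 2.0},
    {"tool_name": "calculator_divide", "args": {"text": "10 / 2"}, "weight": 1.0},
]


def pick_tool_call(calls, rng):
    """Pick one tool call with probability proportional to its weight (default 1.0)."""
    weights = [call.get("weight", 1.0) for call in calls]
    return rng.choices(calls, weights=weights, k=1)[0]


rng = random.Random(0)  # Seeded so the run is repeatable.
counts = Counter(pick_tool_call(tool_calls, rng)["tool_name"] for _ in range(30_000))
ratio = counts["calculator_multiply"] / counts["calculator_divide"]
print(counts, f"ratio={ratio:.2f}")  # The ratio should land close to 2.
```

Because selection is random, the observed ratio only approaches the configured ratio over many calls. Short test runs can deviate noticeably.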

## Running Load Tests

### Command Line

Run load tests from the project root using the command-line interface:

```bash
# Basic usage
python src/nat/front_ends/mcp/load_test_utils/cli.py \
  --config_file=src/nat/front_ends/mcp/load_test_utils/configs/config.yml

# With verbose logging
python src/nat/front_ends/mcp/load_test_utils/cli.py \
  --config_file=src/nat/front_ends/mcp/load_test_utils/configs/config.yml \
  --verbose

# Short form
python src/nat/front_ends/mcp/load_test_utils/cli.py \
  -c src/nat/front_ends/mcp/load_test_utils/configs/config.yml
```

### Python API

#### Using YAML Configuration

```python
from nat.front_ends.mcp.load_test_utils import run_load_test_from_yaml

results = run_load_test_from_yaml(
    "src/nat/front_ends/mcp/load_test_utils/configs/config.yml"
)
```

#### Programmatic Usage

```python
from nat.front_ends.mcp.load_test_utils import run_load_test

results = run_load_test(
    config_file="examples/getting_started/simple_calculator/configs/config.yml",
    tool_calls=[
        {
            "tool_name": "calculator_multiply",
            "args": {"text": "2 * 3"},
            "weight": 2.0,
        },
        {
            "tool_name": "calculator_divide",
            "args": {"text": "10 / 2"},
            "weight": 1.0,
        },
    ],
    num_concurrent_users=10,
    duration_seconds=30,
)
```

## Output Reports

The load test generates two report files in the output directory:

### CSV Report

**File name**: `load_test_YYYYMMDD_HHMMSS.csv`

Detailed per-request data with the following columns:

- `timestamp`: Request timestamp
- `tool_name`: Name of the tool called
- `success`: Boolean success status
- `latency_ms`: Request latency in milliseconds
- `memory_rss_mb`: Resident Set Size (RSS) memory in MB at request time
- `memory_vms_mb`: Virtual Memory Size (VMS) in MB at request time
- `memory_percent`: Memory usage percentage at request time
- `error`: Error message if the request failed
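The per-request columns lend themselves to custom post-processing. The sketch below aggregates request counts, failures, and mean latency per tool using only the standard library. The sample rows are made up for illustration, and the exact serialization of `timestamp` and `success` (assumed here to be an ISO-style string and `True`/`False`) is an assumption that should be checked against a real report.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows using the documented column names.
sample_csv = """timestamp,tool_name,success,latency_ms,memory_rss_mb,memory_vms_mb,memory_percent,error
2025-01-01T00:00:00,calculator_multiply,True,12.5,150.0,900.0,1.2,
2025-01-01T00:00:01,calculator_divide,True,20.0,151.0,900.0,1.2,
2025-01-01T00:00:02,calculator_divide,False,105.0,152.0,901.0,1.3,timeout
"""


def summarize(csv_file):
    """Return request count, failure count, and mean latency for each tool."""
    per_tool = defaultdict(lambda: {"requests": 0, "failures": 0, "latency_sum_ms": 0.0})
    for row in csv.DictReader(csv_file):
        stats = per_tool[row["tool_name"]]
        stats["requests"] += 1
        stats["latency_sum_ms"] += float(row["latency_ms"])
        # Assumption: successful rows carry the text "True" in the `success` column.
        if row["success"].strip().lower() != "true":
            stats["failures"] += 1
    return {
        tool: {
            "requests": s["requests"],
            "failures": s["failures"],
            "mean_latency_ms": s["latency_sum_ms"] / s["requests"],
        }
        for tool, s in per_tool.items()
    }


summary = summarize(io.StringIO(sample_csv))
for tool, stats in summary.items():
    print(tool, stats)
```

To analyze a real report, pass an open file handle for the generated CSV file to `summarize` in place of the `io.StringIO` sample.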

### Summary Report

**File name**: `load_test_YYYYMMDD_HHMMSS_summary.txt`

Human-readable summary with the following statistics:

**Summary Metrics**
: Total requests, success rate, requests per second

**Latency Statistics**
: Mean, median, P95, P99, minimum, and maximum latencies

**Memory Statistics**
: RSS and VMS memory usage (mean and max), memory percentage (mean and max)

**Per-Tool Statistics**
: Individual performance metrics for each tool

**Error Analysis**
: Breakdown of failed requests by error type
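For readers who want to recompute the latency statistics listed above from raw latencies, the following sketch uses the standard library's `statistics` module. The sample values are invented, `latency_stats` is a hypothetical helper, and the percentile interpolation method is an assumption; the summary report may use a different method and therefore show slightly different P95 and P99 values.

```python
import statistics

# Invented latencies in milliseconds, including one slow outlier.
latencies_ms = [12.0, 15.0, 11.0, 18.0, 14.0, 250.0, 13.0, 16.0, 12.5, 17.0]


def latency_stats(values):
    """Compute mean, median, P95, P99, minimum, and maximum for a list of latencies."""
    # quantiles(n=100) returns 99 cut points; index 94 is P95 and index 98 is P99.
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "p95": cuts[94],
        "p99": cuts[98],
        "min": min(values),
        "max": max(values),
    }


stats = latency_stats(latencies_ms)
for name, value in stats.items():
    print(f"{name}: {value:.2f} ms")
```

The single outlier pulls the mean far above the median, which is why tail percentiles such as P95 and P99 are reported alongside the mean.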

## Creating Custom Tests

To create a custom load test configuration:

1. Copy the example configuration file:

```bash
cp src/nat/front_ends/mcp/load_test_utils/configs/config.yml \
  src/nat/front_ends/mcp/load_test_utils/configs/my_test.yml
```

2. Edit `my_test.yml` to customize the following parameters:
   - Update `config_file` to point to your NeMo Agent toolkit workflow
   - Adjust `tool_calls` to match your available tools
   - Set load test parameters such as `num_concurrent_users` and `duration_seconds`

3. Run your custom test:

```bash
python src/nat/front_ends/mcp/load_test_utils/cli.py \
  --config_file=src/nat/front_ends/mcp/load_test_utils/configs/my_test.yml
```
24 changes: 24 additions & 0 deletions src/nat/front_ends/mcp/load_test_utils/__init__.py
@@ -0,0 +1,24 @@
# SPDX-FileCopyrightText: Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""MCP Server Load Testing Utilities.

This module provides utilities for load testing MCP servers.
"""

from nat.front_ends.mcp.load_test_utils.load_tester import MCPLoadTest
from nat.front_ends.mcp.load_test_utils.load_tester import run_load_test
from nat.front_ends.mcp.load_test_utils.load_tester import run_load_test_from_yaml

__all__ = ["MCPLoadTest", "run_load_test", "run_load_test_from_yaml"]