Closed
48 changes: 28 additions & 20 deletions test/README.md
@@ -4,33 +4,34 @@ SGLang uses the built-in library [unittest](https://docs.python.org/3/library/un

## Test Backend Runtime
```bash
cd sglang/test/srt

# Run a single file
python3 test_srt_endpoint.py
> cd test/registered
> python3 core/test_srt_endpoint.py

# Run a single test
python3 test_srt_endpoint.py TestSRTEndpoint.test_simple_decode
> cd test/registered
> python3 core/test_srt_endpoint.py TestSRTEndpoint.test_simple_decode

# Run a suite with multiple files
python3 run_suite.py --suite per-commit
> cd test
> python run_suite.py --hw cuda --suite stage-b-test-small-1-gpu
```

## Test Frontend Language
```bash
cd sglang/test/lang
> cd test/manual/lang_frontend

# Run a single file
python3 test_choices.py
> python3 test_choices.py
```

## Adding or Updating Tests in CI

- Create new test files under `test/srt` or `test/lang` depending on the type of test.
- For nightly tests, place them in `test/srt/nightly/`. Use the `NightlyBenchmarkRunner` helper class in `nightly_utils.py` for performance benchmarking tests.
- Ensure they are referenced in the respective `run_suite.py` (e.g., `test/srt/run_suite.py`) so they are picked up in CI. For most small test cases, they can be added to the `per-commit-1-gpu` suite. Sort the test cases alphabetically by name.
- Ensure you added `unittest.main()` for unittest and `sys.exit(pytest.main([__file__]))` for pytest in the scripts. The CI run them via `python3 test_file.py`.
- The CI will run some suites such as `per-commit-1-gpu`, `per-commit-2-gpu`, and `nightly-1-gpu` automatically. If you need special setup or custom test groups, you may modify the workflows in [`.github/workflows/`](https://github.com/sgl-project/sglang/tree/main/.github/workflows).
- Create new test files under `test/registered/` (organized by category) for CI tests, or `test/manual/` for manual tests.
- For nightly tests, use the CI registry with `nightly=True`. For performance benchmarking tests, use the `NightlyBenchmarkRunner` helper class in `python/sglang/test/nightly_utils.py`.
- Register tests using the CI registry system (see below). For most small test cases, use the `stage-b-test-small-1-gpu` suite. Sort the test cases alphabetically by name.
- Ensure you added `unittest.main()` for unittest and `sys.exit(pytest.main([__file__]))` for pytest in the scripts. The CI runs them via `python3 test_file.py`.
- The CI will run some suites such as `stage-b-test-small-1-gpu`, `stage-b-test-large-2-gpu`, and `nightly-1-gpu` automatically. If you need special setup or custom test groups, you may modify the workflows in [`.github/workflows/`](https://github.com/sgl-project/sglang/tree/main/.github/workflows).
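
As a rough illustration of the checklist above, a registered test file might look like the following sketch. The local `register_cuda_ci` function is a hypothetical stand-in so the example runs without SGLang installed; a real test file would import the helper from SGLang's CI registry instead, and its actual return value and import path are not shown in this diff. The keyword names mirror the registration calls that appear in this README.

```python
import unittest


def register_cuda_ci(est_time, suite, nightly=False, disabled=None):
    """Hypothetical stand-in for the real registry helper (assumption).

    It only records its arguments so the sketch is self-contained.
    """
    return {
        "est_time": est_time,
        "suite": suite,
        "nightly": nightly,
        "disabled": disabled,
    }


# Registration in the form used by this README's examples.
REGISTRATION = register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu")


class TestExample(unittest.TestCase):
    def test_simple_case(self):
        # One focused scenario per test function.
        self.assertEqual(sum([1, 2, 3]), 6)


if __name__ == "__main__":
    # The CI invokes the file as `python3 test_file.py`,
    # so the script needs its own unittest entry point.
    unittest.main()
```

For a pytest-style file, the final call would instead be `sys.exit(pytest.main([__file__]))`, as noted in the list above.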

## CI Registry System

@@ -60,7 +61,7 @@ register_cuda_ci(est_time=200, suite="nightly-1-gpu", nightly=True)

# Multi-backend test
register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu")
register_amd_ci(est_time=120, suite="stage-a-test-1")
register_amd_ci(est_time=120, suite="stage-b-test-small-1-gpu-amd")

# Temporarily disabled test
register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu", disabled="flaky - see #12345")
@@ -98,16 +99,24 @@ If a test cannot run on 5090 due to any of the above, use `stage-b-test-large-1-
### Available Suites

**Per-Commit (CUDA)**:
- Stage A: `stage-a-test-1` (locked), `stage-a-test-2`, `stage-a-test-cpu`
- Stage A: `stage-a-test-1` (locked), `stage-a-cpu-only`
- Stage B: `stage-b-test-small-1-gpu` (5090), `stage-b-test-large-1-gpu` (H100), `stage-b-test-large-2-gpu`
- Stage C (4-GPU): `stage-c-test-4-gpu-h100`, `stage-c-test-4-gpu-b200`, `stage-c-test-4-gpu-gb200`, `stage-c-test-deepep-4-gpu`
- Stage C (8-GPU): `stage-c-test-8-gpu-h20`, `stage-c-test-8-gpu-h200`, `stage-c-test-8-gpu-b200`, `stage-c-test-deepep-8-gpu-h200`

**Per-Commit (AMD)**:
- `stage-a-test-1`, `stage-b-test-small-1-gpu-amd`, `stage-b-test-large-2-gpu-amd`
- `stage-a-test-1-amd`, `stage-b-test-small-1-gpu-amd`, `stage-b-test-large-1-gpu-amd`, `stage-b-test-large-2-gpu-amd`

**Per-Commit (NPU)**:
- `stage-a-test-1`, `stage-b-test-1-npu-a2`, `stage-b-test-2-npu-a2`, `stage-b-test-4-npu-a3`, `stage-b-test-16-npu-a3`

**Nightly**:
**Nightly (CUDA)**:
- `nightly-1-gpu`, `nightly-2-gpu`, `nightly-4-gpu`, `nightly-8-gpu`, etc.
- Eval: `nightly-eval-text-2-gpu`, `nightly-eval-vlm-2-gpu`
- Perf: `nightly-perf-text-2-gpu`, `nightly-perf-vlm-2-gpu`

**Nightly (AMD)**:
- `nightly-amd`, `nightly-amd-1-gpu`, `nightly-amd-8-gpu`, `nightly-amd-vlm`

### Running Tests with run_suite.py

@@ -125,17 +134,16 @@ python test/run_suite.py --hw cuda --suite stage-b-test-small-1-gpu \

## Writing Elegant Test Cases

- Learn from existing examples in [sglang/test/srt](https://github.com/sgl-project/sglang/tree/main/test/srt).
- Learn from existing examples in [sglang/test/registered](https://github.com/sgl-project/sglang/tree/main/test/registered).
- Reduce the test time by using smaller models and reusing the server for multiple test cases. Launching a server takes a lot of time.
- Use as few GPUs as possible. Do not run long tests with 8-gpu runners.
- If the test cases take too long, consider adding them to nightly tests instead of per-commit tests.
- Keep each test function focused on a single scenario or piece of functionality.
- Give tests descriptive names reflecting their purpose.
- Use robust assertions (e.g., assert, unittest methods) to validate outcomes.
- Clean up resources to avoid side effects and preserve test independence.
- Reduce the test time by using smaller models and reusing the server for multiple test cases.
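
The advice about reusing one server and cleaning up afterwards can be sketched with `setUpClass` and `tearDownClass`. This is a minimal illustration, not SGLang's own test harness: the standard-library `HTTPServer` and the `_HealthHandler` class are stand-ins (assumptions) for an expensive-to-launch model server, and the `/health` path is hypothetical. The `unittest.main()` entry point is omitted here for brevity.

```python
import http.client
import threading
import unittest
from http.server import BaseHTTPRequestHandler, HTTPServer


class _HealthHandler(BaseHTTPRequestHandler):
    """Stand-in server logic: answer every GET with a fixed body."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


class TestSharedServer(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Launch the server once for the whole class; port 0 picks a free port.
        cls.server = HTTPServer(("127.0.0.1", 0), _HealthHandler)
        cls.port = cls.server.server_address[1]
        cls.thread = threading.Thread(target=cls.server.serve_forever, daemon=True)
        cls.thread.start()

    @classmethod
    def tearDownClass(cls):
        # Release the port and the thread so later tests are unaffected.
        cls.server.shutdown()
        cls.server.server_close()
        cls.thread.join(timeout=5)

    def _get(self, path):
        conn = http.client.HTTPConnection("127.0.0.1", self.port, timeout=5)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            return resp.status, resp.read()
        finally:
            conn.close()

    def test_status_code(self):
        status, _ = self._get("/health")
        self.assertEqual(status, 200)

    def test_body(self):
        _, body = self._get("/health")
        self.assertEqual(body, b"ok")
```

Both test methods share the single server started in `setUpClass`, so the launch cost is paid once per class rather than once per test.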


## Adding New Models to Nightly CI
- **For text models**: extend [global model lists variables](https://github.com/sgl-project/sglang/blob/85c1f7937781199203b38bb46325a2840f353a04/python/sglang/test/test_utils.py#L104) in `test_utils.py`, or add more model lists
- **For vlms**: extend the `MODEL_THRESHOLDS` global dictionary in `test/srt/nightly/test_vlms_mmmu_eval.py`
- **For text models**: extend the `DEFAULT_MODEL_NAME_FOR_NIGHTLY_EVAL_*` variables in `python/sglang/test/test_utils.py`, or add new model constants.
- **For VLMs**: extend the `MODEL_THRESHOLDS` dictionary in `test/registered/eval/test_vlms_mmmu_eval.py`.