diff --git a/test/README.md b/test/README.md
index 1dc3c6d35e74..e4c34dbc1c0a 100644
--- a/test/README.md
+++ b/test/README.md
@@ -4,33 +4,34 @@ SGLang uses the built-in library [unittest](https://docs.python.org/3/library/un
 ## Test Backend Runtime
 
 ```bash
-cd sglang/test/srt
-
 # Run a single file
-python3 test_srt_endpoint.py
+cd test/registered
+python3 core/test_srt_endpoint.py
 
 # Run a single test
-python3 test_srt_endpoint.py TestSRTEndpoint.test_simple_decode
+cd test/registered
+python3 core/test_srt_endpoint.py TestSRTEndpoint.test_simple_decode
 
 # Run a suite with multiple files
-python3 run_suite.py --suite per-commit
+cd test
+python run_suite.py --hw cuda --suite stage-b-test-small-1-gpu
 ```
 
 ## Test Frontend Language
 
 ```bash
-cd sglang/test/lang
+cd test/manual/lang_frontend
 
 # Run a single file
 python3 test_choices.py
 ```
 
 ## Adding or Updating Tests in CI
 
-- Create new test files under `test/srt` or `test/lang` depending on the type of test.
-- For nightly tests, place them in `test/srt/nightly/`. Use the `NightlyBenchmarkRunner` helper class in `nightly_utils.py` for performance benchmarking tests.
-- Ensure they are referenced in the respective `run_suite.py` (e.g., `test/srt/run_suite.py`) so they are picked up in CI. For most small test cases, they can be added to the `per-commit-1-gpu` suite. Sort the test cases alphabetically by name.
-- Ensure you added `unittest.main()` for unittest and `sys.exit(pytest.main([__file__]))` for pytest in the scripts. The CI run them via `python3 test_file.py`.
-- The CI will run some suites such as `per-commit-1-gpu`, `per-commit-2-gpu`, and `nightly-1-gpu` automatically. If you need special setup or custom test groups, you may modify the workflows in [`.github/workflows/`](https://github.com/sgl-project/sglang/tree/main/.github/workflows).
+- Create new test files under `test/registered/` (organized by category) for CI tests, or `test/manual/` for manual tests.
+- For nightly tests, use the CI registry with `nightly=True`. For performance benchmarking tests, use the `NightlyBenchmarkRunner` helper class in `python/sglang/test/nightly_utils.py`.
+- Register tests using the CI registry system (see below). For most small test cases, use the `stage-b-test-small-1-gpu` suite. Sort the test cases alphabetically by name.
+- Ensure you add `unittest.main()` for unittest or `sys.exit(pytest.main([__file__]))` for pytest at the end of each script. The CI runs them via `python3 test_file.py`.
+- The CI will run some suites such as `stage-b-test-small-1-gpu`, `stage-b-test-large-2-gpu`, and `nightly-1-gpu` automatically. If you need special setup or custom test groups, you may modify the workflows in [`.github/workflows/`](https://github.com/sgl-project/sglang/tree/main/.github/workflows).
 
 ## CI Registry System
@@ -60,7 +61,7 @@ register_cuda_ci(est_time=200, suite="nightly-1-gpu", nightly=True)
 
 # Multi-backend test
 register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu")
-register_amd_ci(est_time=120, suite="stage-a-test-1")
+register_amd_ci(est_time=120, suite="stage-b-test-small-1-gpu-amd")
 
 # Temporarily disabled test
 register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu", disabled="flaky - see #12345")
@@ -98,16 +99,24 @@ If a test cannot run on 5090 due to any of the above, use `stage-b-test-large-1-
 
 ### Available Suites
 
 **Per-Commit (CUDA)**:
-- Stage A: `stage-a-test-1` (locked), `stage-a-test-2`, `stage-a-test-cpu`
+- Stage A: `stage-a-test-1` (locked), `stage-a-cpu-only`
 - Stage B: `stage-b-test-small-1-gpu` (5090), `stage-b-test-large-1-gpu` (H100), `stage-b-test-large-2-gpu`
 - Stage C (4-GPU): `stage-c-test-4-gpu-h100`, `stage-c-test-4-gpu-b200`, `stage-c-test-4-gpu-gb200`, `stage-c-test-deepep-4-gpu`
 - Stage C (8-GPU): `stage-c-test-8-gpu-h20`, `stage-c-test-8-gpu-h200`, `stage-c-test-8-gpu-b200`, `stage-c-test-deepep-8-gpu-h200`
 
 **Per-Commit (AMD)**:
-- `stage-a-test-1`, `stage-b-test-small-1-gpu-amd`, `stage-b-test-large-2-gpu-amd`
+- `stage-a-test-1-amd`, `stage-b-test-small-1-gpu-amd`, `stage-b-test-large-1-gpu-amd`, `stage-b-test-large-2-gpu-amd`
+
+**Per-Commit (NPU)**:
+- `stage-a-test-1`, `stage-b-test-1-npu-a2`, `stage-b-test-2-npu-a2`, `stage-b-test-4-npu-a3`, `stage-b-test-16-npu-a3`
 
-**Nightly**:
+**Nightly (CUDA)**:
 - `nightly-1-gpu`, `nightly-2-gpu`, `nightly-4-gpu`, `nightly-8-gpu`, etc.
+- Eval: `nightly-eval-text-2-gpu`, `nightly-eval-vlm-2-gpu`
+- Perf: `nightly-perf-text-2-gpu`, `nightly-perf-vlm-2-gpu`
+
+**Nightly (AMD)**:
+- `nightly-amd`, `nightly-amd-1-gpu`, `nightly-amd-8-gpu`, `nightly-amd-vlm`
 
 ### Running Tests with run_suite.py
@@ -125,7 +134,7 @@ python test/run_suite.py --hw cuda --suite stage-b-test-small-1-gpu \
 
 ## Writing Elegant Test Cases
 
-- Learn from existing examples in [sglang/test/srt](https://github.com/sgl-project/sglang/tree/main/test/srt).
+- Learn from existing examples in [sglang/test/registered](https://github.com/sgl-project/sglang/tree/main/test/registered).
 - Reduce the test time by using smaller models and reusing the server for multiple test cases. Launching a server takes a lot of time.
 - Use as few GPUs as possible. Do not run long tests with 8-gpu runners.
-- If the test cases take too long, considering adding them to nightly tests instead of per-commit tests.
+- If the test cases take too long, consider adding them to nightly tests instead of per-commit tests.
@@ -133,9 +142,8 @@ python test/run_suite.py --hw cuda --suite stage-b-test-small-1-gpu \
 - Give tests descriptive names reflecting their purpose.
 - Use robust assertions (e.g., assert, unittest methods) to validate outcomes.
 - Clean up resources to avoid side effects and preserve test independence.
-- Reduce the test time by using smaller models and reusing the server for multiple test cases.
 
 ## Adding New Models to Nightly CI
 
-- **For text models**: extend [global model lists variables](https://github.com/sgl-project/sglang/blob/85c1f7937781199203b38bb46325a2840f353a04/python/sglang/test/test_utils.py#L104) in `test_utils.py`, or add more model lists
-- **For vlms**: extend the `MODEL_THRESHOLDS` global dictionary in `test/srt/nightly/test_vlms_mmmu_eval.py`
+- **For text models**: extend the `DEFAULT_MODEL_NAME_FOR_NIGHTLY_EVAL_*` variables in `python/sglang/test/test_utils.py`, or add new model constants.
+- **For VLMs**: extend the `MODEL_THRESHOLDS` dictionary in `test/registered/eval/test_vlms_mmmu_eval.py`.
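Reviewer note: the entry-point rule in the patched "Adding or Updating Tests in CI" section (CI invokes each file directly via `python3 test_file.py`) can be sketched as a minimal registered test file. Everything below is illustrative, not SGLang code: the class and test names are made up, and the `register_cuda_ci(...)` call appears only in a comment because its import path is repo-specific.

```python
# Minimal shape of a unittest-based file under test/registered/
# (illustrative names; not an actual SGLang test).
import unittest


# A real registered test would also invoke the CI registry at module level,
# e.g. register_cuda_ci(est_time=80, suite="stage-b-test-small-1-gpu").
# The registry import is omitted here because its path is repo-specific.
class TestSimpleDecode(unittest.TestCase):
    def test_decode_returns_text(self):
        # Placeholder assertion standing in for a real server round-trip.
        self.assertEqual("hello".upper(), "HELLO")


if __name__ == "__main__":
    # Required so the CI can run this file directly via `python3 test_file.py`.
    unittest.main()
```

A pytest-style file follows the same pattern, but ends with `sys.exit(pytest.main([__file__]))` instead of `unittest.main()`.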