Add e2e smoke tests for the new datagen system#378
Conversation
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
|
📦 Build Artifacts Available |
|
The quality checks have failed. Please run |
shanjiaz
left a comment
There was a problem hiding this comment.
Just curious, are we testing acceptance rate for these trained models?
Not for these tests, they're just smoke tests. I will add regression tests as a follow up, which will basically be the same but use more samples and run for longer. |
shanjiaz
left a comment
There was a problem hiding this comment.
Looks good! Maybe we can make these tests configurable so it's easier to add regression tests in the future.
📝 WalkthroughWalkthroughThis PR modifies the vLLM launcher to execute the Python module via the current interpreter, updates the training script to propagate hidden-states dtype to dataset constructors, and introduces comprehensive end-to-end tests for both offline and online training workflows with supporting server orchestration utilities. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as test_offline_training
participant Server as vLLM Server
participant DataGen as data_generation_offline2.py
participant Trainer as train.py
participant Engine as run_vllm_engine
Test->>Server: launch_vllm_server(Qwen3-0.6B)
activate Server
Test->>Test: prepare_data()
Test->>DataGen: subprocess.run with HTTP endpoint
DataGen->>Server: POST requests for hidden states
Server-->>DataGen: hidden state responses
DataGen-->>Test: return (exit code 0)
Test->>Server: stop_vllm_server()
deactivate Server
Test->>Trainer: subprocess.run with prepared data
Trainer-->>Test: return (exit code 0, checkpoint saved)
Test->>Engine: run_vllm_engine(checkpoint/0)
Engine-->>Test: validation results
sequenceDiagram
participant Test as test_online_training
participant Server as vLLM Server
participant Trainer as train.py
participant Engine as run_vllm_engine
Test->>Server: launch_vllm_server(Qwen3-0.6B)
activate Server
Test->>Test: prepare_data()
Test->>Trainer: subprocess.run with live endpoint
activate Trainer
Trainer->>Server: query hidden states during training
Server-->>Trainer: streaming responses
Trainer-->>Test: return (exit code 0, checkpoint saved)
deactivate Trainer
Test->>Server: stop_vllm_server()
deactivate Server
Test->>Engine: run_vllm_engine(checkpoint/0)
Engine-->>Test: validation results
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/e2e/vllm/utils.py (1)
147-147: Minor:VLLM_PYTHONshadows module-level constant.Line 147 redefines
VLLM_PYTHONlocally, shadowing the module-level constant defined at line 24. Consider reusing the module-level constant for consistency.♻️ Suggested fix
def run_vllm_engine( model_path: str, tmp_path: Path, prompts: list[list[dict[str, str]]], disable_compile_cache: bool = False, max_tokens: int = 50, ignore_eos: bool = True, acceptance_thresholds: Iterable[float] | None = None, ): - VLLM_PYTHON = os.environ.get("VLLM_PYTHON", sys.executable) logger.info("vLLM Python executable: {}", VLLM_PYTHON)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/e2e/vllm/utils.py` at line 147, A local assignment redefines VLLM_PYTHON and shadows the module-level constant; remove the local `VLLM_PYTHON = os.environ.get("VLLM_PYTHON", sys.executable)` and use the existing module-level `VLLM_PYTHON` constant instead (or, if a different value is required, rename the local variable to something like `vllm_python_override`), ensuring any code that referenced the local name now references the module-level `VLLM_PYTHON` constant.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/e2e/vllm/utils.py`:
- Line 147: A local assignment redefines VLLM_PYTHON and shadows the
module-level constant; remove the local `VLLM_PYTHON =
os.environ.get("VLLM_PYTHON", sys.executable)` and use the existing module-level
`VLLM_PYTHON` constant instead (or, if a different value is required, rename the
local variable to something like `vllm_python_override`), ensuring any code that
referenced the local name now references the module-level `VLLM_PYTHON`
constant.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f31d5945-e839-4614-837c-d3a1c4a44871
📒 Files selected for processing (5)
scripts/launch_vllm.pyscripts/train.pytests/e2e/vllm/test_offline_training.pytests/e2e/vllm/test_online_training.pytests/e2e/vllm/utils.py
<!-- markdownlint-disable --> PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED. ## Purpose Currently we aren't storing the layer ids in the Eagle3 model configs and instead just match the defaults vllm use. We would instead like to explicitly set these, which will also allow users to use custom layers. <!--- Why your changes are needed --> ## Description Update launch.py and train.py with a `--target-layer-ids` arg. Explicitly add layer ids to eagle3 config, even if they are automatically inferred from num_hidden_layers. Add user warnings to remind users to that custom layer ids must be passed into both scripts. <!--- High-level concise summary of changes --> ## Related Issue <!--- Link related issue if applicable --> ## Tests ~~WIP. I need to test that this still loads into vLLM well. I also want to merge #378 first, because it fixes an issue with `launch_vllm.py` arg processing.~~ Tested on the merge commit between this pr and #378. Works as expected. <!--- Please describe in detail how you tested your changes. --> I have filled in: - [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)". - [x] The test plan/results, such as providing test command and pasting the results. - [ ] (Optional) The necessary documentation update. - [x] I (a human) have written or reviewed the code in this pr to the best of my ability. --------- Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
<!-- markdownlint-disable --> PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED. ## Purpose <!--- Why your changes are needed --> vllm-project#353 adds new online/offline training datagen using vLLM's hidden states extraction feature. This pr adds some e2e smoke tests for both workflows. It also fixes an issue in `train.py` that the tests found. ## Description <!--- High-level concise summary of changes --> Add the two tests under: `tests/e2e/vllm/test_offline_training.py` `tests/e2e/vllm/test_online_training.py` Add utils for launching / stopping the vllm server + polling it to see when its ready. Also a util fn for running the prepare_data.py step. Update `launch_vllm.py` so that it uses the python env it is run with, and improve its handling of args. Fix a bug in `train.py` regarding hidden states dtype handling. ## Related Issue <!--- Link related issue if applicable --> ## Tests Run e2e: https://github.com/neuralmagic/llm-compressor-testing/actions/runs/23920324667 <!--- Please describe in detail how you tested your changes. --> I have filled in: - [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)". - [x] The test plan/results, such as providing test command and pasting the results. - [ ] (Optional) The necessary documentation update. - [x] I (a human) have written or reviewed the code in this pr to the best of my ability. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit # Release Notes * **Tests** * Added comprehensive end-to-end tests for offline and online training workflows to validate complete training pipelines. * **Improvements** * Enhanced data preparation with improved hidden states dtype handling throughout the training pipeline. * Refined vLLM server integration and management for more reliable model serving during training. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com> Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
<!-- markdownlint-disable --> PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED. ## Purpose Currently we aren't storing the layer ids in the Eagle3 model configs and instead just match the defaults vllm use. We would instead like to explicitly set these, which will also allow users to use custom layers. <!--- Why your changes are needed --> ## Description Update launch.py and train.py with a `--target-layer-ids` arg. Explicitly add layer ids to eagle3 config, even if they are automatically inferred from num_hidden_layers. Add user warnings to remind users to that custom layer ids must be passed into both scripts. <!--- High-level concise summary of changes --> ## Related Issue <!--- Link related issue if applicable --> ## Tests ~~WIP. I need to test that this still loads into vLLM well. I also want to merge vllm-project#378 first, because it fixes an issue with `launch_vllm.py` arg processing.~~ Tested on the merge commit between this pr and vllm-project#378. Works as expected. <!--- Please describe in detail how you tested your changes. --> I have filled in: - [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)". - [x] The test plan/results, such as providing test command and pasting the results. - [ ] (Optional) The necessary documentation update. - [x] I (a human) have written or reviewed the code in this pr to the best of my ability. --------- Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.
Purpose
#353 adds new online/offline training datagen using vLLM's hidden states extraction feature. This pr adds some e2e smoke tests for both workflows. It also fixes an issue in
train.pythat the tests found.Description
Add the two tests under:
tests/e2e/vllm/test_offline_training.pytests/e2e/vllm/test_online_training.pyAdd utils for launching / stopping the vllm server + polling it to see when its ready. Also a util fn for running the prepare_data.py step.
Update
launch_vllm.pyso that it uses the python env it is run with, and improve its handling of args.Fix a bug in
train.pyregarding hidden states dtype handling.Related Issue
Tests
Run e2e:
https://github.com/neuralmagic/llm-compressor-testing/actions/runs/23920324667
I have filled in:
Summary by CodeRabbit
Release Notes
Tests
Improvements