Add explicit target-layer-ids handling #379
shanjiaz
left a comment
Approved pending successful tests.
dsikka
left a comment
Do you have a sample config which shows what the fields will look like? Any checkpoint that we end up using to validate the vLLM integration?
I ran the online data generation (for just 1 epoch with 5 samples) with both layers manually set and layers automatically selected. Both result in config.json files like the below (note the I also ran vLLM serve with both model checkpoints and confirmed the correct layers were used (based on the output logs) and the outputs were coherent (despite not training and using non-standard eagle3 layers).

Example config.json

{
  "architectures": [
    "Eagle3DraftModel"
  ],
  "auto_map": {
    "": "config.Eagle3SpeculatorConfig"
  },
  "base_model_ep_plan": null,
  "draft_vocab_size": 32000,
  "dtype": "bfloat16",
  "eagle_aux_hidden_state_layer_ids": [
    1,
    2,
    3
  ],
  "embed_requires_grad": false,
  "has_no_defaults_at_init": false,
  "norm_before_fc": false,
  "norm_before_residual": true,
  "speculators_config": {
    "algorithm": "eagle3",
    "default_proposal_method": "greedy",
    "proposal_methods": [
      {
        "accept_tolerance": 0.0,
        "proposal_type": "greedy",
        "speculative_tokens": 3,
        "verifier_accept_k": 1
      }
    ],
    "verifier": {
      "architectures": [],
      "name_or_path": "Qwen/Qwen3-8B"
    }
  },
  "speculators_model_type": "eagle3",
  "speculators_version": "0.5.0.dev21",
  "target_hidden_size": null,
  "tie_word_embeddings": false,
  "transformer_layer_config": {
    "attention_bias": false,
    "attention_dropout": 0.0,
    "bos_token_id": 1,
    "eos_token_id": 2,
    "head_dim": 128,
    "hidden_act": "silu",
    "hidden_size": 4096,
    "initializer_range": 0.02,
    "intermediate_size": 12288,
    "max_position_embeddings": 40960,
    "mlp_bias": false,
    "model_type": "llama",
    "num_attention_heads": 32,
    "num_hidden_layers": 1,
    "num_key_value_heads": 8,
    "pad_token_id": null,
    "pretraining_tp": 1,
    "rms_norm_eps": 1e-06,
    "rope_parameters": {
      "rope_theta": 10000.0,
      "rope_type": "default"
    },
    "tie_word_embeddings": false,
    "use_cache": true,
    "vocab_size": 151936
  },
  "transformers_version": "5.3.0"
}
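One way to sanity-check a checkpoint like the one above is to load its config.json and confirm that the layer ids were actually written out. The sketch below is only an illustration: `read_aux_layer_ids` is a hypothetical helper, not part of speculators or vLLM, and it runs against a trimmed inline copy of the example config rather than a real checkpoint file.

```python
import json

# Trimmed copy of the example config above; only the fields this check reads.
EXAMPLE_CONFIG = """
{
  "eagle_aux_hidden_state_layer_ids": [1, 2, 3],
  "speculators_model_type": "eagle3",
  "speculators_config": {"verifier": {"name_or_path": "Qwen/Qwen3-8B"}}
}
"""


def read_aux_layer_ids(config: dict) -> list:
    """Hypothetical helper: return the stored layer ids, or raise if absent/invalid."""
    layer_ids = config.get("eagle_aux_hidden_state_layer_ids")
    if layer_ids is None:
        raise ValueError("config does not store eagle_aux_hidden_state_layer_ids")
    for layer_id in layer_ids:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(layer_id, bool) or not isinstance(layer_id, int) or layer_id < 0:
            raise ValueError(f"layer ids must be non-negative integers, got {layer_ids!r}")
    return list(layer_ids)


config = json.loads(EXAMPLE_CONFIG)
layer_ids = read_aux_layer_ids(config)
print(layer_ids)
```

For a real checkpoint, the inline string would be replaced by the contents of the checkpoint's config.json.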
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
## Purpose

Currently we aren't storing the layer ids in the Eagle3 model configs and instead just match the defaults vLLM uses. We would instead like to set these explicitly, which will also allow users to use custom layers.

## Description

Update launch.py and train.py with a `--target-layer-ids` arg. Explicitly add layer ids to the eagle3 config, even if they are automatically inferred from num_hidden_layers. Add user warnings to remind users that custom layer ids must be passed into both scripts.

## Related Issue

## Tests

~~WIP. I need to test that this still loads into vLLM well. I also want to merge vllm-project#378 first, because it fixes an issue with `launch_vllm.py` arg processing.~~

Tested on the merge commit between this PR and vllm-project#378. Works as expected.

I have filled in:

- [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [x] The test plan/results, such as providing test command and pasting the results.
- [ ] (Optional) The necessary documentation update.
- [x] I (a human) have written or reviewed the code in this PR to the best of my ability.

---------

Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
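For readers who want a concrete picture of the behaviour described above, here is a minimal sketch of parsing a `--target-layer-ids` argument and falling back to inferred ids. It is not the code from this PR: `default_target_layer_ids` and `resolve_target_layer_ids` are hypothetical names, and the fallback formula (an early, a middle, and a late layer) is an assumption, since the PR text does not state how the defaults are derived from num_hidden_layers.

```python
import argparse
import warnings


def default_target_layer_ids(num_hidden_layers: int) -> list:
    # ASSUMPTION: an early/middle/late heuristic; the PR does not spell out the formula.
    return [2, num_hidden_layers // 2, num_hidden_layers - 3]


def resolve_target_layer_ids(argv: list, num_hidden_layers: int) -> list:
    """Hypothetical helper: return explicit layer ids, or inferred ones if none are given."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--target-layer-ids", type=int, nargs="+", default=None)
    args = parser.parse_args(argv)

    if args.target_layer_ids is None:
        # Inferred ids are still returned explicitly so they can be written to the config.
        return default_target_layer_ids(num_hidden_layers)

    for layer_id in args.target_layer_ids:
        if not 0 <= layer_id < num_hidden_layers:
            raise ValueError(
                f"layer id {layer_id} is outside [0, {num_hidden_layers - 1}]"
            )

    # Mirrors the reminder described in the PR: both scripts need the same ids.
    warnings.warn(
        "Custom target layer ids were given; pass the same ids to both "
        "launch.py and train.py.",
        UserWarning,
    )
    return list(args.target_layer_ids)


# 36 is only an example layer count for the target model.
print(resolve_target_layer_ids([], num_hidden_layers=36))
print(resolve_target_layer_ids(["--target-layer-ids", "1", "2", "3"], num_hidden_layers=36))
```

Returning the inferred ids rather than `None` matches the stated goal of always writing the layer ids into the config, whether or not the user supplied them.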
Summary by CodeRabbit
New Features
Changes
Renamed CLI option `--layers` to `--target-layer-ids` for improved clarity and consistency across configuration scripts.