Skip to content

Add explicit target-layer-ids handling#379

Merged
fynnsu merged 2 commits into
mainfrom
target_layer_ids
Apr 8, 2026
Merged

Add explicit target-layer-ids handling#379
fynnsu merged 2 commits into
mainfrom
target_layer_ids

Conversation

@fynnsu
Copy link
Copy Markdown
Collaborator

@fynnsu fynnsu commented Apr 3, 2026

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT THE BOTTOM) HAVE BEEN CONSIDERED.

Purpose

Currently we aren't storing the layer ids in the Eagle3 model configs and instead just match the defaults vllm use. We would instead like to explicitly set these, which will also allow users to use custom layers.

Description

Update launch.py and train.py with a --target-layer-ids arg. Explicitly add layer ids to eagle3 config, even if they are automatically inferred from num_hidden_layers.

Add user warnings to remind users to that custom layer ids must be passed into both scripts.

Related Issue

Tests

WIP. I need to test that this still loads into vLLM well. I also want to merge #378 first, because it fixes an issue with launch_vllm.py arg processing.

Tested on the merge commit between this pr and #378. Works as expected.

I have filled in:

  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan/results, such as providing test command and pasting the results.
  • (Optional) The necessary documentation update.
  • I (a human) have written or reviewed the code in this pr to the best of my ability.

Summary by CodeRabbit

  • New Features

    • Automatic layer ID derivation: When target layer IDs are not explicitly provided, they are automatically derived from your verifier model configuration.
    • Configuration synchronization warnings to ensure consistency across launch and training configurations.
  • Changes

    • CLI parameter renamed from --layers to --target-layer-ids for improved clarity and consistency across configuration scripts.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 3, 2026

📦 Build Artifacts Available
The build artifacts (`.whl` and `.tar.gz`) have been successfully generated and are available for download: https://github.com/vllm-project/speculators/actions/runs/24141073851/artifacts/6329754963.
They will be retained for up to 30 days.
Commit: b76c71b

@fynnsu fynnsu marked this pull request as ready for review April 6, 2026 15:28
Copy link
Copy Markdown
Collaborator

@shanjiaz shanjiaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

approved pending successful tests.

Copy link
Copy Markdown
Collaborator

@dsikka dsikka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a sample config which shows what the fields will look like? Any checkpoint that we end up using to validate the vLLM integration

@fynnsu
Copy link
Copy Markdown
Collaborator Author

fynnsu commented Apr 7, 2026

Do you have a sample config which shows what the fields will look like? Any checkpoint that we end up using to validate the vLLM integration

I ran the online data generation (for just 1 epoch with 5 samples) with both layers manually set and layers automatically selected. Both result in config.json files like the below (note the eagle_aux_hidden_state_layer_ids field).

I also ran vLLM serve with both model checkpoints and confirmed the correct layers were used (based on the output logs) and the outputs were coherent (despite not training and using non-standard eagle3 layers).

Example config.json
{
  "architectures": [
    "Eagle3DraftModel"
  ],
  "auto_map": {
    "": "config.Eagle3SpeculatorConfig"
  },
  "base_model_ep_plan": null,
  "draft_vocab_size": 32000,
  "dtype": "bfloat16",
  "eagle_aux_hidden_state_layer_ids": [
    1,
    2,
    3
  ],
  "embed_requires_grad": false,
  "has_no_defaults_at_init": false,
  "norm_before_fc": false,
  "norm_before_residual": true,
  "speculators_config": {
    "algorithm": "eagle3",
    "default_proposal_method": "greedy",
    "proposal_methods": [
      {
        "accept_tolerance": 0.0,
        "proposal_type": "greedy",
        "speculative_tokens": 3,
        "verifier_accept_k": 1
      }
    ],
    "verifier": {
      "architectures": [],
      "name_or_path": "Qwen/Qwen3-8B"
    }
  },
  "speculators_model_type": "eagle3",
  "speculators_version": "0.5.0.dev21",
  "target_hidden_size": null,
  "tie_word_embeddings": false,
  "transformer_layer_config": {
    "attention_bias": false,
    "attention_dropout": 0.0,
    "bos_token_id": 1,
    "eos_token_id": 2,
    "head_dim": 128,
    "hidden_act": "silu",
    "hidden_size": 4096,
    "initializer_range": 0.02,
    "intermediate_size": 12288,
    "max_position_embeddings": 40960,
    "mlp_bias": false,
    "model_type": "llama",
    "num_attention_heads": 32,
    "num_hidden_layers": 1,
    "num_key_value_heads": 8,
    "pad_token_id": null,
    "pretraining_tp": 1,
    "rms_norm_eps": 1e-06,
    "rope_parameters": {
      "rope_theta": 10000.0,
      "rope_type": "default"
    },
    "tie_word_embeddings": false,
    "use_cache": true,
    "vocab_size": 151936
  },
  "transformers_version": "5.3.0"
}

fynnsu added 2 commits April 8, 2026 14:33
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
@fynnsu fynnsu force-pushed the target_layer_ids branch from 4d5aa58 to b76c71b Compare April 8, 2026 14:34
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 8, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

Renamed CLI option --layers to --target-layer-ids in the launch script and added the same argument to the training script. Enhanced Eagle3DraftModel.from_training_args to auto-derive target layer IDs from the verifier model config when not explicitly provided. Updated test fixtures to use concrete model paths instead of placeholders.

Changes

Cohort / File(s) Summary
CLI Argument Updates
scripts/launch_vllm.py, scripts/train.py
Renamed --layers to --target-layer-ids in launch script; added new --target-layer-ids argument to training script. Both accept space-separated integers. Launch script now emits warning when argument is provided.
Model Configuration Logic
src/speculators/models/eagle3/core.py
Enhanced Eagle3DraftModel.from_training_args to derive target_layer_ids from verifier model config (as [2, num_hidden_layers // 2, num_hidden_layers - 3]) when not user-supplied. Wires computed/provided layer IDs into Eagle3SpeculatorConfig.
Test Fixtures
tests/unit/train/test_setup_model.py
Updated verifier_name_or_path parameter from "dummy" to "nm-testing/tinysmokellama-3.2" in test invocation.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Target layers, oh how they dance,
Once named --layers, now they prance,
With --target-layer-ids so grand,
Auto-derived across the land,
Config flows from script to core—
The eagle soars forevermore! 🦅

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and directly summarizes the main change: adding explicit handling for target-layer-ids across multiple files and scripts.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch target_layer_ids

Comment @coderabbitai help to get the list of available commands and usage tips.

@fynnsu fynnsu enabled auto-merge (squash) April 8, 2026 14:34
@fynnsu fynnsu merged commit d37a6c3 into main Apr 8, 2026
13 of 15 checks passed
@fynnsu fynnsu deleted the target_layer_ids branch April 8, 2026 14:39
my-other-github-account pushed a commit to my-other-github-account/speculators that referenced this pull request May 15, 2026
<!-- markdownlint-disable -->

PLEASE FILL IN THE PR DESCRIPTION HERE ENSURING ALL CHECKLIST ITEMS (AT
THE BOTTOM) HAVE BEEN CONSIDERED.

## Purpose

Currently we aren't storing the layer ids in the Eagle3 model configs
and instead just match the defaults vllm use. We would instead like to
explicitly set these, which will also allow users to use custom layers.


<!--- Why your changes are needed -->

## Description

Update launch.py and train.py with a `--target-layer-ids` arg.
Explicitly add layer ids to eagle3 config, even if they are
automatically inferred from num_hidden_layers.

Add user warnings to remind users to that custom layer ids must be
passed into both scripts.

<!--- High-level concise summary of changes -->

## Related Issue

<!--- Link related issue if applicable -->

## Tests

~~WIP. I need to test that this still loads into vLLM well. I also want
to merge vllm-project#378 first, because it fixes an issue with `launch_vllm.py` arg
processing.~~

Tested on the merge commit between this pr and vllm-project#378. Works as expected. 

<!--- Please describe in detail how you tested your changes. -->

I have filled in:

- [x] The purpose of the PR, such as "Fix some issue (link existing
issues this PR will resolve)".
- [x] The test plan/results, such as providing test command and pasting
the results.
- [ ] (Optional) The necessary documentation update.
- [x] I (a human) have written or reviewed the code in this pr to the
best of my ability.

---------

Signed-off-by: Fynn Schmitt-Ulms <fschmitt@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants