[WIP] Run eagle with full cudagraph #20190
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1860,7 +1860,7 @@ def maybe_randomize_inputs(self, input_ids: torch.Tensor): | |
| Randomize input_ids if VLLM_RANDOMIZE_DP_DUMMY_INPUTS is set. | ||
| This is to help balance expert-selection | ||
| - during profile_run | ||
| - during DP rank dummy run | ||
| - during DP rank dummy run | ||
| """ | ||
| dp_size = self.vllm_config.parallel_config.data_parallel_size | ||
| randomize_inputs = envs.VLLM_RANDOMIZE_DP_DUMMY_INPUTS and dp_size > 1 | ||
|
|
@@ -1982,7 +1982,7 @@ def _dummy_run( | |
|
|
||
| if self.speculative_config and self.speculative_config.use_eagle(): | ||
| assert isinstance(self.drafter, EagleProposer) | ||
| - self.drafter.dummy_run(num_tokens) | ||
| + self.drafter.dummy_run(num_tokens, attn_metadata) | ||
|
||
|
|
||
| logit_indices = np.cumsum(num_scheduled_tokens) - 1 | ||
| return hidden_states, hidden_states[logit_indices] | ||
|
|
||
The direct call to `json.loads` can cause the script to crash with a `json.JSONDecodeError` if an invalid JSON string is passed to the `--compilation_config` argument. Consider adding a try-except block to handle potential parsing errors gracefully.
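One way to follow this suggestion is sketched below. It is not the code from this PR: the helper names `parse_json_arg` and `build_parser` are hypothetical, and wiring the check in through argparse's `type=` hook is an assumption about how the script defines `--compilation_config`. The idea is only to turn a `json.JSONDecodeError` into a normal command-line usage error instead of an uncaught traceback.

```python
import argparse
import json
from typing import Any


def parse_json_arg(raw: str) -> Any:
    """Parse a JSON string from the command line.

    Hypothetical helper: converts a JSON decoding failure into an
    argparse.ArgumentTypeError so argparse can report it as a usage error.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        raise argparse.ArgumentTypeError(
            f"invalid JSON ({exc.msg} at line {exc.lineno}, column {exc.colno})"
        ) from exc


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with only the flag discussed in the review comment."""
    parser = argparse.ArgumentParser()
    # Assumption: the flag takes a single JSON string; the real script may
    # define more options or a different default.
    parser.add_argument("--compilation_config", type=parse_json_arg, default=None)
    return parser
```

With this in place, `build_parser().parse_args(["--compilation_config", "{bad"])` exits with argparse's usual usage message rather than a stack trace. If the script calls `json.loads` after parsing instead, the same `try`/`except json.JSONDecodeError` can wrap that call directly.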