Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim#30389
Standardise get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim#30389hmellor merged 17 commits intovllm-project:mainfrom
get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim#30389Conversation
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
get_rope to use rope_parameters["partial_rotary_factor"], not rotary_dim
There was a problem hiding this comment.
Code Review
This pull request simplifies the get_rope function by removing the rotary_dim argument and deriving it from head_size and partial_rotary_factor instead. The changes are applied across a large number of model files. The refactoring is well-executed, but I've found several instances where the shared config object is modified in-place. This can lead to unexpected side effects and should be avoided. I've provided suggestions to create copies of rope_parameters before modification to prevent this.
|
Can you resolve the merge conflict? |
|
This pull request has merge conflicts that must be resolved before it can be |
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…n-standard names Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…"]`, not `rotary_dim` (vllm-project#30389) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Ubuntu <mjtaheri68@gmail.com>
…"]`, not `rotary_dim` (vllm-project#30389) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: dsuhinin <suhinin.dmitriy@gmail.com>
Implements #30349 (comment).
Globally applies the fix from #30384.
rotary_dimas a config field only exists in GPT-J and a few custom models (Minimax for example). The Transformers library has standardised onpartial_rotary_factorwhich lives inrope_parameters.This PR removes the
rotary_dimargument fromget_ropeso that there is now only one way to set therotary_dimfor RoPE:rope_parameterswithpartial_rotary_factorspopulatedget_ropedoesrotary_dim = head_dim * partial_rotary_factorinternallyFor the few edge cases that did not set
config.rope_parameters["partial_rotary_factor"]it has been reverse engineered fromconfig.rotary_dim.