Various Transformers v5 fixes #38127
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Code Review
This pull request introduces several changes, including updating aliased module patterns for image processing utilities, handling sliding_window=0 in model configurations by converting it to None, and refactoring Olmo3Config to use the upstream transformers library instead of a custom vLLM implementation. Additionally, a check was added in deepseek_vl2.py to remove kv_lora_rank if its value is None before passing it to DeepseekV2Config. A review comment suggests that modifying the language_config dictionary in-place in deepseek_vl2.py could lead to unexpected side effects and recommends using a copy of the dictionary instead.
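The `sliding_window` handling summarized above can be sketched roughly as follows. This is only an illustration: the helper name `normalize_sliding_window` and the plain-dict config are assumptions, not code from this PR.

```python
from typing import Any


def normalize_sliding_window(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `config` with sliding_window=0 replaced by None.

    Hypothetical helper for illustration; the name and the dict-based
    config are assumptions, not the PR's actual implementation.
    """
    normalized = dict(config)  # shallow copy so the input is left untouched
    if normalized.get("sliding_window") == 0:
        # 0 is treated as "sliding window disabled", which is spelled None
        normalized["sliding_window"] = None
    return normalized
```

A non-zero window size or a missing key passes through unchanged, so the helper only touches the one ambiguous value.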
# remove kv_lora_rank if not specified, passing None is prohibited
if language_config.get("kv_lora_rank") is None:
    language_config.pop("kv_lora_rank", None)
self.text_config = DeepseekV2Config(**language_config)
Modifying the language_config dictionary in-place can lead to unexpected side effects for the caller if they reuse the kwargs dictionary. It's safer to work with a copy of the dictionary to avoid such issues.
-# remove kv_lora_rank if not specified, passing None is prohibited
-if language_config.get("kv_lora_rank") is None:
-    language_config.pop("kv_lora_rank", None)
-self.text_config = DeepseekV2Config(**language_config)
+# remove kv_lora_rank if not specified, passing None is prohibited
+language_config_copy = language_config.copy()
+if language_config_copy.get("kv_lora_rank") is None:
+    language_config_copy.pop("kv_lora_rank", None)
+self.text_config = DeepseekV2Config(**language_config_copy)
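The side effect the reviewer describes can be shown with a small standalone sketch. The helper `build_text_kwargs` and the sample config values are hypothetical and stand in for the constructor logic; the real `DeepseekV2Config` is not used here.

```python
from typing import Any


def build_text_kwargs(language_config: dict[str, Any]) -> dict[str, Any]:
    """Return kwargs with kv_lora_rank dropped when it is None.

    Hypothetical helper for illustration only.
    """
    kwargs = language_config.copy()  # shallow copy; the caller's dict is untouched
    if kwargs.get("kv_lora_rank") is None:
        kwargs.pop("kv_lora_rank", None)
    return kwargs


# Sample values are made up for the demonstration.
caller_config = {"hidden_size": 1024, "kv_lora_rank": None}
text_kwargs = build_text_kwargs(caller_config)
```

After the call, `caller_config` still contains its `kv_lora_rank` entry while `text_kwargs` does not. Note that `dict.copy()` is shallow, so nested values are still shared between the two dictionaries.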
This reverts commit 3c3c084.
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Michel Belleau <michel.belleau@malaiwah.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Nithin Chalapathi <nithin.ch10@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: EricccYang <yangyang4991@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: bhargav-patel-29 <bhargav.patel@tihiitb.org>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: rishitdholakia13 <rishit+github@cohere.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Rishi Puri <riship@nvidia.com>
transformers.image_processing_utils_fast in the offline mode test
0 instead of None which doesn't actually disable sliding window)
DeepSeekVL2Config from passing an invalid value to DeepSeekV2Config (DeepSeekV2 always has MLA but DeepSeekVL2 does not)