Conversation
Code Review
This pull request introduces several patches for Mistral model configuration handling. It silences some warnings from the Transformers library, improves data type casting for RoPE parameters, and refactors the data type inference to be more robust. The changes are generally good, but I've found a critical thread-safety issue with how global constants are being modified. My review includes a suggestion to fix this potential race condition.
```python
@contextmanager
def _mistral_patch_hf_hub_constants() -> Iterator[None]:
    hf_safetensors_single_file = constants.SAFETENSORS_SINGLE_FILE
    hf_safetensors_index_file = constants.SAFETENSORS_INDEX_FILE
    constants.SAFETENSORS_SINGLE_FILE = "consolidated.safetensors"
    constants.SAFETENSORS_INDEX_FILE = "consolidated.safetensors.index.json"
    try:
        yield
    finally:
        constants.SAFETENSORS_SINGLE_FILE = hf_safetensors_single_file
        constants.SAFETENSORS_INDEX_FILE = hf_safetensors_index_file
```
The modification of global constants in huggingface_hub.constants is not thread-safe. In a scenario where multiple models are loaded concurrently in different threads (e.g., one Mistral model and one standard Hugging Face model), this monkey-patching can create a race condition. One thread might be expecting the default constant values while another has temporarily changed them, potentially leading to FileNotFoundError or other unpredictable behavior during model loading. To prevent this, the critical section where constants are modified should be protected by a lock.
Please also add `import threading` at the top of the file.
```python
_mistral_patch_lock = threading.Lock()


@contextmanager
def _mistral_patch_hf_hub_constants() -> Iterator[None]:
    with _mistral_patch_lock:
        hf_safetensors_single_file = constants.SAFETENSORS_SINGLE_FILE
        hf_safetensors_index_file = constants.SAFETENSORS_INDEX_FILE
        constants.SAFETENSORS_SINGLE_FILE = "consolidated.safetensors"
        constants.SAFETENSORS_INDEX_FILE = "consolidated.safetensors.index.json"
        try:
            yield
        finally:
            constants.SAFETENSORS_SINGLE_FILE = hf_safetensors_single_file
            constants.SAFETENSORS_INDEX_FILE = hf_safetensors_index_file
```
To clarify, are these warnings appearing in v4 or v5? Also, would it not be better to add
@hmellor it only occurs for v5.
vLLM uses arguments slightly differently than HF, which requires adding this key that is not needed in HF, so I'm not sure it would make sense. Either way, for now it's not in HF, so silencing the warnings would be cool as a temporary fix if that's OK with you.
Ok, would a better solution be to add
Yeah, we can do it like that if you prefer. So for this current PR, do you want me to discard only this warning filter but keep the casting?
Yeah I think that's best. It hardens the conversion in vLLM and ensures that you don't get the unexpected key warning anywhere that Mistral configs are loaded with Transformers. |
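As a temporary measure while the Transformers fix lands, scoped suppression of one specific warning can be done with the standard `warnings` machinery. The sketch below is a hypothetical helper (`load_config_quietly` and the warning text are stand-ins, not the actual Transformers message), showing how a targeted filter silences only the matching warning and only within the block:

```python
import warnings


def load_config_quietly() -> str:
    # Hypothetical helper: the warning text below is a stand-in for the
    # "unexpected key" warning discussed in this thread.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore",
            message=r".*unexpected key.*",
            category=UserWarning,
        )
        # Simulate the library emitting the warning during config loading;
        # the filter above swallows it, other warnings still surface.
        warnings.warn("unexpected key 'apply_yarn_scaling' in config", UserWarning)
        return "config"
```

The `catch_warnings()` context restores the previous filter state on exit, so the suppression cannot leak into unrelated model loads.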
Signed-off-by: juliendenize <julien.denize@mistral.ai>
Made the Transformers PR here :)
Could we instead pass the keys to ignore into the validation call (`vllm/vllm/transformers_utils/config.py`, line 370 in bf9a185)? Something like:

```python
ignore_keys = set()
if config_format == "mistral" and config.rope_parameters.type == "yarn":
    ignore_keys.add("apply_yarn_scaling")
config.validate_rope(ignore_keys=ignore_keys)
```
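For context on the suggestion above, here is a minimal sketch of what a validator accepting `ignore_keys` could look like. Both `validate_rope` and `VALID_YARN_KEYS` are hypothetical names for illustration; the real vLLM/Transformers signatures and key sets may differ:

```python
from typing import Optional

# Hypothetical set of keys the validator accepts for YaRN-scaled RoPE.
VALID_YARN_KEYS = {"rope_type", "factor", "original_max_position_embeddings"}


def validate_rope(rope_parameters: dict, ignore_keys: Optional[set] = None) -> None:
    """Reject unknown RoPE keys unless they are explicitly ignored."""
    ignore_keys = ignore_keys or set()
    unknown = set(rope_parameters) - VALID_YARN_KEYS - ignore_keys
    if unknown:
        raise ValueError(f"Unexpected RoPE keys: {sorted(unknown)}")


# A Mistral-style config carries the extra key; ignoring it lets
# validation pass without silencing warnings globally.
params = {"rope_type": "yarn", "factor": 4.0, "apply_yarn_scaling": True}
validate_rope(params, ignore_keys={"apply_yarn_scaling"})
```

Note the `ignore_keys = set()` initialization: `{}` would create a dict, which has no `.add` method.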
Would have been better indeed, I'll do a follow-up PR once I have time.
I've implemented this in #37292
Signed-off-by: juliendenize <julien.denize@mistral.ai> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Signed-off-by: juliendenize <julien.denize@mistral.ai> Signed-off-by: Vinay Damodaran <vrdn@hey.com>
Signed-off-by: juliendenize <julien.denize@mistral.ai> Signed-off-by: EricccYang <yangyang4991@gmail.com>
Purpose
This PR does the following:
Test Plan
Checked by serving a Mistral model
Test Result
serving worked
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.