fix: skip redundant HfFileSystem().glob() calls in loader.py #14
Changes from all commits
dac1b60
c50591b
30db8f3
b83d57f
1d46c65
54ffa99
6025e9b
a21efff
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -49,7 +49,6 @@ | |
| except: | ||
| # For older versions of huggingface_hub | ||
| from huggingface_hub.utils._token import get_token | ||
| - from huggingface_hub import HfFileSystem | ||
| import importlib.util | ||
| from ..device_type import ( | ||
| is_hip, | ||
|
|
@@ -508,7 +507,7 @@ def from_pretrained( | |
| model_type = model_types | ||
|
|
||
| # New transformers need to check manually. | ||
| - if SUPPORTS_LLAMA32: | ||
| + if SUPPORTS_LLAMA32 and is_model and is_peft: | ||
|
[5/13 reviewers] Pre-existing issue (not introduced by this PR): The flow is:
The
The condition |
||
| # Check if folder exists locally | ||
| if os.path.isdir(model_name): | ||
| exist_adapter_config = os.path.exists( | ||
|
|
@@ -517,14 +516,10 @@ def from_pretrained( | |
| exist_config = os.path.exists(os.path.join(model_name, "config.json")) | ||
| both_exist = exist_adapter_config and exist_config | ||
| else: | ||
| - # Because HfFileSystem assumes linux paths, we need to set the path with forward slashes, even on Windows. | ||
| - files = HfFileSystem(token = token).glob(f"{model_name}/*.json") | ||
| - files = list(os.path.split(x)[-1] for x in files) | ||
| - if ( | ||
| - sum(x == "adapter_config.json" or x == "config.json" for x in files) | ||
| - >= 2 | ||
| - ): | ||
| - both_exist = True | ||
| + # Both AutoConfig and PeftConfig loaded successfully from this | ||
| + # remote repo, so both config.json and adapter_config.json | ||
| + # definitely exist -- no need for an extra HfFileSystem network call. | ||
| + both_exist = True | ||
|
github-code-quality[bot] marked this conversation as resolved.
Fixed
|
||
|
|
||
| if not is_model and not is_peft: | ||
| error = autoconfig_error if autoconfig_error is not None else peft_error | ||
|
|
@@ -1282,7 +1277,7 @@ def from_pretrained( | |
| os.environ["UNSLOTH_DISABLE_STATIC_GENERATION"] = "1" | ||
|
|
||
| # New transformers need to check manually. | ||
| - if SUPPORTS_LLAMA32: | ||
| + if SUPPORTS_LLAMA32 and is_model and is_peft: | ||
|
Similar to the issue in |
||
| # Check if folder exists locally | ||
| if os.path.isdir(model_name): | ||
| exist_adapter_config = os.path.exists( | ||
|
|
@@ -1291,13 +1286,10 @@ def from_pretrained( | |
| exist_config = os.path.exists(os.path.join(model_name, "config.json")) | ||
| both_exist = exist_adapter_config and exist_config | ||
| else: | ||
| - files = HfFileSystem(token = token).glob(f"{model_name}/*.json") | ||
| - files = list(os.path.split(x)[-1] for x in files) | ||
| - if ( | ||
| - sum(x == "adapter_config.json" or x == "config.json" for x in files) | ||
| - >= 2 | ||
| - ): | ||
| - both_exist = True | ||
| + # Both AutoConfig and PeftConfig loaded successfully from this | ||
| + # remote repo, so both config.json and adapter_config.json | ||
| + # definitely exist -- no need for an extra HfFileSystem network call. | ||
| + both_exist = True | ||
|
github-code-quality[bot] marked this conversation as resolved.
Fixed
|
||
|
|
||
| if not is_model and not is_peft: | ||
| error = autoconfig_error if autoconfig_error is not None else peft_error | ||
|
|
||
The optimization to guard the manual check with `is_model and is_peft` is a good improvement for performance. However, there is a logic bug in the surrounding code: the `both_exist` flag set within this block (at line 527) is never checked or used to raise an error for Llama 3.2+ models. This means the conflict detection is currently non-functional for these models.

Furthermore, if `is_model` and `is_peft` are both `True`, it already indicates that both `AutoConfig` and `PeftConfig` were successfully loaded for the same `model_name`, which confirms the conflict. You could potentially avoid the expensive `HfFileSystem().glob()` call entirely by raising the `RuntimeError` immediately when both are `True`, which would also fix the bug and prevent hangs in conflict scenarios.
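
A minimal sketch of what that suggestion could look like, kept separate from `loader.py`. The helper names `both_configs_exist` and `raise_on_conflict`, and the error message wording, are hypothetical and are not taken from the PR. Only the local-folder check and the "both loads succeeded, so both files exist" reasoning come from the diff above.

```python
import os


def both_configs_exist(model_name: str, is_model: bool, is_peft: bool) -> bool:
    """Hypothetical helper: report whether config.json and adapter_config.json both exist.

    is_model / is_peft are assumed to mean that AutoConfig and PeftConfig,
    respectively, were loaded successfully for model_name.
    """
    # Guard from the diff: nothing to check unless both loads succeeded.
    if not (is_model and is_peft):
        return False

    # Local folder: inspect the directory directly, as the existing code does.
    if os.path.isdir(model_name):
        exist_adapter_config = os.path.exists(
            os.path.join(model_name, "adapter_config.json")
        )
        exist_config = os.path.exists(os.path.join(model_name, "config.json"))
        return exist_adapter_config and exist_config

    # Remote repo: both configs were already loaded from it, so both files
    # exist and no extra HfFileSystem().glob() network call is needed.
    return True


def raise_on_conflict(model_name: str, is_model: bool, is_peft: bool) -> None:
    """Hypothetical helper: act on the flag instead of leaving it unused."""
    if both_configs_exist(model_name, is_model, is_peft):
        # The message text is an assumption, not the project's actual wording.
        raise RuntimeError(
            f"{model_name} contains both config.json and adapter_config.json; "
            "it is ambiguous whether to load a base model or a PEFT adapter."
        )
```

Consuming the flag right where it is computed is what would address the "never checked" concern raised above. Whether the real loader should raise at this point or defer to later logic is a decision for the maintainers.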