Cursor Bugbot has reviewed your changes and found 1 potential issue.
In _format_entry_not_found, the label appeared mid-sentence ('File not
found in Dataset ...') but was capitalized. Apply the rule consistently:
capitalize when starting a sentence, lowercase otherwise.
Update tests to match the corrected casing.
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
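The casing rule in the commit message above can be sketched as a small helper. This is an illustration only: `format_repo_label` and its `sentence_start` flag are hypothetical names, not the function used in the PR.

```python
from typing import Optional


def format_repo_label(repo_type: Optional[str], sentence_start: bool) -> str:
    """Return a label such as 'Dataset' or 'dataset' (hypothetical helper)."""
    known = {"model": "model", "dataset": "dataset", "space": "space"}
    label = known.get(repo_type or "", "repository")
    # Capitalize only when the label opens the sentence.
    return label.capitalize() if sentence_start else label


# Label at the start of a sentence: capitalized.
print(f"{format_repo_label('dataset', True)} 'user/data' not found.")
# Label mid-sentence: lowercase.
print(f"File not found in {format_repo_label('dataset', False)} 'user/data'.")
```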
Use TypeVar to make _format() return the specific HfHubHTTPError subclass type, so mypy recognizes attributes like repo_type, repo_id, and bucket_id on the returned error objects.
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
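A minimal sketch of the TypeVar pattern this commit describes. The classes and `make_error` below are stand-ins invented for illustration; the real `_format()` signature is not shown in this conversation.

```python
from typing import Optional, Type, TypeVar


class HubError(Exception):
    """Stand-in for HfHubHTTPError (assumption, not the real class)."""


class RepoNotFound(HubError):
    """Stand-in subclass carrying optional context attributes."""

    repo_id: Optional[str] = None
    repo_type: Optional[str] = None


# Bound TypeVar: the factory returns the same subclass it was given.
ErrT = TypeVar("ErrT", bound=HubError)


def make_error(error_type: Type[ErrT], message: str) -> ErrT:
    return error_type(message)


err = make_error(RepoNotFound, "404 Not Found")
# A type checker sees `err` as RepoNotFound, so this attribute is accepted.
err.repo_id = "user/model"
```

Without the TypeVar, a return annotation of `HubError` would make the checker reject `err.repo_id`, because the base class does not declare that attribute.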
…r_status
Mypy narrows the type of 'err' on the first assignment and then rejects reassignments to different subclass types in later elif branches. Use unique names (revision_err, entry_err, gated_err, bucket_err, repo_err) to avoid cross-branch type conflicts.
Co-authored-by: Lucain <Wauplin@users.noreply.github.com>
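The naming fix can be illustrated as follows. The error classes and `classify` are hypothetical; only the idea of one variable name per branch comes from the commit message.

```python
from typing import Optional


class BaseErr(Exception):
    """Stand-in base error (assumption)."""


class RevisionErr(BaseErr):
    revision: Optional[str] = None


class GatedErr(BaseErr):
    repo_id: Optional[str] = None


def classify(code: str) -> BaseErr:
    # Reusing a single name `err` in both branches would make a type checker
    # infer RevisionErr from the first assignment and reject the second one.
    if code == "RevisionNotFound":
        revision_err = RevisionErr("revision not found")
        revision_err.revision = "main"
        return revision_err
    elif code == "GatedRepo":
        gated_err = GatedErr("gated repo")
        gated_err.repo_id = "user/model"
        return gated_err
    return BaseErr("unknown error")
```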
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Wauplin commented Mar 6, 2026
Comment on lines +174 to +180
_REPO_ID_FROM_URL_REGEX = re.compile(r"^https?://[^/]+/api/(models|datasets|spaces)/([^/]+)(?:/([^/]+))?")

# Regex to extract bucket_id (namespace/name) from bucket API URLs.
_BUCKET_ID_FROM_URL_REGEX = re.compile(r"^https?://[^/]+/api/buckets/([^/]+/[^/]+)")

# Sub-paths that follow a repo_id in API URLs (not part of the repo name).
_REPO_URL_SUBPATHS = {"resolve", "tree", "blob", "raw", "refs", "commit", "discussions", "settings", "revision"}
Contributor
Author
Note: not 100% bullet-proof but parsing doesn't have to be perfect (it's just a convenience field for better errors)
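One way these regexes and the sub-path set could be combined is sketched below. The patterns are copied from the diff above, but `parse_repo_info` and `parse_bucket_id` are illustrative functions written for this example; the PR's own `_parse_repo_info_from_url` and `_parse_bucket_id_from_url` are not shown here and may differ.

```python
import re
from typing import Optional, Tuple

REPO_RE = re.compile(r"^https?://[^/]+/api/(models|datasets|spaces)/([^/]+)(?:/([^/]+))?")
BUCKET_RE = re.compile(r"^https?://[^/]+/api/buckets/([^/]+/[^/]+)")
SUBPATHS = {"resolve", "tree", "blob", "raw", "refs", "commit", "discussions", "settings", "revision"}


def parse_repo_info(url: str) -> Tuple[Optional[str], Optional[str]]:
    """Best-effort (repo_type, repo_id) extraction; (None, None) if no match."""
    match = REPO_RE.match(url)
    if match is None:
        return None, None
    plural_type, first, second = match.groups()
    repo_type = plural_type[:-1]  # "models" -> "model"
    # A second segment that is a known sub-path is not part of the repo name.
    if second is None or second in SUBPATHS:
        return repo_type, first
    return repo_type, f"{first}/{second}"


def parse_bucket_id(url: str) -> Optional[str]:
    """Best-effort "namespace/name" extraction from a bucket API URL."""
    match = BUCKET_RE.match(url)
    return match.group(1) if match else None


print(parse_repo_info("https://huggingface.co/api/datasets/user/data"))
print(parse_repo_info("https://huggingface.co/api/models/gpt2/tree/main"))
print(parse_bucket_id("https://huggingface.co/api/buckets/ns/name/files"))
```

The imperfection mentioned in the comment shows up in this sketch too: a repository literally named `settings` under `user` would be reported as `user`, which is acceptable for a convenience field.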
hanouticelina approved these changes Mar 6, 2026
Contributor hanouticelina left a comment
Thank you! This is a real and very nice UX improvement.
yg7445 added a commit to yg7445/huggingface_hub that referenced this pull request Apr 10, 2026
Commit 098091f ("huggingface#3889") changed hf_raise_for_status() from inline raises to storing exceptions in local variables before raising:

    entry_err = _format(RemoteEntryNotFoundError, message, response)
    entry_err.repo_type = repo_type
    raise entry_err from e

This creates a CPython reference cycle: entry_err.__cause__ -> e, and e.__traceback__ -> frame -> f_locals['entry_err'] -> entry_err. The cycle prevents the exception from being freed when except blocks exit.

When this exception propagates through callers (e.g. transformers' cached_files -> LLM.__init__), the traceback chain holds a reference to `self`, preventing refcount-based cleanup. In vllm, this means `del llm` doesn't trigger the weakref finalizer that sends SIGTERM to the EngineCore subprocess, so GPU memory is never released.

Fix by moving repo_type/repo_id/bucket_id assignment into helper functions (_format_with_repo_info, _format_with_bucket_info) so the exception is never stored as a local in hf_raise_for_status.

Co-authored-by: Claude <noreply@anthropic.com>
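The cycle described in this commit message can be reproduced in isolation. The sketch below is a simplified model, not the library code: `Resource`, `HighLevelError`, `build_error`, and both `raise_*` functions are invented for the demonstration, and the behavior relies on CPython's reference counting.

```python
import gc
import weakref
from typing import Callable, Optional


class Resource:
    """Stand-in for a caller-side object that should be freed promptly."""


class HighLevelError(Exception):
    detail: Optional[str] = None


def build_error(message: str) -> HighLevelError:
    # The helper's frame never appears in a traceback, so its local is harmless.
    err = HighLevelError(message)
    err.detail = "extra context"
    return err


def raise_with_local() -> None:
    try:
        raise ValueError("low-level failure")
    except ValueError as e:
        err = build_error("high-level failure")  # stays in this frame's locals
        raise err from e  # err -> e -> traceback -> this frame -> err


def raise_via_helper() -> None:
    try:
        raise ValueError("low-level failure")
    except ValueError as e:
        raise build_error("high-level failure") from e  # no local holds the error


def freed_without_gc(raiser: Callable[[], None]) -> bool:
    """True if the caller's Resource is freed by refcounting alone."""

    def caller() -> "weakref.ref[Resource]":
        resource = Resource()
        ref = weakref.ref(resource)
        try:
            raiser()
        except HighLevelError:
            pass
        return ref

    gc.disable()  # keep the cyclic collector from hiding the effect
    try:
        return caller()() is None
    finally:
        gc.enable()
        gc.collect()


print(freed_without_gc(raise_with_local))
print(freed_without_gc(raise_via_helper))
```

On CPython the first call is expected to report that the resource is still alive (the cycle keeps the caller's frame reachable through the traceback), while the helper-based variant lets it be freed as soon as the caller returns.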
Add repo_id, repo_type, bucket_id, etc. wherever possible when an error is raised in the CLI and printed gracefully.
Mostly human generated except the tests.
Note
Medium Risk
Touches hf_raise_for_status and error class definitions used across the library, so mistakes in URL parsing/attribute assignment could subtly change raised error details. The behavioral changes are mostly additive (better messages/metadata) and are covered by new unit tests.

Overview
Improves CLI-facing error output by replacing generic lambda formatters with dedicated helpers that include repo_type, repo_id, bucket_id, and (for missing entries) the request URL when available.

Enriches hf_raise_for_status by parsing repo/bucket identifiers from request URLs and attaching them to raised RepositoryNotFoundError, GatedRepoError, RevisionNotFoundError, RemoteEntryNotFoundError, and BucketNotFoundError instances; also adds these optional attributes to the corresponding error classes.

Adds focused tests for the new CLI formatters and for URL parsing helpers (_parse_repo_info_from_url, _parse_bucket_id_from_url) to validate the new messaging/context behavior.

Written by Cursor Bugbot for commit 9911947. This will update automatically on new commits.
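As a rough illustration of a dedicated formatter replacing a generic lambda, the sketch below builds a message from whatever context the error carries. `RepoNotFound`, `format_repo_not_found`, and the message wording are assumptions made for this example, not the CLI's actual helpers or text.

```python
from typing import Optional


class RepoNotFound(Exception):
    """Hypothetical error carrying optional context attributes."""

    def __init__(self, message: str, repo_type: Optional[str] = None, repo_id: Optional[str] = None) -> None:
        super().__init__(message)
        self.repo_type = repo_type
        self.repo_id = repo_id


def format_repo_not_found(error: Exception) -> str:
    # Context is optional, so read it defensively and fall back gracefully.
    repo_type = getattr(error, "repo_type", None)
    repo_id = getattr(error, "repo_id", None)
    label = (repo_type or "repository").capitalize()
    if repo_id:
        return f"{label} '{repo_id}' not found. Check the ID and your access rights."
    return "Repository not found. Check the ID and your access rights."


print(format_repo_not_found(RepoNotFound("404", repo_type="dataset", repo_id="user/data")))
print(format_repo_not_found(ValueError("no context attached")))
```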