-
Notifications
You must be signed in to change notification settings - Fork 675
implemented auth_check
#2497
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implemented auth_check
#2497
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @cjfghk5697, thank you for this PR 🤗 I left few comments, otherwise everything looks good to me. The indentation fix should resolve both the documentation build and the code quality check in the CI pipelines
src/huggingface_hub/hf_api.py
Outdated
If the repository is gated or does not exist, the respective error is raised. | ||
""" | ||
headers = self._build_hf_headers(token=token) | ||
path = f"{self.endpoint}/api/datasets/{repo_id}/auth-check" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation only handles the datasets
repo type. It should handle all repo types supported by the Hugging Face Hub (i.e. datasets
, models
and spaces
). You can add a repo_type
argument to the method.
path = f"{self.endpoint}/api/datasets/{repo_id}/auth-check" | |
if repo_type is None: | |
repo_type = constants.REPO_TYPE_MODEL | |
if repo_type not in constants.REPO_TYPES: | |
raise ValueError( | |
f"Invalid repo type, must be one of {constants.REPO_TYPES}" | |
) | |
path = f"{self.endpoint}/api/{repo_type}s/{repo_id}/auth-check" |
src/huggingface_hub/hf_api.py
Outdated
@@ -9525,6 +9529,51 @@ def list_user_following(self, username: str) -> Iterable[User]: | |||
for followed_user in r.json(): | |||
yield User(**followed_user) | |||
|
|||
def auth_check(self, repo_id: str, token: Union[bool, str, None] = None) -> None: | |||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small indentation fix needed on this line
src/huggingface_hub/hf_api.py
Outdated
@@ -9525,6 +9529,51 @@ def list_user_following(self, username: str) -> Iterable[User]: | |||
for followed_user in r.json(): | |||
yield User(**followed_user) | |||
|
|||
def auth_check(self, repo_id: str, token: Union[bool, str, None] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more convenient if the method returns True
when the user has access. This change would allow users to use the boolean return in conditional statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While a boolean value is convenient to use in a if/else
statement, it will nonetheless be hiding some information to the end user. For now there are 3 cases:
- passing correctly
RepositoryNotFoundError
error if private or not existingGatedRepoError
error if exists but wrong permission
The end user will eventually want to deal with these errors differently depending on their use cases. Letting the correct error be raised makes it flexible.
However what would be nice is to document how to use the method:
from huggingface_hub import auth_check
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError
try:
auth_check(repo_id, repo_type=repo_type)
except GatedRepoError:
# Handle gated repo
...
except RepositoryNotFoundError:
# Handle missing repo
...
This snippet can be added as an example in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, auth_check
is for more advanced usage of the lib'. For power users it's fine to have slightly more verbose boilerplate code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree!
tests/test_hf_api.py
Outdated
|
||
@patch.object(HfApi, "auth_check", return_value=None) | ||
def test_auth_check_success(self, mock_auth_check): | ||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If auth_check
raises any exception, the test will automatically fail without needing self.fail()
, The try/except block with self.fail()
is redundant in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the PR @cjfghk5697! Agree with most comments @hanouticelina above :) I've added a few comment especially related to how we should test this feature. Let us know if you have any question!
src/huggingface_hub/hf_api.py
Outdated
Example: | ||
>>> api = HfApi(token="your_token") | ||
>>> api.auth_check(repo_id="user/my-cool-model") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example: | |
>>> api = HfApi(token="your_token") | |
>>> api.auth_check(repo_id="user/my-cool-model") | |
Example: | |
```py | |
>>> from huggingface_hub import HfApi | |
>>> api = HfApi(token="your_token") | |
>>> api.auth_check(repo_id="user/my-cool-model") | |
``` |
src/huggingface_hub/hf_api.py
Outdated
@@ -9525,6 +9529,51 @@ def list_user_following(self, username: str) -> Iterable[User]: | |||
for followed_user in r.json(): | |||
yield User(**followed_user) | |||
|
|||
def auth_check(self, repo_id: str, token: Union[bool, str, None] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While a boolean value is convenient to use in a if/else
statement, it will nonetheless be hiding some information to the end user. For now there are 3 cases:
- passing correctly
RepositoryNotFoundError
error if private or not existingGatedRepoError
error if exists but wrong permission
The end user will eventually want to deal with these errors differently depending on their use cases. Letting the correct error be raised makes it flexible.
However what would be nice is to document how to use the method:
from huggingface_hub import auth_check
from huggingface_hub.utils import GatedRepoError, RepositoryNotFoundError
try:
auth_check(repo_id, repo_type=repo_type)
except GatedRepoError:
# Handle gated repo
...
except RepositoryNotFoundError:
# Handle missing repo
...
This snippet can be added as an example in the docstring.
src/huggingface_hub/hf_api.py
Outdated
@@ -9525,6 +9529,51 @@ def list_user_following(self, username: str) -> Iterable[User]: | |||
for followed_user in r.json(): | |||
yield User(**followed_user) | |||
|
|||
def auth_check(self, repo_id: str, token: Union[bool, str, None] = None) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be honest, auth_check
is for more advanced usage of the lib'. For power users it's fine to have slightly more verbose boilerplate code.
@@ -4239,3 +4231,27 @@ def test_upload_large_folder(self, repo_url: RepoUrl) -> None: | |||
for j in range(N_FILES_PER_FOLDER): | |||
assert f"subfolder_{i}/file_lfs_{i}_{j}.bin" in uploaded_files | |||
assert f"subfolder_{i}/file_regular_{i}_{j}.txt" in uploaded_files | |||
|
|||
|
|||
class TestHfApiAuthCheck(HfApiCommonTest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding tests as well. Mocked tests are useful in many cases but for the core HfApi
features, we want to test the behavior "for real" on the staging environment. Here is how you can do that!
class TestHfApiAuthCheck(HfApiCommonTest):
@use_tmp_repo(repo_type="dataset")
def test_auth_check_success(self, repo_url: RepoUrl) -> None:
self._api.auth_check(repo_id=repo_url.repo_id, repo_type=repo_url.repo_type)
def test_auth_check_repo_missing(self) -> None:
with self.assertRaises(RepositoryNotFoundError):
self._api.auth_check(repo_id="username/missing_repo_id")
In the first case, you test what happens if the repo exists (using @use_tmp_repo
decorator). In the second case, you test what happens if the repo is missing (with a fake url). Finally, you'll need a last case where you create a new repo + set it as gated (as done here) + try auth_check
with a different token. Since a user has always access to its own repo, no matter if it's gated, you need to create a repo with user A and then test access with user B. You can use TOKEN
and OTHER_TOKEN
for that.
@Wauplin @hanouticelina Thank you so much for your reviews! I've refactored the code based on your feedback. Could you please review it again?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @cjfghk5697, thanks for the refactoring :) I've left a comment related to the return type of auth_check
, otherwise everything else looks good to me
src/huggingface_hub/hf_api.py
Outdated
@@ -9525,6 +9529,83 @@ def list_user_following(self, username: str) -> Iterable[User]: | |||
for followed_user in r.json(): | |||
yield User(**followed_user) | |||
|
|||
def auth_check(self, repo_id: str, repo_type: Optional[str] = None, token: Union[bool, str, None] = None) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed by @Wauplin here, using a boolean for auth_check combines different error types into a single 'False' result. Instead, allowing specific exceptions (GatedRepoError, RepositoryNotFoundError) to propagate provides users with more detailed information and enables more precise error handling in their code.
@Wauplin @hanouticelina |
Hi @cjfghk5697, thanks for the update. As discussed in #2497 (comment) and #2497 (comment) we would prefer for |
Hi @Wauplin, I apologize for the mistake, and thank you for pointing it out. I've made the necessary adjustments as per your feedback. Could you please review it again when you have a moment? Thank you! |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @cjfghk5697, thanks a lot for the iterations! seems good to me 👍
cc @Wauplin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot @cjfghk5697! I have left final comments regarding code styling and docstrings that I will merge right away. Other than that, we are good to merge! 🎉
with self.assertRaises(RepositoryNotFoundError): | ||
self._api.auth_check(repo_id="username/missing_repo_id") | ||
|
||
def test_auth_check_gated_repo(self) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test looks good!
All green! Thanks a lot @cjfghk5697 🤗 🎉 |
@Wauplin @hanouticelina Thank you so much for all your help. If there's anything else that needs to be handled, please let me know! |
Related to #2466
I have implemented
auth_check
related to the issue #2466.auth_check
auth_check
auth_check
Please review the implementation and let me know if any further changes are required.