Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ODSC-6682] Delete HF cache by default while registering models #1044

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

VipulMascarenhas
Copy link
Member

@VipulMascarenhas VipulMascarenhas commented Jan 23, 2025

Description

Issue

When a user registered a model in AI Quick Actions, the /dev/sdb drive space would get occupied and take up 100% of the space for some users. Currently, a cached-model folder is created when user does not set the local directory.

Solution

In this PR, we delete the model cache by default after a model is registered. The cached-model folder will also not be created, instead the artifacts of model named oracle/aqua-1t-mega-model will be written to the .cache/huggingface/hub/models--oracle--aqua-1t-mega-model/ folder. Once a model is successfully registered, the artifacts are removed from two places:

  • local_dir if specified
  • .cache/huggingface/hub/models--*/ folder that removes other relevant metadata and artifacts (if local_dir was not specifed)

If a user wishes to keep the artifacts, then they need to set delete_from_local as False explicitly.

Testing [WIP]

In Progress, will add some examples here.

Unit Tests

> python -m pytest -q tests/unitary/with_extras/aqua/test_cli.py tests/unitary/with_extras/aqua/test_model.py tests/unitary/with_extras/aqua/test_model_handler.py

======================================== test session starts ========================================
platform darwin -- Python 3.9.21, pytest-7.4.4, pluggy-1.5.0
rootdir: /Users/user/workspace/git/accelerated-data-science
configfile: pytest.ini
plugins: cov-6.0.0, Faker-33.1.0, anyio-4.6.2, dash-2.18.2, xdist-3.6.1
collected 62 items

tests/unitary/with_extras/aqua/test_cli.py ..............                                     [ 22%]
tests/unitary/with_extras/aqua/test_model.py ............................                     [ 67%]
tests/unitary/with_extras/aqua/test_model_handler.py ....................                     [100%]

======================================== 62 passed in 9.12s =========================================

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Jan 23, 2025
Copy link

📌 Cov diff with main:

Coverage-92%

📌 Overall coverage:

Coverage-57.26%

darenr
darenr previously approved these changes Jan 23, 2025
mrDzurb
mrDzurb previously approved these changes Jan 23, 2025
@@ -128,6 +128,10 @@ def post(self, *args, **kwargs): # noqa: ARG002
download_from_hf = (
str(input_data.get("download_from_hf", "false")).lower() == "true"
)
local_dir = input_data.get("local_dir")
delete_from_local = (
str(input_data.get("delete_from_local", "true")).lower() == "true"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WOuldn't it be better to name it something like - cleanup_model_cache? I don't have a strong opinion, but delete_from_local doesn't sound intuitive for me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now that you called it out, it does sounds a bit vague. I've updated the field to cleanup_model_cache.

@VipulMascarenhas VipulMascarenhas dismissed stale reviews from mrDzurb and darenr via 3569ba7 January 23, 2025 18:19
Copy link

📌 Cov diff with main:

Coverage-92%

📌 Overall coverage:

Coverage-57.26%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants