[CI] Fix LoRA downloading issues and respect offline flag #15813
mickqian merged 19 commits into sgl-project:main from
Conversation
Summary of Changes

Hello @Prozac614, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request refines the model downloading utility to be more robust and flexible, particularly for non-standard Hugging Face models like LoRA adapters. It addresses previous failures caused by overly strict verification checks and enhances network resilience through an improved retry mechanism, ensuring smoother model acquisition in various environments, including CI.
Code Review
This pull request refactors the maybe_download_model function to enhance its robustness and error handling when loading models. The changes introduce a cache-first strategy to check for locally available Hugging Face models, followed by a download strategy that includes a retry mechanism for network-related failures. It also adds specific exception handling for RepositoryNotFoundError, RevisionNotFoundError, and RequestException during downloads, while removing the previous _verify_model_complete function. A review comment points out that a raise ValueError statement at the end of the function is unreachable, as the preceding retry loop will always exit either by returning a value or raising an exception.
```
        f"Could not find model at {model_name_or_path} and failed to download from HF Hub: {e}"
    ) from e

raise ValueError(f"Failed to load model {model_name_or_path}")  # flagged as unreachable
```
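The unreachable-statement issue flagged in the review comes from a retry loop in which every iteration either returns or raises, so a statement placed after the loop can never execute. A minimal, self-contained sketch of the pattern (names and the stand-in downloader are illustrative, not the actual sglang code):

```python
MAX_RETRIES = 5

def flaky_download(repo_id, _attempts=[0]):
    # Stand-in for snapshot_download: fails twice, then succeeds.
    # (Mutable default keeps the call count across retries for this demo.)
    _attempts[0] += 1
    if _attempts[0] < 3:
        raise ConnectionError("handshake timed out")
    return f"/cache/{repo_id}"

def download_with_retries(repo_id):
    for attempt in range(MAX_RETRIES):
        try:
            return flaky_download(repo_id)  # success exits the function
        except ConnectionError as e:
            if attempt == MAX_RETRIES - 1:
                # last attempt: re-raise instead of falling through
                raise ValueError(f"Failed to download {repo_id}") from e
    # Every loop iteration returns or raises, so control never reaches here:
    raise ValueError(f"Failed to load model {repo_id}")
```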
/tag-and-rerun-ci
Force-pushed from 8f1548d to bc9e029
…oad_model function in hf_diffusers_utils.py
…to simplify local model verification and enhance error handling during download process.
…ng ValueError when model is not found in cache and download is disabled, improving robustness of the model downloading process in hf_diffusers_utils.py.
… support LoRA models by adding an is_lora parameter, enhancing local model verification and download process.
Force-pushed from bc9e029 to a34e442
…dle cases where a model is found in cache but incomplete, raising ValueError when download is disabled, and logging appropriate messages for better clarity.
…low_patterns, increasing MAX_RETRIES to 5, and improving download parameters for better performance.
…formance baseline JSON by removing unnecessary fields.
Motivation
Currently, the maybe_download_model utility in hf_diffusers_utils.py fails when downloading LoRA adapters. This is because it enforces a strict verification check (expecting transformer/ and vae/ directories) which LoRAs do not possess. This leads to false positives where the code assumes the download failed and triggers a forced re-download, often causing ConnectionErrors in CI environments.
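To make the failure mode concrete, here is a hedged sketch of the kind of structural check described above (the removed helper was named `_verify_model_complete` per the review; this body is an illustration, not the actual code). A LoRA snapshot that ships only adapter weights and a config fails any check that requires `transformer/` and `vae/` subdirectories:

```python
import os
import tempfile

def looks_like_full_diffusers_model(path):
    # Illustrative strict check: a full diffusers pipeline snapshot
    # ships transformer/ and vae/ subdirectories.
    return all(os.path.isdir(os.path.join(path, d)) for d in ("transformer", "vae"))

# A LoRA adapter snapshot typically holds only weights + a config, so
# the strict check wrongly reports it as incomplete, triggering the
# forced re-download described above.
with tempfile.TemporaryDirectory() as lora_dir:
    open(os.path.join(lora_dir, "adapter_model.safetensors"), "w").close()
    print(looks_like_full_diffusers_model(lora_dir))  # prints False
```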
Modifications
CI fix: Fixed the cache problem for CI pip/uv.
Robust HF Downloads:
Increased retry limits (from 3 to 5) and extended etag_timeout to handle handshake delays better.
Refined snapshot_download to specifically target *.safetensors and *.json for LoRAs using allow_patterns, significantly reducing request volume.
LoRA Logic Fix: Skipped strict structural verification (transformer/vae folder checks) for LoRA models, as they follow a different file structure.
OOM Retry: Implemented a try/except block for torch.cuda.OutOfMemoryError, allowing the system to attempt a retry (potentially after clearing the CUDA cache) instead of crashing immediately.
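Taken together, the download-side changes above can be sketched as follows. This is an approximation of the described strategy, not the actual maybe_download_model in hf_diffusers_utils.py; the parameter names come from huggingface_hub.snapshot_download, while the backoff schedule, timeout value, and helper name are assumptions:

```python
import time

from huggingface_hub import snapshot_download
from huggingface_hub.utils import RepositoryNotFoundError, RevisionNotFoundError
from requests.exceptions import RequestException

MAX_RETRIES = 5  # raised from 3 to 5 in the PR

def maybe_download(repo_id, is_lora=False, allow_download=True):
    # 1) Cache-first: local_files_only avoids any network traffic.
    try:
        return snapshot_download(repo_id, local_files_only=True)
    except Exception:
        pass  # not (fully) cached; fall through to the download path
    if not allow_download:
        raise ValueError(f"{repo_id} not found in cache and downloads are disabled")
    # 2) LoRA adapters need only weights + configs, so restrict the file
    #    set to cut request volume; full models download everything.
    patterns = ["*.safetensors", "*.json"] if is_lora else None
    for attempt in range(MAX_RETRIES):
        try:
            # Extended etag_timeout tolerates slow TLS handshakes in CI.
            return snapshot_download(repo_id, allow_patterns=patterns, etag_timeout=30)
        except (RepositoryNotFoundError, RevisionNotFoundError):
            raise  # retrying cannot help: the repo/revision does not exist
        except RequestException as e:
            if attempt == MAX_RETRIES - 1:
                raise ValueError(
                    f"Failed to download {repo_id} from HF Hub"
                ) from e
            time.sleep(2 ** attempt)  # back off before the next attempt
```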
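The OOM-retry idea in the last bullet can be sketched as a small wrapper; the retry count and the empty_cache() recovery step are assumptions about the mechanism, not the PR's exact code:

```python
import torch

def run_with_oom_retry(fn, *args, retries=1, **kwargs):
    # Call fn; on torch.cuda.OutOfMemoryError, free cached allocator
    # blocks and retry, instead of crashing on the first OOM.
    for attempt in range(retries + 1):
        try:
            return fn(*args, **kwargs)
        except torch.cuda.OutOfMemoryError:
            if attempt == retries:
                raise  # out of retries: propagate the OOM
            torch.cuda.empty_cache()  # release cached blocks, then retry
```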
Accuracy Tests
Benchmarking and Profiling
Checklist