-
Notifications
You must be signed in to change notification settings - Fork 834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Show downloaded models, improve error handling #456
base: main
Are you sure you want to change the base?
Show downloaded models, improve error handling #456
Conversation
I am seeing now some of the robust error handling was intentional and merged in here. I can change back if that is preferred. |
I fixed these issue in https://github.com/cadenmackenzie/exo/pull/1/files which you can merge into this PR |
fix safari issue
Thanks for finding that! I went ahead and merged it in. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What a great idea!
Definitely worthy of a $200 retrospective bounty.
I think the implementation right now is not great -- I'd prefer if we lean on the existing HFShardDownloader
since that knows how to do downloads. You might need to add a new abstract method to ShardDownloader
something like get_shard_download_status
and it should use the same functionality already in hf_helpers.py
to check what the percentage completion is.
Also, a few suggest changes to the UI: if the model is partially downloaded (percentage > 0) then display the % next to the model name in the UI.
Awesome! Got it working with the suggested changes. Want to do some testing in the morning then will update this PR. |
working versions
Added new abstract method in ShardDownloader, implemented get_shard_download_status in HFShardDownloader leaning on get_local_snapshot_dir, get_weight_map, get_allow_patterns helper functions and then checks the percentage of model downloaded for models where local files are found. Removed old function for checking percentage. Currently shows "downloaded" for fully downloaded models and "X% downloaded" for models not fully downloaded in dropdown. I don't love how this is being refreshed using the modelPoolInterval because it lags a little for models that are actively being downloaded so might work on how to improve that. I think it could be worth moving this logic as well as active downloads to a sidebar. I like how active downloads are being shown in the chat when initiated but if we moved that to a sidebar, it could be a centralized place to view all models, choose a model, see activity of downloaded models, and being able to remove local downloads to free up space. Similar to how you can remove local models in LM Studio. Open to suggestions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix the download status code. Otherwise good.
exo/download/hf/hf_shard_download.py
Outdated
local_sizes = {} | ||
|
||
for pattern in patterns: | ||
if pattern.endswith('safetensors') or pattern.endswith('mlx'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't quite work. A pattern can match multiple files.
@@ -77,3 +87,55 @@ async def wrapped_progress_callback(event: RepoProgressEvent): | |||
@property | |||
def on_progress(self) -> AsyncCallbackSystem[str, Tuple[Shard, RepoProgressEvent]]: | |||
return self._on_progress | |||
|
|||
async def get_shard_download_status(self) -> Optional[Dict[str, float]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally there seems to be a lot of duplication between this and other functions. I think a refactor should be done here.
…ownloading models
Found an issue in handle_model_support that was creating HFShardDownloader without quick_check=true so it was starting download of models when being checked for download percentage. Will work on get_shard_download_status refactor |
Hi @AlexCheema can you give me some more insight into what you want to be refactored? I initially thought we could reuse some of the download percentage logic that is happening during download but as far as I can tell, that is only checking against the remote during download in download_file(). Would you want to pull some of that logic out into another helper function to use in get_shard_download_status() or am I missing something? |
Yeah I think pulling some of that logic out would make sense. I think it's in general in need of a good refactor if you want to take a go at that and will bump up the bounty to $300. |
…centage in hf_shard_download
…d_file to use that helper
…ensor files to properly check the size with GET request
Hf helper refactor
Hi Alex, please review. I didn't modify download_file much but added a check using the new helper method. Modified the get_shard_download_status to lean on the helper methods to calculate the overall percentage of files and return that. Also updated chatgpt_api to use that overall instead doing percent calculation there. Should fix the pattern matching issue that you identified as well. |
@AlexCheema pinging for your review |
Please resolve conflicts and ping me again. @cadenmackenzie |
@AlexCheema resolved |
updates from main
Please assign me when fixed and ready for me to review @cadenmackenzie |
Hi @AlexCheema , my apologies. I did test it and it was working for me even without the inference_engine_classes defined. Not sure why. I fixed it and installed Pylance. |
|
I wanted a way to show which models have already been downloaded. Ultimately, I would like to add the ability to manage those models such as deleting them. Might make more sense in a sidebar where you chose the model and can manage it.
I also added more robust handling of errors in index.html to safely access them as I saw some warnings in the main branch when no errors were set.
In index.js I combined error handling into a single flow for the populateSelector method using response.json() instead of manually parsing. I also added a helper method to set the error to reduce duplicate code. Something I should have done with my last PR.
Open to suggestions on these changes.