
Support loading model into pipeline from local filesystem #308

Merged: 9 commits into EricLBuehler:master on May 15, 2024

Conversation

@Jeadie (Contributor) commented on May 14, 2024:

Motivation

  • Support loading models from the local filesystem to make it easier to productionise with mistral.rs.

Changes

  • Change the Loader trait: split load_model into
    • load_model_from_hf
    • load_model_from_path
  • Both can share the same logic once the HF files have been downloaded locally (see the sketch below).
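
In rough outline, the split looks like the sketch below. This is a minimal, self-contained illustration only; `LoadedModel`, `fetch_from_hf`, and the signatures are placeholders rather than the actual mistral.rs API, and the real trait methods take more configuration than shown here.

```rust
use std::path::{Path, PathBuf};

pub struct LoadedModel; // placeholder for the real pipeline type

// Placeholder for the step that downloads the model files from the
// Hugging Face Hub into a local directory.
fn fetch_from_hf(model_id: &str) -> std::io::Result<PathBuf> {
    Ok(PathBuf::from(format!("/tmp/models/{model_id}")))
}

pub trait Loader {
    /// Build the pipeline from files that already exist on the local filesystem.
    fn load_model_from_path(&self, path: &Path) -> std::io::Result<LoadedModel>;

    /// Download the files from the Hub first, then reuse the local-path logic.
    fn load_model_from_hf(&self, model_id: &str) -> std::io::Result<LoadedModel> {
        let local_dir = fetch_from_hf(model_id)?;
        self.load_model_from_path(&local_dir)
    }
}
```

The key point is that the HF variant only resolves files to a local directory and then delegates, so the two entry points share one loading path.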

@EricLBuehler (Owner) left a comment:

Thank you for adding this! I think it's a great addition to the Rust API and just clears up the code a bit. I left a few minor comments regarding docs and one small question.

Resolved review comment threads:
  • mistralrs-server/src/main.rs
  • mistralrs-core/src/pipeline/mod.rs (outdated)
  • mistralrs-core/src/pipeline/speculative.rs
@EricLBuehler (Owner) commented:

@Jeadie, I think there are some formatting/linting issues. After those are fixed, I think this should be ready to merge.


Code Metrics Report
===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 Dockerfile              1           34           25            0            9
 Happy                   1          442          369            0           73
 JSON                    5            9            9            0            0
 Python                 21          741          622           21           98
 TOML                   16          419          378            1           40
-------------------------------------------------------------------------------
 Jupyter Notebooks       1            0            0            0            0
 |- Markdown             1           60           30           22            8
 |- Python               1           96           87            1            8
 (Total)                            156          117           23           16
-------------------------------------------------------------------------------
 Markdown               16         1026            0          758          268
 |- BASH                 6          205          192            0           13
 |- Python               6          121          110            0           11
 |- Rust                 3          185          172            9            4
 (Total)                           1537          474          767          296
-------------------------------------------------------------------------------
 Rust                   81        26428        24327          334         1767
 |- Markdown            38          359            0          354            5
 (Total)                          26787        24327          688         1772
===============================================================================
 Total                 143        29099        25730         1114         2255
===============================================================================
  

@Jeadie (Contributor, Author) commented on May 15, 2024:

Fixed and ran both cargo fmt --all --check and cargo clippy --workspace --tests --examples -- -D warnings locally.

@EricLBuehler merged commit 3a79bc8 into EricLBuehler:master on May 15, 2024
11 checks passed
@EricLBuehler (Owner) commented:

Thank you!

@polarathene (Contributor) commented:

@Jeadie Is the local filesystem support only a few days old? Can you explain whether the HF API is mandatory?

Active issue: #326

  • It's much simpler to run a local GGUF model via llama-cpp.
  • The expected parameters (redundant?) can also be a tad confusing. This GGUF presumably needs the additional files sourced from here, but there is no tokenizer.json? From what I've read, all the relevant metadata should already be available in the GGUF file itself, and llama-cpp happily runs with only the GGUF file provided.
  • Without an HF token provided (--token-source none), the HF API is still hit and returns an "Unauthorized" (401) response; since only 404 responses are handled as an expected case, the 401 triggers a panic.

Perhaps there is some benefit to local filesystem loading going through the HF API? Otherwise it would probably be better to avoid it, or at least offer a way to opt out if possible.
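
To make the opt-out concrete, here is a hypothetical sketch (the `ModelSource` enum and `resolve_files` function below are illustrative only, not anything that exists in mistral.rs): a local path would be resolved purely on the filesystem, and only Hub model IDs would ever touch the HF API or need a token.

```rust
use std::path::PathBuf;

enum ModelSource {
    LocalPath(PathBuf),
    HuggingFace { model_id: String },
}

fn resolve_files(source: &ModelSource) -> std::io::Result<PathBuf> {
    match source {
        // A local GGUF should be usable with no network call and no HF token.
        ModelSource::LocalPath(p) => {
            if p.exists() {
                Ok(p.clone())
            } else {
                Err(std::io::Error::new(
                    std::io::ErrorKind::NotFound,
                    "model file not found",
                ))
            }
        }
        // Only Hub model IDs go through the HF API (and may require a token).
        ModelSource::HuggingFace { model_id } => {
            // Download/caching logic would live here.
            Ok(PathBuf::from(format!("/cache/{model_id}")))
        }
    }
}
```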
