Support local model with inference-engine mlx #475

OKHand-Zy · 2024-11-20T06:10:04Z

Enhancement: support local and custom models #165

This is a modified version of the code that I have gotten working, although the code quality may not be ideal. It supports running models from a local path with mlx, for both the CLI and the ChatAPI. One known issue: the response quality from the CLI is subpar. I would appreciate any suggestions on how to improve or optimize the code for better results.

Changes:

Added a "How to use local models" section to the README.
Added the --add-local-model argument.
Implemented init_exo_env to configure local model cards and the local model store.
Added bypass logic for local models using if...else statements.
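
For readers following along, the flag and the if...else bypass described above might look roughly like the sketch below. This is not the PR's code: the registry `LOCAL_MODELS` and the helpers `build_parser`, `register_local_models`, and `resolve_model` are hypothetical names, and the assumption that `--add-local-model` takes a directory path and may be repeated is mine.

```python
import argparse
from pathlib import Path

# Hypothetical in-memory registry: model name -> local directory.
LOCAL_MODELS: dict[str, Path] = {}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # The flag name comes from the PR description; its argument shape
    # (a repeatable directory path) is an assumption for this sketch.
    parser.add_argument("--add-local-model", action="append", default=[], metavar="PATH")
    return parser


def register_local_models(paths: list[str]) -> None:
    """Register each path under its directory name."""
    for raw in paths:
        path = Path(raw).expanduser()
        LOCAL_MODELS[path.name] = path


def resolve_model(name: str) -> tuple[str, str]:
    """Return (source, location); registered local models bypass the download route."""
    if name in LOCAL_MODELS:
        return ("local", str(LOCAL_MODELS[name]))
    else:
        return ("download", name)
```

With this shape, a model whose directory was registered is resolved from disk, and any other name falls through to whatever download route already exists.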

blindcrone · 2024-11-21T15:37:39Z

I like the idea here, but rather than relying on a folder structure, I think this should use config files or command-line arguments to specify paths to model implementations and populate things like the model card list at runtime. I'm considering refactoring the inference engine to take model implementations by default and use the shard downloader as one of a few possible routes to get weights. Automatically instantiating and parsing a default directory structure for this purpose creates a lot of potential for issues down the line.
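
One way to read this suggestion is sketched below: model cards are built at runtime from an explicit config plus command-line overrides, with no directory scanning. Everything here is illustrative and assumed, including the `ModelCard` dataclass, the `load_model_cards` helper, the JSON layout with a `models` list, and the `name=path` override syntax; none of it is exo's actual API.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ModelCard:
    # Hypothetical minimal card: a name and where its weights live.
    name: str
    weights_path: Path


def load_model_cards(config_text: str, cli_overrides: list[str]) -> dict[str, ModelCard]:
    """Build the card list from JSON config text, then apply name=path overrides."""
    cards: dict[str, ModelCard] = {}
    for entry in json.loads(config_text).get("models", []):
        cards[entry["name"]] = ModelCard(entry["name"], Path(entry["path"]))
    for raw in cli_overrides:
        name, sep, path = raw.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected name=path, got {raw!r}")
        # Command-line entries win over config entries with the same name.
        cards[name] = ModelCard(name, Path(path))
    return cards
```

Because every path is stated explicitly, nothing depends on a default directory layout existing or being parsed correctly.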

OKHand-Zy · 2024-11-22T01:39:57Z

@blindcrone
Recently, while implementing the HTTP functionality for the local model, I realized what you meant. I've switched to using aiohttp to run an HTTP service on each node. When needed, I'll check which node has the necessary data and download it in chunks over the internal network (similar to how exo does it). Afterward, I'll rely on the inference_engine in the command to use the model, instead of configuring it through a config file. Does this align with your thoughts? If there's a better approach, I'm open to suggestions.
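
The chunked transfer mentioned here could be sketched as follows. This covers only the chunking and reassembly, using the standard library; the HTTP transport (aiohttp in the comment above) is left out. The names `iter_chunks` and `receive_chunks`, the 1 MiB chunk size, and the SHA-256 integrity check are all assumptions for illustration, not part of the PR or of exo.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Iterator

CHUNK_SIZE = 1024 * 1024  # assumed chunk size of 1 MiB


def iter_chunks(path: Path, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Sending side: yield the file in fixed-size pieces (the last may be shorter)."""
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk


def receive_chunks(chunks: Iterable[bytes], dest: Path) -> str:
    """Receiving side: write chunks in order and return the SHA-256 of the result."""
    digest = hashlib.sha256()
    with dest.open("wb") as f:
        for chunk in chunks:
            f.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()
```

In a real setup the sender would stream each chunk as part of an HTTP response body and the receiver would compare the returned digest against one published by the node that holds the weights.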

OKHand-Zy and others added 12 commits November 8, 2024 09:29

add inference mlx run local model in single node

89665df

Merge exo f1eec9f commit version

b2bcc12

filter merge erro

0b87eb9

filter f1eec9f model change

e2f0723

futuer:(i-e:mlx)suppoert local model and HF model terminal complet

2a2e3b2

futur:support cli and chatapi local model complet

d8bbb2b

filter run_model_cli

9ab3513

add init_exo_function (helpers.py)

a9c345a

Merge branch 'main' 1fa42f3 into support-local-model

1ae2648

filter read me

cdba915

filter cli local model error

0b06fe1

filter some mark

438eae4

OKHand-Zy force-pushed the support-local-model branch from f6fc665 to 438eae4 on November 22, 2024 01:12

Merge exo 93d38e2 commits

32574b9

OKHand-Zy added 2 commits November 22, 2024 10:10

filter name miss

535cb44

Merge exo commit 7013041 fix 'CLI reply error'

1065242
