package exo as installable #470
Conversation
Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.10.2 to 3.10.11.
- [Release notes](https://github.com/aio-libs/aiohttp/releases)
- [Changelog](https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst)
- [Commits](aio-libs/aiohttp@v3.10.2...v3.10.11)

---
updated-dependencies:
- dependency-name: aiohttp
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
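For context, the bump itself amounts to a raised version floor on the dependency. A hypothetical excerpt of how the pin might look in setup.py (illustrative only, not exo's actual file):

```python
# setup.py (hypothetical excerpt): raising the aiohttp floor picks up the
# fixes shipped between 3.10.2 and 3.10.11
install_requires = [
  "aiohttp>=3.10.11",  # raised from 3.10.2 by dependabot
  # ...other dependencies...
]
```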
LGTM
I've pulled the latest main branch (1fa42f3) and the code runs, but I get a NameError ("name 'server' is not defined") when exiting exo:

❯ exo --inference-engine mlx --run-model llama-3.2-3b
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
Selected inference engine: mlx
_____ _____
/ _ \ \/ / _ \
| __/> < (_) |
\___/_/\_\___/
Detected system: Apple Silicon Mac
Inference engine name after selection: mlx
Using inference engine: MLXDynamicShardInferenceEngine with shard downloader: HFShardDownloader
[59392, 62317, 62890, 61755, 51505, 54822, 59529, 58544, 58825, 58707, 54319, 59382, 57740, 55399, 62061, 56510, 61677, 54465, 58521]
Chat interface started:
- http://172.20.10.8:52415
- http://127.0.0.1:52415
ChatGPT API endpoint served at:
- http://172.20.10.8:52415/v1/chat/completions
- http://127.0.0.1:52415/v1/chat/completions
has_read=True, has_write=True
Processing prompt: <|begin_of_text|><|start_header_id|>system<|end_header_id|>
Cutting Knowledge Date: December 2023
Today Date: 26 Jul 2024
<|eot_id|><|start_header_id|>user<|end_header_id|>
Who are you?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Removing download task for Shard(model_id='llama-3.2-3b', start_layer=0, end_layer=27, n_layers=28): True
Generated response:
I'm an artificial intelligence model known as Llama. Llama stands for "Large Language Model Meta AI."<|eot_id|>
Received exit signal SIGTERM...
Thank you for using exo.
_____ _____
/ _ \ \/ / _ \
| __/> < (_) |
\___/_/\_\___/
Cancelling 4 outstanding tasks
Traceback (most recent call last):
File "/Users/ziyu/miniconda3/envs/exo/bin/exo", line 33, in <module>
sys.exit(load_entry_point('exo', 'console_scripts', 'exo')())
File "/Users/ziyu/RemoteFolder/ziyu-pr/exo/exo/main.py", line 247, in run
loop.run_until_complete(shutdown(signal.SIGTERM, loop))
File "/Users/ziyu/miniconda3/envs/exo/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/Users/ziyu/RemoteFolder/ziyu-pr/exo/exo/helpers.py", line 249, in shutdown
await server.stop()
NameError: name 'server' is not defined
╭────────────────────────────────────────────────────────────────────── Exo Cluster (1 node) ──────────────────────────────────────────────────────────────────────╮
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ 💬️ Who are you? │
│ │
│ 🤖 I'm an artificial intelligence model known as Llama. Llama stands for "Large Language Model Meta AI."<|eot_id|> │
.....
This is addressed in PR #473.
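For reference, a minimal sketch of the kind of guard that avoids this NameError, assuming the shutdown helper in exo/helpers.py receives the server explicitly instead of relying on a global (names are taken from the traceback above; the actual fix in PR #473 may differ):

```python
import asyncio
import signal
from typing import Optional

# Hypothetical guarded shutdown helper: 'server' is passed in rather than
# looked up as a global, and its absence is tolerated.
async def shutdown(sig: signal.Signals, loop: asyncio.AbstractEventLoop, server: Optional[object] = None):
  print(f"Received exit signal {sig.name}...")
  print("Thank you for using exo.")
  tasks = [t for t in asyncio.all_tasks(loop) if t is not asyncio.current_task()]
  print(f"Cancelling {len(tasks)} outstanding tasks")
  for task in tasks:
    task.cancel()
  await asyncio.gather(*tasks, return_exceptions=True)
  if server is not None:  # only stop a server that was actually started
    await server.stop()
  loop.stop()
```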
Hi, I'd like to know whether the change made on line 186 of exo/inference/mlx/sharded_utils.py is essential:

tokenizer = load_tokenizer(model_path, tokenizer_config) => tokenizer = await resolve_tokenizer(model_path)

When I try to support local models I run into an issue, and it goes away after reverting this change, so I'm wondering whether the change is really necessary. If it isn't important, I'd prefer to use the old method.
The change was made by @dtnewman; for now I will revert it back to the old method.
This change is correct. We should keep using resolve_tokenizer. @OKHand-Zy, you will need to fix your code to work with resolve_tokenizer.
Okay, I'll try to modify my code to make it work with resolve_tokenizer.
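For anyone following along, a minimal sketch of one way to keep resolve_tokenizer as the default while still supporting a local model directory (the fallback wrapper and the import paths are assumptions for illustration, not exo's actual code):

```python
import os

# Assumed import locations, based on the files discussed above.
from exo.inference.tokenizers import resolve_tokenizer  # async resolver (assumed path)
from mlx_lm.tokenizer_utils import load_tokenizer       # old direct loader (assumed path)

async def get_tokenizer(model_path, tokenizer_config=None):
  """Prefer the async resolver; fall back to a direct load for local paths."""
  try:
    return await resolve_tokenizer(model_path)
  except Exception:
    # resolve_tokenizer may not handle an on-disk model directory;
    # fall back to the pre-change behavior in that case.
    if os.path.isdir(str(model_path)):
      return load_tokenizer(model_path, tokenizer_config or {})
    raise
```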
#302