exo: Run your own AI cluster at home with everyday devices. Maintained by [exo labs](https://x.com/exolabs).
Forget expensive NVIDIA GPUs, unify your existing devices into one powerful GPU: iPhone, iPad, Android, Mac, Linux, pretty much any device!
<div align="center">
<h2>Update: exo is hiring. See <a href="https://exolabs.net">here</a> for more details.</h2>
</div>
## Get Involved
exo is **experimental** software. Expect bugs early on. Create issues so they can be fixed. The [exo labs](https://x.com/exolabs) team will strive to resolve issues quickly.
We also welcome contributions from the community. We have a list of bounties in [this sheet](https://docs.google.com/spreadsheets/d/1cTCpTIp48UnnIvHeLEUNg1iMy_Q6lRybgECSFCoVJpE/edit?usp=sharing).
### ChatGPT-compatible API
exo provides a [ChatGPT-compatible API](exo/api/chatgpt_api.py) for running models. It's a [one-line change](examples/chatgpt_api.sh) in your application to run models on your own hardware using exo.
### Device Equality
```sh
python3 main.py
```
That's it! No configuration required - exo will automatically discover the other device(s).
exo starts a ChatGPT-like WebUI (powered by [tinygrad tinychat](https://github.com/tinygrad/tinygrad/tree/master/examples/tinychat)) on http://localhost:8000
For developers, exo also starts a ChatGPT-compatible API endpoint on http://localhost:8000/v1/chat/completions. Example with curl:
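A minimal request sketch against that endpoint. The model name `llama-3.1-8b` is an assumption here; substitute whichever model your cluster is actually serving.

```sh
# Hypothetical model name -- use one that your exo cluster serves.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.1-8b",
    "messages": [{"role": "user", "content": "What is the meaning of exo?"}],
    "temperature": 0.7
  }'
```

The request and response bodies follow the standard OpenAI chat completions shape, so existing client code should work unchanged.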
### Example Usage on Multiple Heterogeneous Devices (macOS + Linux)
#### Device 1 (macOS):
```sh
python3 main.py --inference-engine tinygrad
```
Here we explicitly tell exo to use the **tinygrad** inference engine.
#### Device 2 (Linux):
```sh
python3 main.py
```
Linux devices will automatically default to using the **tinygrad** inference engine.
You can read about tinygrad-specific env vars [here](https://docs.tinygrad.org/env_vars/). For example, you can configure tinygrad to use the CPU by specifying `CLANG=1`.
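Putting that together, forcing a node onto the CPU backend might look like the following sketch (`CLANG` is a tinygrad env var, not an exo flag):

```sh
# Run exo with the tinygrad engine, compiled for the CPU via clang
# instead of a GPU backend.
CLANG=1 python3 main.py --inference-engine tinygrad
```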
## Debugging
Enable debug logs with the DEBUG environment variable (0-9).
```sh
DEBUG=9 python3 main.py
```
For the **tinygrad** inference engine specifically, there is a separate DEBUG flag `TINYGRAD_DEBUG` that can be used to enable debug logs (1-6).
```sh
TINYGRAD_DEBUG=2 python3 main.py
```
## Known Issues
- 🚧 As the library is evolving so quickly, the iOS implementation has fallen behind Python. We have decided for now not to put out the buggy iOS version and receive a bunch of GitHub issues for outdated code. We are working on solving this properly and will make an announcement when it's ready. If you would like access to the iOS implementation now, please email [email protected] with your GitHub username explaining your use-case and you will be granted access on GitHub.