Commit

fix ring topology img
AlexCheema committed Jul 17, 2024
1 parent c432871 commit ba7abb9
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions README.md
@@ -53,9 +53,11 @@ Unlike other distributed inference frameworks, exo does not use a master-worker

Exo supports different partitioning strategies to split up a model across devices. The default partitioning strategy is [ring memory weighted partitioning](exo/topology/ring_memory_weighted_partitioning_strategy.py). This runs an inference in a ring where each device runs a number of model layers proportional to the memory of the device.

-<picture>
-<img alt="ring topology" src="docs/ring-topology.png" width="30%" height="30%">
-</picture>
+<p>
+<picture>
+<img alt="ring topology" src="docs/ring-topology.png" width="30%" height="30%">
+</picture>
+</p>
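
The proportional split described in the README context above (this sketch is not part of the commit) might look roughly like the following. It is a minimal illustration, not exo's actual implementation: the `Device` class, the `partition_layers` function, and the choice to order the ring by `node_id` are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical device record; exo's real types may differ."""
    node_id: str
    memory: int  # available memory, in any consistent unit (e.g. GB)


def partition_layers(devices, n_layers):
    """Assign each device a contiguous [start, end) range of layers,
    sized in proportion to its share of the total memory."""
    total = sum(d.memory for d in devices)
    # Assumption: a deterministic ring order, here sorted by node_id.
    ordered = sorted(devices, key=lambda d: d.node_id)
    ranges = []
    start = 0
    cumulative = 0
    for d in ordered:
        cumulative += d.memory
        # Integer arithmetic on the cumulative share guarantees the last
        # device ends exactly at n_layers, with no gaps or overlaps.
        end = (n_layers * cumulative) // total
        ranges.append((d.node_id, start, end))
        start = end
    return ranges


devices = [Device("a", 16), Device("b", 8), Device("c", 8)]
for node_id, start, end in partition_layers(devices, 32):
    print(f"{node_id}: layers {start}..{end - 1}")
```

Computing the boundaries from the cumulative memory share, rather than rounding each device's share separately, keeps the ranges contiguous even when the layer count does not divide evenly.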


## Installation
@@ -98,7 +100,7 @@ That's it! No configuration required - exo will automatically discover the other

The native way to access models running on exo is using the exo library with peer handles. See how in [this example for Llama 3](examples/llama3_distributed.py).

-exo also starts a ChatGPT-compatible API endpoint on http://localhost:8000. Note: this is currently only supported by tail nodes (i.e. nodes selected to be at the end of the ring topology). If you want to force a node to be the tail, set its node-id to be sorted last alphabetically on start e.g. `python3 main.py --node-id xxxnode-mac-mini" Example request:
+exo also starts a ChatGPT-compatible API endpoint on http://localhost:8000. Note: this is currently only supported by tail nodes (i.e. nodes selected to be at the end of the ring topology). Example request:

```
curl http://localhost:8000/v1/chat/completions \
