Adaptive Flops Partitioning Strategy #346

Closed
wants to merge 2 commits into from

DevEmilio96
Contributor

Issue: #284

Since I cannot test Exo on a real cluster, I have made the following assumptions, and I would like to know which ones are true and which are false:

  1. Exo does not partition the model across multiple machines.
  2. Instead, all machines execute the entire model.
  3. The partitioning strategy applies to inference partitioning, specifically its layers.
  4. All intermediate states of the chat are stored on all machines.
  5. It is not possible to reassign partitions during tensor processing.
  6. Partitions can only be reassigned between prompts.
  7. Compared to the model, inference computation requires much less memory. For example, if the model takes up 16GB,
    after each query the machines retain intermediate states ranging from around 100MB to 2GB, depending on the
    complexity of the query and the length of the output to be generated.
  8. Inter-node latency is not a critical factor in a non-geographically distributed cluster.

AdaptiveFlopsPartitioningStrategy:

  1. Initial partitioning is based on each node's FLOPS (floating-point operations per second) rather than its memory.
  2. After each prompt, partitions are reassigned based on the performance of the nodes.
  3. It implements the update_node_performance method, which can be modified to include other relevant metrics.
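
The three points above can be illustrated with a minimal sketch. It is not the code from this PR: `Partition`, `partition_by_weight`, `AdaptiveFlopsSketch`, the `smoothing` parameter, and the use of a plain layer count in place of the `shard: Shard` argument are all assumptions made for illustration. The idea is to split the layer range [0, 1) in proportion to each node's FLOPS at first, then switch to measured throughput once every node has reported a timing.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Partition:
    """Hypothetical stand-in: the fraction [start, end) of layers a node runs."""
    node_id: str
    start: float
    end: float


def partition_by_weight(weights: Dict[str, float]) -> List[Partition]:
    """Split the range [0, 1) across nodes in proportion to their weights."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one node must have a positive weight")
    partitions: List[Partition] = []
    start = 0.0
    # Sort by descending weight, then node id, so every node derives the same order.
    for node_id, weight in sorted(weights.items(), key=lambda kv: (-kv[1], kv[0])):
        end = start + weight / total
        partitions.append(Partition(node_id, start, end))
        start = end
    # Pin the last boundary to 1.0 to absorb floating-point drift.
    last = partitions[-1]
    partitions[-1] = Partition(last.node_id, last.start, 1.0)
    return partitions


class AdaptiveFlopsSketch:
    """Assumed design: FLOPS-weighted at first, throughput-weighted once measured."""

    def __init__(self, node_flops: Dict[str, float], smoothing: float = 0.5):
        self.node_flops = dict(node_flops)
        self.smoothing = smoothing  # weight given to the newest measurement
        self.throughput: Dict[str, float] = {}  # measured layers per second

    def update_node_performance(self, node_id: str, processing_time: float, n_layers: int) -> None:
        """Record how long a node took to run its layers for one prompt."""
        if processing_time <= 0 or n_layers <= 0:
            return  # ignore unusable measurements
        observed = n_layers / processing_time
        previous = self.throughput.get(node_id, observed)
        # Exponential moving average, so one slow prompt does not dominate.
        self.throughput[node_id] = self.smoothing * observed + (1 - self.smoothing) * previous

    def partition(self) -> List[Partition]:
        # FLOPS and layers/second are different units, so never mix them:
        # use measurements only once every node has reported at least one.
        if self.node_flops.keys() <= self.throughput.keys():
            weights = {n: self.throughput[n] for n in self.node_flops}
        else:
            weights = self.node_flops
        return partition_by_weight(weights)
```

In this sketch, `partition()` would be called between prompts, which matches assumption 6 above. How measurements reach the other nodes is deliberately left out, since every node needs the same `throughput` table to derive the same partitions.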


return partitions

def update_node_performance(self, node_id: str, processing_time: float, shard: Shard):
Contributor

How do the other nodes find out about this node's performance measurements?

Contributor Author

Thanks for the comment. I added a commit that should answer your question.

@AlexCheema
Contributor

Closing as we will rework partitioning strategies.

@AlexCheema closed this on Nov 23, 2024
3 participants