DeepSpeed-FastGen by cmikeh2 · Pull Request #4604 · deepspeedai/DeepSpeed

cmikeh2 · 2023-11-03T19:22:01Z

DeepSpeed-FastGen uses continuous batching and non-contiguous KV caches to increase occupancy and improve responsiveness when serving LLMs in the data center, similar to existing frameworks such as TRT-LLM, TGI, and vLLM. To push performance further, DeepSpeed-FastGen introduces SplitFuse, which dynamically decomposes and unifies prompt and generation work to further improve continuous batching and system throughput.
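For readers new to the idea, here is a minimal sketch of how a SplitFuse-style scheduler might compose one forward pass. It is not the code in this PR: the `Request` record, the `schedule_step` function, and the fixed per-step token budget are all assumptions made for illustration. The sketch gives each generating request one token, then fills the remaining budget with prompt chunks, splitting long prompts across steps.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record for illustration; not a DeepSpeed type."""
    uid: int
    prompt_remaining: int   # prompt tokens not yet processed
    decoding: bool = False  # True once the whole prompt has been consumed


def schedule_step(requests, token_budget):
    """Plan one forward pass as {uid: n_tokens}, using at most token_budget tokens.

    Assumed policy: generation tokens are scheduled first, then prompt
    chunks are fused into the same pass until the budget is exhausted.
    """
    plan = {}
    budget = token_budget

    # Requests already generating contribute exactly one token each.
    for r in requests:
        if r.decoding and budget > 0:
            plan[r.uid] = 1
            budget -= 1

    # Fill what is left with prompt tokens, splitting prompts that do not fit.
    for r in requests:
        if not r.decoding and r.prompt_remaining > 0 and budget > 0:
            chunk = min(r.prompt_remaining, budget)
            plan[r.uid] = chunk
            r.prompt_remaining -= chunk
            budget -= chunk
            if r.prompt_remaining == 0:
                r.decoding = True  # starts generating on the next step

    return plan


if __name__ == "__main__":
    reqs = [Request(0, 0, decoding=True), Request(1, 10), Request(2, 3)]
    for step in range(3):
        print(step, schedule_step(reqs, token_budget=8))
```

With a budget of 8 tokens, the 10-token prompt of request 1 is split across two passes and shares each pass with the generation token of request 0, so no single pass is dominated by one long prompt.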

Corresponding blog: #4607

Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Connor Holmes <connorholmes@microsoft.com>
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>

jeffra

🚀🚀🎉🎉

Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Co-authored-by: Michael Wyatt <michaelwyatt@microsoft.com>
Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
Co-authored-by: Masahiro Tanaka <mtanaka@microsoft.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>

cmikeh2 requested review from RezaYazdaniAminabadi, arashb, awan-10, jeffra, loadams, mrwyattii and tjruwase as code owners November 3, 2023 19:22

Merge branch 'master' into staging-inference-v2-5

74b6f76

jeffra approved these changes Nov 3, 2023

jeffra and others added 2 commits November 3, 2023 15:06

Merge branch 'master' into staging-inference-v2-5

7e9d841

Merge branch 'master' into staging-inference-v2-5

debca5f

jeffra merged commit 38b41df into master Nov 3, 2023

weiji14 mentioned this pull request Nov 4, 2023

deepspeed v0.12.0 conda-forge/deepspeed-feedstock#34

Merged

3 tasks

jeffra merged 4 commits into master from staging-inference-v2-5