Single batch specialization for FP8 #264

Closed
sunggg wants to merge 4 commits into mlc-serve-v0.2.0 from spark/single-batch-specialization
@sunggg sunggg commented Apr 29, 2024

No description provided.

@sunggg sunggg marked this pull request as draft April 29, 2024 16:56
@sunggg sunggg force-pushed the spark/single-batch-specialization branch from 5f754e4 to c582c5d Compare April 30, 2024 02:51
sunggg commented May 10, 2024

This PR is no longer needed, since we migrated the FP8 code to slm/ in https://github.com/octoml/ollm/pull/857

@sunggg sunggg closed this May 10, 2024
Lunderberg pushed a commit to Lunderberg/mlc-llm that referenced this pull request Jul 25, 2024
This PR enables vulkan for rwkv, removes the vulkan
from local detection as it can cause cross platform issues.
Introduce max_gen_len parameter for chat