Vllm upstream fp8 #1

Merged
dllehr-amd merged 10 commits into dllehr-amd:vllm_upstream from gshtras:vllm_upstream_fp8
Mar 17, 2024
@gshtras gshtras commented Feb 14, 2024

No description provided.

zhaoyang-star and others added 10 commits February 14, 2024 21:11
Co-authored-by: zhaoyang <zhao.yang16@zte.com.cn>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
Add non-MI300 compatible alternative for bulk conversions
Removed bf8 (e5m2) and renamed f8 to fp8 to explicitly specify that it is e4m3
Removed stochastic rounding for simplicity
Put bulk fp8 conversion hip intrinsics behind a define. Disabled by default
Using types from the proper vllm headers. Added namespace
Move amd specific headers under amd_detail
Rename remaining fp8_e5m2 to general fp8
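For readers unfamiliar with the e4m3 layout named in the commit titles above, the following is a minimal sketch of how one e4m3 byte maps to a float: 1 sign bit, 4 exponent bits, 3 mantissa bits. It assumes the OCP FP8 E4M3 encoding (exponent bias 7, a single NaN pattern, no infinities). The pull request does not state which e4m3 variant it targets, and some hardware uses a different bias and NaN encoding, so treat the constants as assumptions. The function name `decode_e4m3` is hypothetical and does not come from the vLLM sources.

```python
def decode_e4m3(byte: int) -> float:
    """Decode one byte as FP8 E4M3 (assumed OCP variant: bias 7, no infinities)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")

    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0xF   # 4 exponent bits
    mantissa = byte & 0x7          # 3 mantissa bits

    # In the assumed variant, all-ones exponent plus all-ones mantissa is NaN.
    if exponent == 0xF and mantissa == 0x7:
        return float("nan")

    if exponent == 0:
        # Subnormal: no implicit leading 1, exponent fixed at 1 - bias.
        return sign * (mantissa / 8.0) * 2.0 ** (1 - 7)

    # Normal: implicit leading 1, exponent offset by the bias.
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


if __name__ == "__main__":
    for b in (0x38, 0x7E, 0x01, 0xC0):
        print(f"0x{b:02X} -> {decode_e4m3(b)}")
```

With only 3 mantissa bits, each power-of-two interval holds just 8 representable values, which is why e4m3 is typically paired with a scaling factor when used for quantized storage.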
@dllehr-amd dllehr-amd merged commit f9ee3f3 into dllehr-amd:vllm_upstream Mar 17, 2024

5 participants