[FEAT] [ROCm]: Support AITER Linear #14916
Closed
Changes from 9 commits
22 commits
fdd9cbd cherry pick 09133e9833811778240b3c2cc4de2390fd08e470; and only add AI… (vllmellm)
668ec2f cherry pick acc27ffa94e677b8f6fce0f5b593430ce6acbfe4; and only add AI… (vllmellm)
8d49d6e bug fixes and pass unit tests (tjtanaa)
43af6c0 add AITER setup steps in Dockerfile.rocm_base (tjtanaa)
0c30ce9 remove AITER setup steps in Dockerfile.rocm (tjtanaa)
ab73f97 Merge remote-tracking branch 'origin/main' into aiter-linear (tjtanaa)
e952b2d fix missing property from Platform (tjtanaa)
6a632ac skip AITER in AMD CI (tjtanaa)
61c92a9 Merge remote-tracking branch 'origin/main' into aiter-linear (tjtanaa)
0224eff merge with main (tjtanaa)
d2ed934 revert run-amd-test.sh; update Dockerfile.rocm_base aiter version, re… (tjtanaa)
3fec588 clean up spaces and newline; fix typo (tjtanaa)
3558099 clean up spaces and newline; (tjtanaa)
2bf7206 fix typo (tjtanaa)
1f979fa untested refactoring (tjtanaa)
f13746c fix bug; validated to work V1 AITER unquantized and quantized (tjtanaa)
20139af relocate the linear helper function into aiter_ops and fix unittest (tjtanaa)
700ac73 add test_aiter_ops.py to unit test if the ops are registered correctl… (tjtanaa)
7dd2812 fix the test to test fake tensor implementation (tjtanaa)
d9f0e7b use current_platform.fp8_dtype(); update aiter commit (tjtanaa)
dde9157 merge with main; fix dispatcher and unit tests (tjtanaa)
e34712c remove is_rocm_aiter_xxxx_enabled flag from _aiter_ops.py (tjtanaa)
@@ -2,7 +2,7 @@

 import itertools
 from abc import abstractmethod
-from typing import Any, Literal, Optional, Union
+from typing import Any, Callable, Literal, Optional, Union

 import torch
 import torch.nn as nn
@@ -26,6 +26,7 @@
     RowvLLMParameter)
 # yapf: enable
 from vllm.model_executor.utils import set_weight_attrs
+from vllm.platforms import current_platform

 logger = init_logger(__name__)

@@ -50,6 +51,18 @@
 ]


+def rocm_aiter_tgemm_mm(x: torch.Tensor, weight: torch.Tensor,
+                        bias: torch.Tensor) -> torch.Tensor:
+    from aiter.tuned_gemm import tgemm
+    return tgemm.mm(x, weight, bias)
+
+
+def dipsatch_unquantized_linear_func() -> Callable[..., torch.Tensor]:
+    if current_platform.is_rocm_aiter_linear_enabled():
+        return rocm_aiter_tgemm_mm
+    return F.linear
+
+
 def adjust_marlin_shard(param, shard_size, shard_offset):
     marlin_tile_size = getattr(param, "marlin_tile_size", None)
     if marlin_tile_size is None:
@@ -187,8 +200,7 @@ def apply(self,
               layer: torch.nn.Module,
               x: torch.Tensor,
               bias: Optional[torch.Tensor] = None) -> torch.Tensor:
-
-        return F.linear(x, layer.weight, bias)
+        return dipsatch_unquantized_linear_func()(x, layer.weight, bias)


 class LinearBase(torch.nn.Module):
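The change above routes the unquantized linear layer through a small dispatcher that returns either the default `F.linear` or an AITER-backed GEMM with the same call shape. The following is a minimal pure-Python sketch of that dispatch pattern, not the PR's implementation: `reference_linear`, `tuned_linear`, `dispatch_linear`, and the `use_tuned_backend` flag are hypothetical stand-ins (for `F.linear`, `rocm_aiter_tgemm_mm`, the dispatcher, and the platform check respectively), and nested lists stand in for tensors so the example runs without PyTorch or ROCm.

```python
from typing import Callable, List, Optional

Matrix = List[List[float]]
Vector = List[float]


def reference_linear(x: Matrix, weight: Matrix,
                     bias: Optional[Vector] = None) -> Matrix:
    """Hypothetical stand-in for F.linear: computes x @ weight.T + bias."""
    out = [[sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in weight]
           for row in x]
    if bias is not None:
        out = [[v + b for v, b in zip(row, bias)] for row in out]
    return out


def tuned_linear(x: Matrix, weight: Matrix,
                 bias: Optional[Vector] = None) -> Matrix:
    """Hypothetical stand-in for an accelerated backend.

    It only has to honor the same (x, weight, bias) contract as the
    reference, so callers never need to know which one was selected.
    """
    return reference_linear(x, weight, bias)


def dispatch_linear(use_tuned_backend: bool) -> Callable[..., Matrix]:
    """Return the callable to use; the flag is an assumed stand-in
    for a platform capability check."""
    return tuned_linear if use_tuned_backend else reference_linear


x = [[1.0, 2.0]]
weight = [[3.0, 4.0], [5.0, 6.0]]
bias = [0.5, -0.5]

# Both paths are called identically and must agree on the result.
default_out = dispatch_linear(False)(x, weight, bias)
tuned_out = dispatch_linear(True)(x, weight, bias)
```

Because both callables share one signature, the call site stays a single line, as in the modified `apply`. The sketch types `bias` as optional on both paths because `apply` passes an `Optional[torch.Tensor]`.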