[ROCm] Cap Triton paged attention block size to fix ROCm shared memory OOM #38502
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1803,9 +1803,10 @@ steps: | |
| - tests/models/multimodal/generation | ||
| - tests/models/multimodal/test_mapping.py | ||
| commands: | ||
| - pip install git+https://github.com/TIGER-AI-Lab/Mantis.git | ||
| - pytest -v -s models/multimodal/generation -m 'not core_model' --ignore models/multimodal/generation/test_common.py | ||
| - pytest -v -s models/multimodal/test_mapping.py | ||
| - uv pip install --system --no-build-isolation 'git+https://github.com/AndreasKaratzas/mamba@rocm-7.0-v2.3.0' | ||
| - uv pip install --system --no-build-isolation 'git+https://github.com/Dao-AILab/causal-conv1d@v1.6.0' | ||
| - pytest -v -s models/language/generation -m hybrid_model --num-shards=$$BUILDKITE_PARALLEL_JOB_COUNT --shard-id=$$BUILDKITE_PARALLEL_JOB | ||
|
|
||
|
|
||
| - label: Multi-Modal Models (Extended Generation 2) # TBD | ||
| timeout_in_minutes: 180 | ||
|
|
@@ -1817,8 +1818,10 @@ steps: | |
| - vllm/ | ||
| - tests/models/multimodal/generation | ||
| commands: | ||
| - pip install git+https://github.com/TIGER-AI-Lab/Mantis.git | ||
| - pytest -v -s models/multimodal/generation/test_common.py -m 'split(group=0) and not core_model' | ||
| - uv pip install --system --no-build-isolation 'git+https://github.com/AndreasKaratzas/mamba@rocm-7.0-v2.3.0' | ||
| - uv pip install --system --no-build-isolation 'git+https://github.com/Dao-AILab/causal-conv1d@v1.6.0' | ||
| - pytest -v -s models/language/generation -m '(not core_model) and (not hybrid_model)' | ||
|
|
||
|
|
||
| - label: Multi-Modal Models (Extended Generation 3) # TBD | ||
| timeout_in_minutes: 180 | ||
|
|
@@ -3043,7 +3046,7 @@ steps: | |
| - vllm/ | ||
| - tests/models/language/generation | ||
| commands: | ||
| - uv pip install --system --no-build-isolation 'git+https://github.com/AndreasKaratzas/mamba@fix-rocm-7.0-warp-size-constexpr' | ||
| - uv pip install --system --no-build-isolation 'git+https://github.com/AndreasKaratzas/mamba@rocm-7.0-v2.3.0' | ||
|
Member: Will this be upstreamed?
Member (Author): I don't know tbh. For now I'm keeping it synced with upstream. This branch is basically upstream + my original fix. |
||
| - uv pip install --system --no-build-isolation 'git+https://github.com/Dao-AILab/causal-conv1d@v1.6.0' | ||
| - pytest -v -s models/language/generation -m '(not core_model) and (not hybrid_model)' | ||
|
|
||
|
|
||