pick up tuned prefill configs for FP8 FA3 #36265
Signed-off-by: Jonas M. Kübler <44084297+jmkuebler@users.noreply.github.com>
Code Review
This pull request updates the pinned commit for the flash-attention dependency to 192c71ae3fb2b474e06f5473bb1a7d41baefbd3f. According to the title, this is to incorporate tuned prefill configurations for FP8 FlashAttention 3. While pinning dependencies to a specific commit is a good practice for reproducibility, using a raw commit hash without any context makes the code harder to maintain. I've suggested adding a comment to clarify the purpose of this specific commit hash.
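One way to act on the review's suggestion is to keep the reason next to the hash, so the pin is never a bare value. The following is a minimal sketch in Python, not the project's actual build configuration: the `FLASH_ATTN_PIN` record, its field names, and the `is_documented_pin` helper are all hypothetical, and the stated reason is assumed from this PR's title and description.

```python
import re

# Hypothetical pin record; the field names are illustrative only and do not
# mirror the project's real build files.
FLASH_ATTN_PIN = {
    "commit": "192c71ae3fb2b474e06f5473bb1a7d41baefbd3f",
    # The context the review asks for: why this commit was chosen
    # (assumed here from the PR title and description).
    "reason": "tuned prefill configs for FP8 FA3 (vllm-project/flash-attention#125)",
}

# A full SHA-1 commit hash is exactly 40 lowercase hex characters.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def is_documented_pin(pin: dict) -> bool:
    """Return True if the pin has a full 40-character hash and a non-empty reason."""
    has_full_hash = bool(FULL_SHA1.fullmatch(pin.get("commit", "")))
    has_reason = bool(pin.get("reason", "").strip())
    return has_full_hash and has_reason
```

A check like this could run in a lint step so that an abbreviated hash or an undocumented pin is flagged before review.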
Signed-off-by: Jonas Kuebler <kuebj@amazon.com>
Hi @jmkuebler, the pre-commit checks have failed. Please run:
    uv pip install pre-commit
    pre-commit install
    pre-commit run --all-files
Then, commit the changes and push to your branch.
This pull request has merge conflicts that must be resolved before it can be merged.
Signed-off-by: Jonas M. Kübler <44084297+jmkuebler@users.noreply.github.com> Signed-off-by: Jonas Kuebler <kuebj@amazon.com> Signed-off-by: Monishver Chandrasekaran <monishverchandrasekaran@gmail.com>
Purpose
Run CI for vllm-project/flash-attention#125
Benchmarking results are in the FA PR (vllm-project/flash-attention#125).