Commit 1181c5d
authored
More realistic bench for POD Attn (#2013)
<!-- .github/pull_request_template.md -->
## 📌 Description
Use real head sizes, seq lens and add comparison with sequential prefill
+ decode.
Results on H100 (without overlap, which only adds ~150GB/s for
persistent):
<img width="433" height="571" alt="image"
src="https://github.com/user-attachments/assets/50de01cd-e5ca-450c-9cc0-521d83b7e487"
/>
cc @yzh119
## 🔍 Related Issues
<!-- Link any related issues here -->
## 🚀 Pull Request Checklist
Thank you for contributing to FlashInfer! Before we review your pull
request, please make sure the following items are complete.
### ✅ Pre-commit Checks
- [ ] I have installed `pre-commit` by running `pip install pre-commit`
(or used your preferred method).
- [ ] I have installed the hooks with `pre-commit install`.
- [ ] I have run the hooks manually with `pre-commit run --all-files`
and fixed any reported issues.
> If you are unsure about how to set up `pre-commit`, see [the
pre-commit documentation](https://pre-commit.com/).
## 🧪 Tests
- [ ] Tests have been added or updated as needed.
- [ ] All tests are passing (`unittest`, etc.).
## Reviewer Notes
<!-- Optional: anything you'd like reviewers to focus on, concerns, etc.
-->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Added comprehensive performance benchmarking for batch attention
operations with detailed timing measurements.
* Introduced sequential dual-kernel benchmark path with extended memory
bandwidth reporting.
* **Tests**
* Updated benchmark test configurations to use deterministic, fixed
values for improved reproducibility.
* Adjusted benchmark parameters for consistency across test iterations.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->1 parent de4c701 commit 1181c5d
1 file changed
+69
-62
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
72 | 72 | | |
73 | 73 | | |
74 | 74 | | |
| 75 | + | |
| 76 | + | |
| 77 | + | |
| 78 | + | |
| 79 | + | |
| 80 | + | |
| 81 | + | |
| 82 | + | |
| 83 | + | |
| 84 | + | |
| 85 | + | |
| 86 | + | |
| 87 | + | |
| 88 | + | |
| 89 | + | |
| 90 | + | |
| 91 | + | |
| 92 | + | |
75 | 93 | | |
76 | 94 | | |
77 | 95 | | |
| |||
123 | 141 | | |
124 | 142 | | |
125 | 143 | | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
126 | 179 | | |
127 | 180 | | |
128 | 181 | | |
| 182 | + | |
| 183 | + | |
129 | 184 | | |
130 | 185 | | |
131 | 186 | | |
| |||
137 | 192 | | |
138 | 193 | | |
139 | 194 | | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
140 | 203 | | |
141 | 204 | | |
142 | 205 | | |
143 | 206 | | |
144 | 207 | | |
145 | 208 | | |
146 | 209 | | |
147 | | - | |
148 | | - | |
149 | | - | |
150 | | - | |
151 | | - | |
152 | | - | |
153 | | - | |
154 | | - | |
155 | | - | |
156 | | - | |
157 | | - | |
158 | | - | |
159 | | - | |
160 | | - | |
161 | | - | |
162 | | - | |
163 | | - | |
164 | | - | |
165 | | - | |
166 | | - | |
167 | | - | |
168 | | - | |
169 | | - | |
170 | | - | |
171 | | - | |
172 | | - | |
173 | | - | |
174 | | - | |
175 | | - | |
176 | | - | |
177 | | - | |
178 | | - | |
179 | | - | |
180 | | - | |
181 | | - | |
182 | | - | |
183 | | - | |
184 | | - | |
185 | | - | |
186 | | - | |
187 | | - | |
188 | | - | |
189 | | - | |
190 | | - | |
191 | | - | |
192 | | - | |
193 | | - | |
194 | | - | |
195 | | - | |
196 | | - | |
197 | | - | |
198 | | - | |
199 | | - | |
200 | | - | |
201 | | - | |
202 | | - | |
203 | | - | |
204 | | - | |
205 | | - | |
206 | | - | |
| 210 | + | |
| 211 | + | |
| 212 | + | |
| 213 | + | |
207 | 214 | | |
208 | 215 | | |
209 | | - | |
210 | | - | |
| 216 | + | |
| 217 | + | |
211 | 218 | | |
212 | 219 | | |
213 | 220 | | |
| |||
0 commit comments