Merged
Commits
118 commits
a5fc250
Attempt 1
cakeng Sep 24, 2025
c95041b
Top k works?
cakeng Sep 24, 2025
fe60b22
Top k works?
cakeng Sep 24, 2025
74a18b5
Ternary search
cakeng Sep 24, 2025
7502c06
Quadruple Search
cakeng Sep 25, 2025
360e234
Quadruple Search
cakeng Sep 25, 2025
11bd61f
Added outliers
cakeng Sep 25, 2025
a922b45
Added gather
cakeng Sep 25, 2025
6f39f20
Added gather
cakeng Sep 25, 2025
30033c2
0.00115 for topk
cakeng Sep 25, 2025
2987617
0.00115 for topk
cakeng Sep 25, 2025
ba5b98b
topk working, adding topp:
cakeng Sep 25, 2025
46bcc7d
Wrong results
cakeng Sep 25, 2025
5de5ece
Wrong results
cakeng Sep 25, 2025
cbcf7f5
Fixed?
cakeng Sep 26, 2025
643c21d
Fixed?
cakeng Sep 26, 2025
2737c2d
Maybe?
cakeng Sep 26, 2025
f24d2e1
Duplicate logit issues.
cakeng Sep 26, 2025
a58ca6c
Duplicate logit issues.
cakeng Sep 26, 2025
b87c095
Top-p duplicate handler implemented
cakeng Sep 26, 2025
6e3ca0a
Top-p fixed
cakeng Sep 26, 2025
034e802
Need to implement topp-only, topk and topk-topp works.
cakeng Sep 26, 2025
1114582
Correctness tested for top-p. Duplication handling for top-k remaining.
cakeng Sep 27, 2025
56a615f
Deeseep tests
cakeng Sep 27, 2025
6bea89c
Added env var VLLM_USE_TRITON_SAMPLER and automated test
cakeng Sep 27, 2025
8309b68
Merge remote-tracking branch 'origin/main' into topk_topp
cakeng Sep 27, 2025
5575c67
Linter
cakeng Sep 27, 2025
3342235
Tests
cakeng Sep 27, 2025
9bb0fbb
Added Triton autotune
cakeng Sep 27, 2025
340b6b4
Reduce diff and do fallback when batch size small.
cakeng Sep 28, 2025
54df27f
Merge remote-tracking branch 'origin/main' into topk_topp
cakeng Sep 28, 2025
cf768c2
Test script fix
cakeng Sep 28, 2025
9b3cf75
Added graph generation
cakeng Sep 28, 2025
4235295
Removed fallback
cakeng Sep 28, 2025
1e3ed75
Merge branch 'vllm-project:main' into topk_topp
cakeng Sep 28, 2025
344c3e4
Added Gemini's suggestions, removed triton autotune.
cakeng Sep 28, 2025
ba89c38
Merge branch 'topk_topp' of https://github.com/cakeng/vllm into topk_…
cakeng Sep 28, 2025
da1b1e6
Fixed warps and stages
cakeng Sep 28, 2025
289c2ba
Fixed scratchpads
cakeng Sep 28, 2025
5b0b1e6
Fixed scratchpads
cakeng Sep 28, 2025
865b523
Merge branch 'main' into topk_topp
cakeng Oct 8, 2025
350cbc8
Merge branch 'main' into topk_topp
cakeng Oct 13, 2025
5e6156c
Init Sunga's correct triton top_k top_p implementation
ddsoup0401 Oct 23, 2025
7401ead
initial commit
ddsoup0401 Oct 23, 2025
d8fac6a
init commit
ddsoup0401 Oct 26, 2025
1d349d3
not working.........
ddsoup0401 Oct 27, 2025
b9a0c05
working on it....
ddsoup0401 Oct 27, 2025
71c5978
working........python compare.py
ddsoup0401 Oct 27, 2025
115a98b
...
ddsoup0401 Oct 28, 2025
f9b08f2
...
ddsoup0401 Oct 28, 2025
b8728db
slow but working
ddsoup0401 Nov 1, 2025
953025e
very slow
ddsoup0401 Nov 13, 2025
d1ca674
pushed?
ddsoup0401 Nov 14, 2025
5697d83
Top-k working
cakeng Nov 14, 2025
a2f6ae6
Errors on top p
cakeng Nov 15, 2025
2893ed5
Everything correct but slow
cakeng Nov 15, 2025
6e3c874
Everything correct but slow
cakeng Nov 15, 2025
d0f491e
Fast and working correctly
cakeng Nov 16, 2025
6743e12
Fast and working correctly
cakeng Nov 16, 2025
60b6515
Errors
cakeng Nov 16, 2025
71cbb9e
Filtered logits are wrong
cakeng Nov 16, 2025
8b0771c
Filtered logits are wrong
cakeng Nov 16, 2025
20806a2
Floating point associativity errors remain
cakeng Nov 16, 2025
f8cc453
Merge main
cakeng Nov 16, 2025
89443c0
Remove tester
cakeng Nov 16, 2025
204c221
Bugfix
cakeng Nov 16, 2025
e262fcb
Test file removed.
cakeng Nov 16, 2025
5e6dc79
Typos
cakeng Nov 16, 2025
091b518
Typos
cakeng Nov 16, 2025
5fc986e
Typos
cakeng Nov 16, 2025
02d446b
Typos
cakeng Nov 16, 2025
d0f02f6
Typos
cakeng Nov 16, 2025
152bc32
Bugfixes
cakeng Nov 17, 2025
db9859f
Deduplication
cakeng Nov 18, 2025
b936c94
Duplication search bugfix
cakeng Nov 18, 2025
3784e60
Bugfixes
cakeng Nov 18, 2025
b0b6253
PyTorch sort permutes the order of duplicate values when sorting. Whe…
cakeng Nov 18, 2025
cd98ab9
Original PyTorch implementation applies softmax after sorting, which pro…
cakeng Nov 18, 2025
b72e207
Helper scripts
cakeng Nov 19, 2025
b1152c1
Helper scripts removed
cakeng Nov 19, 2025
d2d56a1
Change hyperparameters
cakeng Nov 19, 2025
6421e1e
Merge main
cakeng Nov 19, 2025
7643eab
[Perf] Triton-based top-p/top-k masking
njhill Jan 18, 2026
5a241a6
fix doc
njhill Jan 19, 2026
b017713
fix method name, only use triton when supported
njhill Jan 21, 2026
bd5d241
Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
njhill Jan 21, 2026
e067cbf
fix precision
njhill Jan 23, 2026
a02aee8
Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
njhill Jan 23, 2026
fbeb15f
Merge commit 'refs/pull/32558/head' of https://github.com/vllm-projec…
cakeng Feb 2, 2026
463afa6
Copied topk + topp impl
cakeng Feb 2, 2026
9a5f30d
Copied topk + topp impl
cakeng Feb 2, 2026
65874cc
Topp wrong
cakeng Feb 2, 2026
a671a09
Topp working, topp only
cakeng Feb 2, 2026
cf6ab55
Both Topk Topp working
cakeng Feb 2, 2026
150ccc6
Restored tests
cakeng Feb 2, 2026
ae08705
Bugfix
cakeng Feb 2, 2026
49c3c39
Loosened hyperparameters
cakeng Feb 2, 2026
06565df
Linter
cakeng Feb 2, 2026
acd99d7
Restore
cakeng Feb 2, 2026
ca3fff6
Merge branch 'main' into triton-topk-topp
cakeng Feb 2, 2026
b9d2275
Merge branch 'topk_topp' into triton-topk-topp
cakeng Feb 2, 2026
37f322a
Update vllm/v1/sample/ops/topk_topp_triton.py
cakeng Feb 2, 2026
cb731c5
Bugfix
cakeng Feb 2, 2026
0c61b95
Refactor comments for clarity in topk_topp_triton.py
cakeng Feb 2, 2026
503f0b0
Pre-commit fix
cakeng Feb 2, 2026
576f90e
Update arxiv
cakeng Feb 8, 2026
c18fe71
adjust prob distribution in benchmark, adjust threshold
njhill Feb 12, 2026
dba83d5
Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
njhill Feb 12, 2026
c246c3a
some simplification/cleanup
njhill Feb 12, 2026
4360e92
fix precommit
njhill Feb 13, 2026
612f38f
Merge branch 'main' into triton-topk-topp
njhill Feb 13, 2026
47ef82d
Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
njhill Feb 14, 2026
b917a49
fix -inf edge cases and possible infinite loop
njhill Feb 16, 2026
ef5d06e
Merge remote-tracking branch 'origin/main' into triton-topk-topp
njhill Feb 16, 2026
9dadec1
add async yield in cancellation test
njhill Feb 17, 2026
8e02756
Merge remote-tracking branch 'refs/remotes/origin/main' into triton-t…
njhill Feb 17, 2026
0e46d90
use temperature=0 in cancellation test
njhill Feb 17, 2026
f7873c5
Merge remote-tracking branch 'origin/main' into triton-topk-topp
njhill Feb 17, 2026