Merged
36 commits
eb0fc89
[Core] Add compute_encoder_metadata() utility for vision encoders
b-mu Feb 13, 2026
aed0922
[Config] Add CUDA graph config flags for vision encoders
b-mu Feb 13, 2026
69be2ff
[Core] Add EncoderCudaGraphManager infrastructure
b-mu Feb 13, 2026
43dd940
[Model] Add encoder_metadata kwarg to Qwen3 ViT forward
b-mu Feb 13, 2026
76e26a3
[Core] Add per_frame parameter to compute_encoder_metadata
b-mu Feb 13, 2026
ccac2ce
[Core] Implement CUDA graph capture for encoder
b-mu Feb 13, 2026
b9bfe47
[Core] Implement encoder CUDA graph replay
b-mu Feb 13, 2026
53c524d
[Core] Integrate EncoderCudaGraphManager into runtime
b-mu Feb 13, 2026
3442b57
[Core] Add DPVisionShardingMeta dataclass
b-mu Feb 13, 2026
a46a30c
[Core] Add dp_shard_vision_inputs utility
b-mu Feb 13, 2026
7c06f60
[Core] Add dp_gather_vision_outputs utility
b-mu Feb 13, 2026
46f5d06
[Core] Add DP support to EncoderCudaGraphManager
b-mu Feb 13, 2026
f5ecf03
[Fix] Fix runtime issues for CUDA graph capture
b-mu Feb 14, 2026
4191a50
[Fix] Use rectangular grids for exact encoder CUDA graph budgets
b-mu Feb 14, 2026
406d101
[Enhancement] Add stats logging for encoder CUDA graph execution
b-mu Feb 14, 2026
6bf3125
[Fix] Fix max_seqlen and spatial_merge_size for encoder CUDA graph
b-mu Feb 18, 2026
85f8dd4
[Fix] Fix encoder CUDA graph case when num_images > max_batch_size
b-mu Feb 19, 2026
b2abae8
[Refactor] Extract token count helpers in encoder_cudagraph
b-mu Feb 19, 2026
5359ecf
[Test] Add unit tests for encoder CUDA graph manager
b-mu Feb 19, 2026
54e27f4
[Test] Add GPU tests for EncoderCudaGraphManager capture/replay
b-mu Feb 19, 2026
4aad7f9
[Refactor] Rename _find_budget_graph to _find_smallest_fitting_budget…
b-mu Feb 20, 2026
24f5a1d
[Enhancement] Add greedy packing to encoder CUDA graph execution
b-mu Feb 20, 2026
9d1b19b
[Enhancement] Add FlashInfer cuDNN support for encoder CUDA graph
b-mu Mar 3, 2026
3253df2
[Refactor] Split metadata_buffers into embed_buffers and sequence_met…
b-mu Mar 3, 2026
caecc20
[Cleanup] Remove unused compute_encoder_metadata()
b-mu Mar 4, 2026
599eb29
[Refactor] Add encoder CUDA graph dataclasses
b-mu Mar 7, 2026
a1b483f
[Refactor] Add SupportsEncoderCudaGraph protocol
b-mu Mar 7, 2026
c28510d
[Refactor] Implement SupportsEncoderCudaGraph on Qwen3VLForConditiona…
b-mu Mar 7, 2026
257c315
[Refactor] Refactor EncoderCudaGraphManager to use SupportsEncoderCud…
b-mu Mar 7, 2026
b9ec69d
[Bugfix] Unwrap CUDAGraphWrapper before SupportsEncoderCudaGraph check
b-mu Mar 8, 2026
31a1310
[Bugfix] Fix tensor-scalar aliasing and output buffer overwrite in en…
b-mu Mar 8, 2026
ff09e9f
[Refactor] Extract prepare_encoder_metadata() to remove duplicates in…
b-mu Mar 13, 2026
dc25036
[Refactor] Extract helpers to remove duplicates in encoder CUDA graph…
b-mu Mar 13, 2026
c8e7bfd
[CI] Fix ruff formatting and line length
b-mu Mar 13, 2026
8741ea3
[Refactor] Move imports to module level in encoder CUDA graph manager
b-mu Mar 19, 2026
613474a
[Feature] Auto-infer encoder CUDA graph token budgets and max images …
b-mu Mar 20, 2026