Pull requests: HabanaAI/vllm-fork

Pull requests list

support inc dynamic quantization
#803 opened Feb 8, 2025 by changwangss
Qwen2 vl
#802 opened Feb 7, 2025 by malkomes Draft
Skip sampling softmax
#801 opened Feb 7, 2025 by attafosu
mszu/merged scheduler
#799 opened Feb 7, 2025 by szutenberg Draft
[WIP] Updating docs for the vLLM 1.20 release
#798 opened Feb 7, 2025 by PatrykWo
[WIP]Deepseek r1 reuse kcache
#797 opened Feb 7, 2025 by jikunshang
Pin triton to v3.1.0 for HPU
#796 opened Feb 7, 2025 by iboiko-habana
Pin triton to v3.1.0 for HPU
#795 opened Feb 7, 2025 by iboiko-habana
Support qwenvl model for HPU
#793 opened Feb 7, 2025 by yingjie-han
Enable roberta embedding
#786 opened Feb 5, 2025 by yeonsily
Improve RMSNorm to support 2D inputs
#784 opened Feb 5, 2025 by YangQun1
[SW-207299] Recalc scales from user
#774 opened Feb 3, 2025 by linoybu
Updated Troubleshooting section
#766 opened Jan 31, 2025 by MohitIntel
Fix warmup padding
#759 opened Jan 30, 2025 by mfylcek Draft
Initial enablement for text-embedding
#758 opened Jan 30, 2025 by libinta
Add basic CI checks for enc dec models (label: question)
#741 opened Jan 27, 2025 by jkaniecki
Allow tests to run in t.compile
#724 opened Jan 22, 2025 by Kacper-Pietkun
Delayed sampling
#720 opened Jan 22, 2025 by mfylcek Draft