Issues: vllm-project/vllm
#6145 [Bug]: When using tp for inference, an error occurs: Worker VllmWorkerProcess pid 3283517 died, exit code: -15. (bug, opened Jul 4, 2024 by B-201)
#6141 [Usage]: Internal server error when serving LoRA adapters with the OpenAI-compatible vLLM server (usage, opened Jul 4, 2024 by ebi64)
#6137 [Bug]: Spec. decode fails for requests with n>1 or best_of>1 (bug, opened Jul 4, 2024 by tdoublep)
#6135 [Bug]: Phi-3 long context (longrope) doesn't work with fp8 kv cache (bug, opened Jul 4, 2024 by jphme)
#6134 [Installation]: Couldn't find CUDA library root. (installation, opened Jul 4, 2024 by CodexDive)
#6129 [Bug]: Disable log requests and disable log stats do not work (bug, opened Jul 4, 2024 by wufxgtihub123)
#6128 [Usage]: Does vLLM now support embedding inputs? I could not find a related interface. (usage, opened Jul 4, 2024 by zhanghang-official)
#6126 [Bug]: RuntimeError: No suitable kernel. h_in=16 h_out=7392 dtype=Float out_dtype=BFloat16 (bug, opened Jul 4, 2024 by JJJJerry)
#6123 [Feature]: Multi-LoRA support for older NVIDIA GPUs (feature request, opened Jul 4, 2024 by wuisawesome)
#6116 [Bug]: Mixtral 8x7b FP8 encounters illegal memory access in custom_all_reduce.cuh (bug, opened Jul 3, 2024 by ferdiko)
#6106 [Bug]: Ray cluster segmentation fault (bug, opened Jul 3, 2024 by warlockedward)
#6105 GPU utilization goes down as concurrent requests increase (bug, opened Jul 3, 2024 by jerin-scalers-ai)
#6103 [Bug]: fused_moe_kernel compile bug (bug, opened Jul 3, 2024 by jeejeelee)
#6100 [Feature]: Add LoRA support for BloomForCausalLM (feature request, opened Jul 3, 2024 by wangzhe258369)
#6099 [Bug]: enable_prefix_caching causes a Triton crash (bug, opened Jul 3, 2024 by sweetning0809)
#6086 [Bug]: Flashinfer stuck with CUDA Graph (bug, opened Jul 3, 2024 by Juelianqvq)
#6084 [Misc]: Best practices for accelerating and deploying the Llava series & Phi3-Vision using vLLM (misc, opened Jul 3, 2024 by Jintao-Huang)
#6078 [Feature]: Add support for interchangeable radix attention (feature request, opened Jul 2, 2024 by yifan1130)
#6073 [Feature]: Add readiness endpoint /ready and return /health earlier (vLLM on Kubernetes) (feature request, opened Jul 2, 2024 by frittentheke)