Issues: vllm-project/vllm
- [Usage]: Can we extend the context length of gemma2 model or other models? (usage) · #10548 opened Nov 21, 2024 by hahmad2008
- [Installation]: can't get the cu118 version of vllm 0.6.3 by https://github.com/vllm-project/vllm/releases/download/v0.6.3/vllm-0.6.3+cu118-cp310-cp310-manylinux1_x86_64.whl (installation) · #10540 opened Nov 21, 2024 by mayfool
- [Feature]: Support for Registering Model-Specific Default Sampling Parameters (feature request) · #10539 opened Nov 21, 2024 by yansh97
- [Usage]: How to use ROPE scaling for llama3.1 and gemma2? (usage) · #10537 opened Nov 21, 2024 by hahmad2008
- [Usage]: Fail to load params.json (usage) · #10534 opened Nov 21, 2024 by dequeueing
- [Bug]: Authorization ignored when root_path is set (bug) · #10531 opened Nov 21, 2024 by OskarLiew
- [Usage]: Optimizing TTFT for Qwen2.5-72B Model Deployment on A800 GPUs for RAG Application (usage) · #10527 opened Nov 21, 2024 by zhanghx0905
- [Feature]: Additional possible value for tool_choice: required (feature request) · #10526 opened Nov 21, 2024 by fahadh4ilyas
- [Bug]: torch.distributed.DistBackendError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:1333, unhandled system error (run with NCCL_DEBUG=INFO for details), NCCL version 2.18.1 (bug) · #10523 opened Nov 21, 2024 by QualityGN
- [Usage]: When I set --tensor-parallel-size 4, the OpenAI server does not work and reports a new exception (usage) · #10521 opened Nov 21, 2024 by Geek-Peng
- [Usage]: What's the relationship between KV cache and MAX_SEQUENCE_LENGTH? (usage) · #10517 opened Nov 21, 2024 by GRuuuuu
- [Bug]: Model does not split across multiple GPUs; instead it occupies the same memory on each GPU (bug) · #10516 opened Nov 21, 2024 by anilkumar0502
- [Feature]: Manually inject Prefix KV Cache (feature request) · #10515 opened Nov 21, 2024 by toilaluan
- [Bug]: I'm trying to run Pixtral-Large-Instruct-2411 using vllm, following the documentation at https://huggingface.co/mistralai/Pixtral-Large-Instruct-2411, but I encountered an error (bug) · #10512 opened Nov 21, 2024 by eii-lyl
- Metrics model name when using multiple loras (bug) · #10504 opened Nov 20, 2024 by mces89
- [Feature]: Support outlines versions > v0.1.0 (feature request) · #10489 opened Nov 20, 2024 by Treparme
- [Bug]: speculative_draft_tensor_parallel_size=4 fails; it cannot be any value other than 1 (bug) · #10483 opened Nov 20, 2024 by chenchunhui97
- [Bug]: LLaMA3.2-1B finetuned w/ Sentence Transformer --> ValueError: Model architectures ['LlamaModel'] are not supported for now (bug) · #10481 opened Nov 20, 2024 by thusinh1969
- [Bug]: vLLM CPU mode broken: Unable to get JIT kernel for brgemm (bug) · #10478 opened Nov 20, 2024 by samos123
- [Usage]: Can't use vLLM on a multi-GPU node (usage) · #10474 opened Nov 20, 2024 by 4k1s
- [Installation]: vLLM on ARM machine with GH200 (installation) · #10459 opened Nov 19, 2024 by Phimanu
- [Bug]: Unable to configure formatter 'vllm' (bug) · #10457 opened Nov 19, 2024 by pspdada