Issues: triton-inference-server/tensorrtllm_backend
Issues list
Missing lookAheadRuntimeConfig in Triton Server with TensorRT-LLM backend HTTP Request
bug
Something isn't working
#711
opened Feb 18, 2025 by
shaylapid
2 of 4 tasks
Tritonserver Fails to Start with TensorRT-LLM Backend with lookahead_decoding mode - Assertion Failure in lookaheadDecodingLayer.cpp
bug
Something isn't working
#710
opened Feb 18, 2025 by
shaylapid
2 of 4 tasks
Failed to build TensorRT-LLM whisper Decoder
bug
Something isn't working
#707
opened Feb 14, 2025 by
muhammad-faizan-122
4 tasks
Inconsistent Batch Index Order in Decoupled Mode with trt-llm and triton trtllm backend
bug
Something isn't working
#705
opened Feb 14, 2025 by
Oldpan
2 of 4 tasks
'numpy.ndarray' object is not callable in "gpt2/1/lib/triton_decoder.py", line 160, in convert_triton_request
bug
Something isn't working
#702
opened Feb 12, 2025 by
freedom-168
4 tasks
Mllama ignores input image when deployed in triton
bug
Something isn't working
#692
opened Feb 5, 2025 by
mutkach
2 of 4 tasks
Performance of triton+trtllm on llava-onevision compared to vllm and sglang
#689
opened Feb 3, 2025 by
alexemme
Unable to build from source for tag v0.16.0.
bug
Something isn't working
#686
opened Jan 30, 2025 by
jingzhaoou
2 of 4 tasks
DeepSeek-R1-Distill-Qwen-32B FP16 model does not work with Triton server + tensorrtllm_backend (but it works with just TensorRT-LLM)
bug
Something isn't working
#685
opened Jan 30, 2025 by
kelkarn
2 of 4 tasks
What is the purpose of shm-region-prefix-name and what is the prefix0_ files used for?
#684
opened Jan 28, 2025 by
sugam-nexusflow
"error": "Unable to parse 'inputs': attempt to access non-existing object member 'inputs'"
#683
opened Jan 28, 2025 by
adityarap
Beam search diversity lost with in-flight batching
bug
Something isn't working
#682
opened Jan 24, 2025 by
Grace-YingHuang
2 of 4 tasks
Assertion failed: sizeof(T) <= remaining_buffer_size
bug
Something isn't working
#679
opened Jan 14, 2025 by
gawain000000
2 of 4 tasks
Inference error encountered while using the draft target model.
bug
Something isn't working
#678
opened Jan 13, 2025 by
pimang62
2 of 4 tasks
Why tensorrt_llm_bls backend doesn't support speculative decoding streaming or bsz > 1?
#676
opened Jan 9, 2025 by
meowcoder22
Whisper - Missing parameters for triton deployment using tensorrt_llm backend
bug
Something isn't working
#672
opened Jan 2, 2025 by
eleapttn
2 of 4 tasks