Support soft_prompt or inputs_embeds? #267

Open
jessiewiswjc opened this issue Dec 28, 2023 · 2 comments
Labels: question (Further information is requested), triaged (Issue has been triaged by maintainers)

Comments

@jessiewiswjc

Does triton-inference-server support multi-modal models such as the BLIP2 example in TensorRT-LLM (https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/blip2)?

@juney-nvidia
Collaborator

@jessiewiswjc

Hi, although we haven't provided a BLIP2 pipeline example in the Triton backend repo, the inference steps can be run one after another, so nothing should block you from calling run.py from the Triton backend.
Did you run into a specific issue?

June
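
For readers following the thread, here is a minimal sketch of what "organized in sequential order" could look like for a BLIP2-style pipeline: encode the image, embed the text prompt, prepend the visual embeddings as a soft prompt, then call the language model. Every function name below (`encode_image`, `embed_text`, `build_inputs_embeds`, `run_llm`, `run_pipeline`) is a hypothetical stand-in written for illustration. None of them is a Triton or TensorRT-LLM API, and the arithmetic inside them is placeholder logic, not a real model.

```python
from typing import List

Vector = List[float]


def encode_image(pixels: List[float]) -> List[Vector]:
    """Hypothetical stand-in for the vision encoder and Q-Former stage."""
    # Emit two fake "visual tokens" of embedding width 2.
    mean = sum(pixels) / len(pixels)
    return [[mean, 0.0], [0.0, mean]]


def embed_text(token_ids: List[int]) -> List[Vector]:
    """Hypothetical stand-in for the LLM's token embedding lookup."""
    return [[float(t), 1.0] for t in token_ids]


def build_inputs_embeds(visual: List[Vector], text: List[Vector]) -> List[Vector]:
    """Prepend the visual embeddings (the soft prompt) to the text embeddings."""
    return visual + text


def run_llm(inputs_embeds: List[Vector]) -> str:
    """Hypothetical stand-in for the language model call."""
    return f"generated from {len(inputs_embeds)} embedding positions"


def run_pipeline(pixels: List[float], token_ids: List[int]) -> str:
    """Run the stages strictly one after another."""
    visual = encode_image(pixels)
    text = embed_text(token_ids)
    return run_llm(build_inputs_embeds(visual, text))


if __name__ == "__main__":
    print(run_pipeline([1.0, 3.0], [5, 6, 7]))
```

In a real deployment each stage would be its own model or preprocessing step; the sketch only shows the ordering and where the soft prompt is joined to the text embeddings.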

@juney-nvidia juney-nvidia self-assigned this Jan 1, 2024
@juney-nvidia juney-nvidia added question Further information is requested triaged Issue has been triaged by maintainers labels Jan 1, 2024
@jessiewiswjc
Author

> @jessiewiswjc
>
> Hi, although we haven't provided a BLIP2 pipeline example in the Triton backend repo, the inference steps can be run one after another, so nothing should block you from calling run.py from the Triton backend. Did you run into a specific issue?
>
> June

I am sorry to hear about NVIDIA/TensorRT-LLM#695 (comment). Does the preprocessing in Triton not support multi-modal models?
