Commit 2cd1fa4

[Misc] add Haystack integration (#18601)
Signed-off-by: reidliu41 <[email protected]> Co-authored-by: reidliu41 <[email protected]>
1 parent d4c2919 commit 2cd1fa4

File tree: 1 file changed (+60 -0 lines)

---
title: Haystack
---

[](){ #deployment-haystack }

# Haystack
7+
8+
[Haystack](https://github.com/deepset-ai/haystack) is an end-to-end LLM framework that allows you to build applications powered by LLMs, Transformer models, vector search and more. Whether you want to perform retrieval-augmented generation (RAG), document search, question answering or answer generation, Haystack can orchestrate state-of-the-art embedding models and LLMs into pipelines to build end-to-end NLP applications and solve your use case.
9+
10+
It allows you to deploy a large language model (LLM) server with vLLM as the backend, which exposes OpenAI-compatible endpoints.
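
For orientation, this is roughly the JSON body that an OpenAI-compatible `/v1/chat/completions` endpoint (and therefore vLLM's server) expects. The sketch below only constructs the payload rather than sending it, so it runs without a live server; the host, port, and field values are illustrative placeholders:

```python
import json

# Hypothetical server location; substitute your vLLM host and port.
base_url = "http://localhost:8000/v1"

# Minimal chat-completions payload in the shape defined by the OpenAI API,
# which vLLM's OpenAI-compatible server accepts as-is.
payload = {
    "model": "mistralai/Mistral-7B-Instruct-v0.1",
    "messages": [
        {"role": "user", "content": "Hi. Can you help me plan my next trip to Italy?"}
    ],
    "max_tokens": 512,
}

# Serialize for an HTTP POST to f"{base_url}/chat/completions".
body = json.dumps(payload)
print(body)
```

Haystack's `OpenAIChatGenerator` builds and sends exactly this kind of request for you, as shown in the deployment steps below.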

## Prerequisites

- Set up the vLLM and Haystack environment:

```console
pip install vllm haystack-ai
```

## Deploy

- Start the vLLM server with a supported chat completion model, e.g.:

```console
vllm serve mistralai/Mistral-7B-Instruct-v0.1
```
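
By default the vLLM server listens on port 8000. As a quick sanity check (a sketch, not part of the original doc), the helper below builds a request for the server's `/v1/models` listing; it only constructs the request object, so you can pass it to `urllib.request.urlopen` once the server is actually running:

```python
import urllib.request

def models_request(host: str = "localhost", port: int = 8000) -> urllib.request.Request:
    """Build (but do not send) a GET request for the vLLM /v1/models listing."""
    return urllib.request.Request(f"http://{host}:{port}/v1/models")

req = models_request()
print(req.full_url)  # http://localhost:8000/v1/models
```

The same `http://{host}:{port}/v1` base URL is what you pass as `api_base_url` in the Haystack generator below.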

- Use the `OpenAIGenerator` and `OpenAIChatGenerator` components in Haystack to query the vLLM server.

```python
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage
from haystack.utils import Secret

generator = OpenAIChatGenerator(
    # for compatibility with the OpenAI API, a placeholder api_key is needed
    api_key=Secret.from_token("VLLM-PLACEHOLDER-API-KEY"),
    model="mistralai/Mistral-7B-Instruct-v0.1",
    api_base_url="http://{your-vLLM-host-ip}:{your-vLLM-host-port}/v1",
    generation_kwargs={"max_tokens": 512},
)

response = generator.run(
    messages=[ChatMessage.from_user("Hi. Can you help me plan my next trip to Italy?")]
)

print("-" * 30)
print(response)
print("-" * 30)
```

Example output:

```console
------------------------------
{'replies': [ChatMessage(_role=<ChatRole.ASSISTANT: 'assistant'>, _content=[TextContent(text=' Of course! Where in Italy would you like to go and what type of trip are you looking to plan?')], _name=None, _meta={'model': 'mistralai/Mistral-7B-Instruct-v0.1', 'index': 0, 'finish_reason': 'stop', 'usage': {'completion_tokens': 23, 'prompt_tokens': 21, 'total_tokens': 44, 'completion_tokens_details': None, 'prompt_tokens_details': None}})]}
------------------------------
```
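
The result of `generator.run(...)` is a dict whose `replies` list holds `ChatMessage` objects carrying the reply text and per-request metadata. The sketch below mirrors that structure with plain dicts (so it runs without a live server or Haystack installed); with a real response you would read `response["replies"][0].text` and `.meta` instead:

```python
# Plain-dict stand-in for the result shape printed above; the field
# names and values mirror that example output, not a live server call.
response = {
    "replies": [
        {
            "text": " Of course! Where in Italy would you like to go and what type of trip are you looking to plan?",
            "meta": {
                "model": "mistralai/Mistral-7B-Instruct-v0.1",
                "finish_reason": "stop",
                "usage": {"completion_tokens": 23, "prompt_tokens": 21, "total_tokens": 44},
            },
        }
    ]
}

reply = response["replies"][0]
usage = reply["meta"]["usage"]

# The server's token accounting satisfies this identity.
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]

print(reply["text"].strip())
print(f"tokens used: {usage['total_tokens']}")
```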

For details, see the tutorial [Using vLLM in Haystack](https://github.com/deepset-ai/haystack-integrations/blob/main/integrations/vllm.md).
