[Docs] - Vector Store RAG Starter Flow (langflow-ai#1850)

mendonk · web-flow · commit 42714d35f152 · 2024-05-07T18:39:24.000-03:00
* initial-content

* additional-query
diff --git a/docs/docs/starter-projects/vector-store-rag.mdx b/docs/docs/starter-projects/vector-store-rag.mdx
@@ -0,0 +1,108 @@
+import ThemedImage from "@theme/ThemedImage";
+import useBaseUrl from "@docusaurus/useBaseUrl";
+import ZoomableImage from "/src/theme/ZoomableImage.js";
+import ReactPlayer from "react-player";
+import Admonition from "@theme/Admonition";
+
+# Vector store RAG
+
+Retrieval Augmented Generation, or RAG, is a pattern for training LLMs on your data and querying it.
+
+RAG is backed by a **vector store**, a vector database which stores embeddings of the ingested data.
+
+This enables **vector search**, a more powerful and context-aware search.
+
+We've chosen [Astra DB](https://astra.datastax.com/signup?utm_source=langflow-pre-release&utm_medium=referral&utm_campaign=langflow-announcement&utm_content=create-a-free-astra-db-account) as the vector database for this starter project, but you can follow along with any of Langflow's vector database options.
+
+## Prerequisites
+
+<Admonition type="info">
+  Langflow v1.0 alpha is also available in [HuggingFace Spaces](https://huggingface.co/spaces/Langflow/Langflow-Preview?duplicate=true). Try it out or follow the instructions [here](../getting-started/huggingface-spaces) to install it locally.
+</Admonition>
+
+* [Langflow installed](../getting-started/install-langflow.mdx)
+
+* [OpenAI API key](https://platform.openai.com)
+
+* [An Astra DB vector database created](https://docs.datastax.com/en/astra-db-serverless/get-started/quickstart.html) with:
+    * Application token (`AstraCS:WSnyFUhRxsrg…​`)
+    * API endpoint (`https://ASTRA_DB_ID-ASTRA_DB_REGION.apps.astra.datastax.com`)
+
+## Create the vector store RAG project
+
+1. From the Langflow dashboard, click **New Project**.
+2. Select **Vector Store RAG**.
+3. The **Vector Store RAG** flow is created.
+
+<ZoomableImage
+  alt="Docusaurus themed image"
+  sources={{
+    light: "img/vector-store-rag.png",
+    dark: "img/vector-store-rag.png",
+  }}
+  style={{ width: "80%", margin: "20px auto" }}
+/>
+
+The vector store RAG flow is built of two separate flows.
+
+The **ingestion** flow (bottom of the screen) populates the vector store with data from a local file.
+It ingests data from a file (**File**), splits it into chunks (**Recursive Character Text Splitter**), indexes it in Astra DB (**Astra DB**), and computes embeddings for the chunks (**OpenAI Embeddings**).
+This forms a "brain" for the query flow.
+
+The **query**  flow (top of the screen) allows users to chat with the embedded vector store data. It's a little more complex:
+
+* **Chat Input** component defines where to put the user input coming from the Playground.
+* **OpenAI Embeddings** component generates embeddings from the user input.
+* **Astra DB Search** component retrieves the most relevant Records from the Astra DB database.
+* **Text Output** component turns the Records into Text by concatenating them and also displays it in the Playground.
+* **Prompt** component takes in the user input and the retrieved Records as text and builds a prompt for the OpenAI model.
+* **OpenAI** component generates a response to the prompt.
+* **Chat Output** component displays the response in the Playground.
+
+4. To create an environment variable for the **OpenAI** component, in the **OpenAI API Key** field, click the **Globe** button, and then click **Add New Variable**.
+    1. In the **Variable Name** field, enter `openai_api_key`.
+    2. In the **Value** field, paste your OpenAI API Key (`sk-...`).
+    3. Click **Save Variable**.
+
+4. To create environment variables for the **Astra DB** and **Astra DB Search** components:
+    1. In the **Token** field, click the **Globe** button, and then click **Add New Variable**.
+    2. In the **Variable Name** field, enter `astra_token`.
+    3. In the **Value** field, paste your Astra application token (`AstraCS:WSnyFUhRxsrg…​`).
+    4. Click **Save Variable**.
+    5. Repeat the above steps for the **API Endpoint** field, pasting your Astra API Endpoint instead (`https://ASTRA_DB_ID-ASTRA_DB_REGION.apps.astra.datastax.com`).
+    6. Add the global variable to both the **Astra DB** and **Astra DB Search** components.
+
+## Run the vector store RAG flow
+
+1. Click the **Playground** button.
+The **Playground** opens, where you can chat with your data.
+2. Type a message and press Enter. (Try something like "What topics do you know about?")
+3. The bot will respond with a summary of the data you've embedded.
+
+For example, we embedded a PDF of an engine maintenance manual and asked, "How do I change the oil?"
+The bot responds:
+```
+To change the oil in the engine, follow these steps:
+
+Make sure the engine is turned off and cool before starting.
+
+Locate the oil drain plug on the bottom of the engine.
+
+Place a drain pan underneath the oil drain plug to catch the old oil...
+```
+
+We can also get more specific:
+
+```
+User
+What size wrench should I use to remove the oil drain cap?
+
+AI
+You should use a 3/8 inch wrench to remove the oil drain cap.
+```
+
+This is the size the engine manual lists as well. This confirms our flow works, because the query returns the unique knowledge we embedded from the Astra vector store.
+
+
+
+
diff --git a/docs/sidebars.js b/docs/sidebars.js
@@ -29,7 +29,7 @@ module.exports = {
         "starter-projects/blog-writer",
         "starter-projects/document-qa",
         "starter-projects/memory-chatbot",
-        "tutorials/rag-with-astradb"
+        "starter-projects/vector-store-rag"
       ],
     },
     {
diff --git a/docs/static/data/AstraDB-RAG-Flows.json b/docs/static/data/AstraDB-RAG-Flows.json
@@ -3396,7 +3396,7 @@
             "zoom": 0.2687057134854984
         }
     },
-    "description": "Visit https://pre-release.langflow.org/guides/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
+    "description": "Visit https://pre-release.langflow.org/tutorials/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
     "name": "Vector Store RAG",
     "last_tested_version": "1.0.0a0",
     "is_component": false
diff --git a/docs/static/img/vector-store-rag.png b/docs/static/img/vector-store-rag.png
diff --git a/src/backend/base/langflow/initial_setup/starter_projects/VectorStore-RAG-Flows.json b/src/backend/base/langflow/initial_setup/starter_projects/VectorStore-RAG-Flows.json
@@ -3403,7 +3403,7 @@
       "zoom": 0.2687057134854984
     }
   },
-  "description": "Visit https://pre-release.langflow.org/guides/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
+  "description": "Visit https://pre-release.langflow.org/tutorials/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
   "name": "Vector Store RAG",
   "last_tested_version": "1.0.0a0",
   "is_component": false

Original file line number	Diff line number	Diff line change
`@@ -29,7 +29,7 @@ module.exports = {`
`29`	`29`	`"starter-projects/blog-writer",`
`30`	`30`	`"starter-projects/document-qa",`
`31`	`31`	`"starter-projects/memory-chatbot",`
`32`		`- "tutorials/rag-with-astradb"`
	`32`	`+ "starter-projects/vector-store-rag"`
`33`	`33`	`],`
`34`	`34`	`},`
`35`	`35`	`{`
Original file line number	Diff line number	Diff line change
`@@ -3396,7 +3396,7 @@`
`3396`	`3396`	`"zoom": 0.2687057134854984`
`3397`	`3397`	`}`
`3398`	`3398`	`},`
`3399`		- "description": "Visit https://pre-release.langflow.org/guides/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
	`3399`	+ "description": "Visit https://pre-release.langflow.org/tutorials/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
`3400`	`3400`	`"name": "Vector Store RAG",`
`3401`	`3401`	`"last_tested_version": "1.0.0a0",`
`3402`	`3402`	`"is_component": false`
Original file line number	Diff line number	Diff line change
`@@ -3403,7 +3403,7 @@`
`3403`	`3403`	`"zoom": 0.2687057134854984`
`3404`	`3404`	`}`
`3405`	`3405`	`},`
`3406`		- "description": "Visit https://pre-release.langflow.org/guides/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
	`3406`	+ "description": "Visit https://pre-release.langflow.org/tutorials/rag-with-astradb for a detailed guide of this project.\nThis project give you both Ingestion and RAG in a single file. You'll need to visit https://astra.datastax.com/ to create an Astra DB instance, your Token and grab an API Endpoint.\nRunning this project requires you to add a file in the Files component, then define a Collection Name and click on the Play icon on the Astra DB component. \n\nAfter the ingestion ends you are ready to click on the Run button at the lower left corner and start asking questions about your data.",
`3407`	`3407`	`"name": "Vector Store RAG",`
`3408`	`3408`	`"last_tested_version": "1.0.0a0",`
`3409`	`3409`	`"is_component": false`