adding vertex endpoint embedding (run-llama#16351)

jzhao62 · Oct 4, 2024 · d712a87 · d712a87
1 parent 5130aa5
commit d712a87
Show file tree

Hide file tree

Showing 13 changed files with 685 additions and 0 deletions.
diff --git a/docs/docs/examples/embeddings/vertex_embedding_endpoint.ipynb b/docs/docs/examples/embeddings/vertex_embedding_endpoint.ipynb
@@ -0,0 +1,252 @@
+{
+ "cells": [
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "<a href=\"https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/embeddings/sagemaker_embedding_endpoint.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Interacting with Embeddings deployed in Vertex AI Endpoint with LlamaIndex\n",
+    "\n",
+    "A Vertex AI endpoint is a managed resource that enables the deployment of machine learning models, such as embeddings, for making predictions on new data.\n",
+    "\n",
+    "This notebook demonstrates how to interact with embedding endpoints using the `VertexEndpointEmbedding` class, leveraging the LlamaIndex."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Setting Up\n",
+    "If you’re opening this Notebook on colab, you will probably need to install LlamaIndex 🦙. "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install llama-index-embeddings-vertex-endpoint"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "! pip install llama-index"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You need to specify the endpoint information (endpoint ID, project ID, and region) to interact with the model deployed in Vertex AI."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "ENDPOINT_ID = \"<-YOUR-ENDPOINT-ID->\"\n",
+    "PROJECT_ID = \"<-YOUR-PROJECT-ID->\"\n",
+    "LOCATION = \"<-YOUR-GCP-REGION->\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Credentials should be provided to connect to the endpoint. You can either:\n",
+    "\n",
+    "- Use a service account JSON file by specifying the `service_account_file` parameter.\n",
+    "- Provide the service account information directly through the `service_account_info` parameter."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Example using a service account file:**"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from llama_index.embeddings.vertex_endpoint import VertexEndpointEmbedding\n",
+    "\n",
+    "SERVICE_ACCOUNT_FILE = \"<-YOUR-SERVICE-ACCOUNT-FILE-PATH->.json\"\n",
+    "\n",
+    "embed_model = VertexEndpointEmbedding(\n",
+    "    endpoint_id=ENDPOINT_ID,\n",
+    "    project_id=PROJECT_ID,\n",
+    "    location=LOCATION,\n",
+    "    service_account_file=SERVICE_ACCOUNT_FILE,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "**Example using direct service account info:**:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from llama_index.embeddings.vertex_endpoint import VertexEndpointEmbedding\n",
+    "\n",
+    "SERVICE_ACCOUNT_INFO = {\n",
+    "    \"private_key\": \"<-PRIVATE-KEY->\",\n",
+    "    \"client_email\": \"<-SERVICE-ACCOUNT-EMAIL->\",\n",
+    "    \"token_uri\": \"https://oauth2.googleapis.com/token\",\n",
+    "}\n",
+    "\n",
+    "embed_model = VertexEndpointEmbedding(\n",
+    "    endpoint_id=ENDPOINT_ID,\n",
+    "    project_id=PROJECT_ID,\n",
+    "    location=LOCATION,\n",
+    "    service_account_info=SERVICE_ACCOUNT_INFO,\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Basic Usage"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Call `get_text_embedding` "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "embeddings = embed_model.get_text_embedding(\n",
+    "    \"Vertex AI is a managed machine learning (ML) platform provided by Google Cloud. It allows data scientists and developers to build, deploy, and scale machine learning models efficiently, leveraging Google's ML infrastructure.\"\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "[0.011612358,\n",
+       " 0.01030837,\n",
+       " -0.04710829,\n",
+       " -0.030719217,\n",
+       " 0.027658276,\n",
+       " -0.031597693,\n",
+       " 0.012065322,\n",
+       " -0.037609763,\n",
+       " 0.02321099,\n",
+       " 0.012868305]"
+      ]
+     },
+     "execution_count": null,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "embeddings[:10]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Call `get_text_embedding_batch` "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "embeddings = embed_model.get_text_embedding_batch(\n",
+    "    [\n",
+    "        \"Vertex AI is a managed machine learning (ML) platform provided by Google Cloud. It allows data scientists and developers to build, deploy, and scale machine learning models efficiently, leveraging Google's ML infrastructure.\",\n",
+    "        \"Vertex is integrated with llamaIndex\",\n",
+    "    ]\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "2"
+      ]
+     },
+     "execution_count": null,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "len(embeddings)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}