Multimodal summarizer notebook (#109)

* Multimodal summarizer notebook * Modified readme * removed info() method
Clarifai · Jan 15, 2025 · 1f227e4 · 1f227e4
1 parent 15da800
commit 1f227e4
Show file tree

Hide file tree

Showing 2 changed files with 150 additions and 1 deletion.
diff --git a/Data_Utils/Ingestion pipelines/Multimodal_ingest_RAG_notebook.ipynb b/Data_Utils/Ingestion pipelines/Multimodal_ingest_RAG_notebook.ipynb
@@ -0,0 +1,149 @@
+{
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "## Data ingestion notebook for multimodal ingest\n",
+        "In this notebook, we are taking a PDF consists of image and texts and ingesting into clarifai app with ready to do multimodal RAG over it."
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": 18,
+      "metadata": {
+        "id": "zyF2oVc-txve"
+      },
+      "outputs": [],
+      "source": [
+        "import os\n",
+        "import sys\n",
+        "os.environ['CLARIFAI_PAT'] = 'YOUR_PAT'"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "gduNHxvVxCLJ"
+      },
+      "outputs": [],
+      "source": [
+        "!pip install clarifai-datautils\n",
+        "!pip install 'unstructured[pdf] @ git+https://github.com/clarifai/unstructured.git@support_clarifai_model'\n",
+        "!pip install opencv-python-headless"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "id": "L0WnHvyT_dR3"
+      },
+      "outputs": [],
+      "source": [
+        "!sudo apt-get update\n",
+        "!sudo apt-get install -y poppler-utils tesseract-ocr libgl1-mesa-glx"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "XDNenafv_qej",
+        "outputId": "88fae3d3-ee3b-4005-f3f1-891d44db6d7a"
+      },
+      "outputs": [],
+      "source": [
+        "import nltk\n",
+        "nltk.download('punkt_tab')\n",
+        "nltk.download('averaged_perceptron_tagger_eng')"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "### Initialise the Pipeline and extractors\n",
+        "- By default the ImageSummarizer uses Qwen-VL Vision model for the task. If you want to use different model like **OpenAI gpt-4o**, you can use this URL - https://clarifai.com/openai/chat-completion/models/gpt-4o\n",
+        "- Check out the community page for more models - [Vision models](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22multimodal-to-text%22%5D%7D%5D)"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/"
+        },
+        "id": "GcMC5u6Jtxve",
+        "outputId": "c0633793-e0dc-49ea-a130-af557175f320"
+      },
+      "outputs": [],
+      "source": [
+        "from clarifai_datautils.multimodal import Pipeline\n",
+        "from clarifai_datautils.multimodal.pipeline.cleaners import Clean_extra_whitespace\n",
+        "from clarifai_datautils.multimodal.pipeline.PDF import PDFPartitionMultimodal\n",
+        "from clarifai_datautils.multimodal.pipeline.summarizer import ImageSummarizer\n",
+        "\n",
+        "# Define the pipeline\n",
+        "pipeline = Pipeline(\n",
+        "    name='pipeline-1',\n",
+        "    transformations=[\n",
+        "        PDFPartitionMultimodal(chunking_strategy = \"by_title\",max_characters = 1024),\n",
+        "        Clean_extra_whitespace(),\n",
+        "        ImageSummarizer(model_url = \"CLARIFAI_MODEL_URL\") # Replace CLARIFAI_MODEL_URL with the model URL\n",
+        "    ]\n",
+        ")\n",
+        "pipeline"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "execution_count": null,
+      "metadata": {
+        "colab": {
+          "base_uri": "https://localhost:8080/",
+          "height": 420
+        },
+        "id": "hK3zrrbltxvg",
+        "outputId": "6162d48c-1c32-460e-be18-e6a11b9e0571"
+      },
+      "outputs": [],
+      "source": [
+        "# Using SDK to upload\n",
+        "from clarifai.client import Dataset\n",
+        "\n",
+        "dataset = Dataset(url='YOUR_DATASET_URL', pat=os.environ['CLARIFAI_PAT'])\n",
+        "dataset.upload_dataset(pipeline.run(files=\"YOUR_FILE\", loader=True))"
+      ]
+    }
+  ],
+  "metadata": {
+    "colab": {
+      "provenance": []
+    },
+    "kernelspec": {
+      "display_name": "3.9.10",
+      "language": "python",
+      "name": "python3"
+    },
+    "language_info": {
+      "codemirror_mode": {
+        "name": "ipython",
+        "version": 3
+      },
+      "file_extension": ".py",
+      "mimetype": "text/x-python",
+      "name": "python",
+      "nbconvert_exporter": "python",
+      "pygments_lexer": "ipython3",
+      "version": "3.9.10"
+    }
+  },
+  "nbformat": 4,
+  "nbformat_minor": 0
+}
diff --git a/README.md b/README.md
@@ -91,7 +91,7 @@ dataset.upload_dataset(task="text_clf", split="train", module_dir="path_to_imdb_
 | Image   | [Image Annotation ](Data_Utils/Image%20Annotation/)   | [annotation loader](Data_Utils/Image%20Annotation/image_annotation_loader.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Image%20Annotation/image_annotation_loader.ipynb) |
 | Multimodal   | [Data Ingestion ](Data_Utils/Ingestion%20pipelines/)   | [Ready to Use Pipelines](Data_Utils/Ingestion%20pipelines/Ready_to_use_foundational_pipelines.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Ingestion%20pipelines/Ready_to_use_foundational_pipelines.ipynb) |
 |     | [Data Ingestion ](Data_Utils/Ingestion%20pipelines/)   | [Multimodal Ingestion](Data_Utils/Ingestion%20pipelines/Multimodal_dataloader.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Ingestion%20pipelines/Multimodal_dataloader.ipynb) |
-
+| | [Data Ingestion ](Data_Utils/Ingestion%20pipelines/) |[Advanced Multimodal Ingestion with summarizer](Data_Utils/Ingestion%20pipelines/Multimodal_ingest_RAG_notebook.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Ingestion%20pipelines/Multimodal_ingest_RAG_notebook.ipynb) |
 
 ## Model upload examples