Skip to content

Commit

Permalink
Multimodal summarizer notebook (#109)
Browse files Browse the repository at this point in the history
* Multimodal summarizer notebook

* Modified readme

* removed info() method
  • Loading branch information
mogith-pn authored Jan 15, 2025
1 parent 15da800 commit 1f227e4
Show file tree
Hide file tree
Showing 2 changed files with 150 additions and 1 deletion.
149 changes: 149 additions & 0 deletions Data_Utils/Ingestion pipelines/Multimodal_ingest_RAG_notebook.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,149 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data ingestion notebook for multimodal ingest\n",
"In this notebook, we are taking a PDF consists of image and texts and ingesting into clarifai app with ready to do multimodal RAG over it."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"id": "zyF2oVc-txve"
},
"outputs": [],
"source": [
"import os\n",
"import sys\n",
"os.environ['CLARIFAI_PAT'] = 'YOUR_PAT'"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "gduNHxvVxCLJ"
},
"outputs": [],
"source": [
"!pip install clarifai-datautils\n",
"!pip install 'unstructured[pdf] @ git+https://github.com/clarifai/unstructured.git@support_clarifai_model'\n",
"!pip install opencv-python-headless"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "L0WnHvyT_dR3"
},
"outputs": [],
"source": [
"!sudo apt-get update\n",
"!sudo apt-get install -y poppler-utils tesseract-ocr libgl1-mesa-glx"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "XDNenafv_qej",
"outputId": "88fae3d3-ee3b-4005-f3f1-891d44db6d7a"
},
"outputs": [],
"source": [
"import nltk\n",
"nltk.download('punkt_tab')\n",
"nltk.download('averaged_perceptron_tagger_eng')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Initialise the Pipeline and extractors\n",
"- By default the ImageSummarizer uses Qwen-VL Vision model for the task. If you want to use different model like **OpenAI gpt-4o**, you can use this URL - https://clarifai.com/openai/chat-completion/models/gpt-4o\n",
"- Check out the community page for more models - [Vision models](https://clarifai.com/explore/models?page=1&perPage=24&filterData=%5B%7B%22field%22%3A%22model_type_id%22%2C%22value%22%3A%5B%22multimodal-to-text%22%5D%7D%5D)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "GcMC5u6Jtxve",
"outputId": "c0633793-e0dc-49ea-a130-af557175f320"
},
"outputs": [],
"source": [
"from clarifai_datautils.multimodal import Pipeline\n",
"from clarifai_datautils.multimodal.pipeline.cleaners import Clean_extra_whitespace\n",
"from clarifai_datautils.multimodal.pipeline.PDF import PDFPartitionMultimodal\n",
"from clarifai_datautils.multimodal.pipeline.summarizer import ImageSummarizer\n",
"\n",
"# Define the pipeline\n",
"pipeline = Pipeline(\n",
" name='pipeline-1',\n",
" transformations=[\n",
" PDFPartitionMultimodal(chunking_strategy = \"by_title\",max_characters = 1024),\n",
" Clean_extra_whitespace(),\n",
" ImageSummarizer(model_url = \"CLARIFAI_MODEL_URL\") # Replace CLARIFAI_MODEL_URL with the model URL\n",
" ]\n",
")\n",
"pipeline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 420
},
"id": "hK3zrrbltxvg",
"outputId": "6162d48c-1c32-460e-be18-e6a11b9e0571"
},
"outputs": [],
"source": [
"# Using SDK to upload\n",
"from clarifai.client import Dataset\n",
"\n",
"dataset = Dataset(url='YOUR_DATASET_URL', pat=os.environ['CLARIFAI_PAT'])\n",
"dataset.upload_dataset(pipeline.run(files=\"YOUR_FILE\", loader=True))"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"kernelspec": {
"display_name": "3.9.10",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -91,7 +91,7 @@ dataset.upload_dataset(task="text_clf", split="train", module_dir="path_to_imdb_
| Image | [Image Annotation ](Data_Utils/Image%20Annotation/) | [annotation loader](Data_Utils/Image%20Annotation/image_annotation_loader.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Image%20Annotation/image_annotation_loader.ipynb) |
| Multimodal | [Data Ingestion ](Data_Utils/Ingestion%20pipelines/) | [Ready to Use Pipelines](Data_Utils/Ingestion%20pipelines/Ready_to_use_foundational_pipelines.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Ingestion%20pipelines/Ready_to_use_foundational_pipelines.ipynb) |
| | [Data Ingestion ](Data_Utils/Ingestion%20pipelines/) | [Multimodal Ingestion](Data_Utils/Ingestion%20pipelines/Multimodal_dataloader.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Ingestion%20pipelines/Multimodal_dataloader.ipynb) |

| | [Data Ingestion ](Data_Utils/Ingestion%20pipelines/) |[Advanced Multimodal Ingestion with summarizer](Data_Utils/Ingestion%20pipelines/Multimodal_ingest_RAG_notebook.ipynb) | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Clarifai/examples/blob/main/Data_Utils/Ingestion%20pipelines/Multimodal_ingest_RAG_notebook.ipynb) |

## Model upload examples

Expand Down

0 comments on commit 1f227e4

Please sign in to comment.