
Commit f14e7c7

Merge branch 'main' into UN-2717-Sigterm-handling-in-tool-sidecar
2 parents d2ae867 + a9b090a commit f14e7c7

84 files changed (+8035, -3630 lines)


README.md

Lines changed: 29 additions & 20 deletions
```diff
@@ -3,9 +3,8 @@
 
 # Unstract
 
-## No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
+## The Data Layer for your Agentic Workflows—Automate Document-based workflows with close to 100% accuracy!
 
-##
 
 ![Python Version from PEP 621 TOML](https://img.shields.io/python/required-version-toml?tomlFilePath=https%3A%2F%2Fraw.githubusercontent.com%2FZipstack%2Funstract%2Frefs%2Fheads%2Fmain%2Fpyproject.toml)
 [![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
@@ -23,26 +22,44 @@
 
 ## 🤖 Prompt Studio
 
-Prompt Studio's primary reason for existence is so you can develop the necessary prompts for document data extraction super efficiently. It is a purpose-built environment that makes this not just easy for you—but, lot of fun! The document sample, its variants, the prompts you're developing, outputs from different LLMs, the schema you're developing, costing details of the extraction and various tools that let you measure the effectiveness of your prompts are just a click away and easily accessible. Prompt Studio is designed for effective and high speed development and iteration of prompts for document data extraction. Welcome to IDP 2.0!
-
+Prompt Studio is a purpose-built environment that supercharges your schema definition efforts. Compare outputs from different LLMs side-by-side, keep tab on costs while you develop generic prompts that work across wide-ranging document variations. And when you're ready, launch extraction APIs with a single click.
 
 ![img Prompt Studio](docs/assets/prompt_studio.png)
 
-## 🧘‍♀️ Three step nirvana with Workflow Studio
+## 🔌 Integrations that suit your environment
+
+Once you've used Prompt Studio to define your schema, Unstract makes it easy to integrate into your existing workflows. Simply choose the integration type that best fits your environment:
+
+| Integration Type | Description | Best For | Documentation |
+|------------------|-------------|----------|---------------|
+| 🖥️ **MCP Servers** | Run Unstract as an MCP Server to provide structured data extraction to Agents or LLMs in your ecosystem. | Developers building **Agentic/LLM apps/tools** that speak MCP. | [Unstract MCP Server Docs](https://docs.unstract.com/unstract/unstract_platform/mcp/unstract_platform_mcp_server/) |
+| 🌐 **API Deployments** | Turn any document into JSON with an API call. Deploy any Prompt Studio project as a REST API endpoint with a single click. | Teams needing **programmatic access** in apps, services, or custom tooling. | [API Deployment Docs](https://docs.unstract.com/unstract/unstract_platform/api_deployment/unstract_api_deployment_intro/) |
+| ⚙️ **ETL Pipelines** | Embed Unstract directly into your ETL jobs to transform unstructured data before loading it into your warehouse / database. | **Engineering and Data engineering teams** that need to batch process documents into clean JSON. | [ETL Pipelines Docs](https://docs.unstract.com/unstract/unstract_platform/etl_pipeline/unstract_etl_pipeline_intro/) |
+| 🧩 **n8n Nodes** | Use Unstract as ready-made nodes in n8n workflows for drag-and-drop automation. | **Low-code users** and **ops teams** automating workflows. | [Unstract n8n Nodes Docs](https://docs.unstract.com/unstract/unstract_platform/api_deployment/unstract_api_deployment_n8n_custom_node/) |
+
+## ☁️ Getting Started (Cloud / Enterprise)
 
-Automate critical business processes that involve complex documents with a human in the loop. Go beyond RPA with the power of Large Language Models.
+The easy-peasy way to try Unstract is to [sign up for a **14-day free trial**](https://unstract.com/start-for-free/). Give Unstract a spin now!
 
-🌟 **Step 1**: Add documents to no-code Prompt Studio and do prompt engineering to extract required fields <br>
-🌟 **Step 2**: Configure Prompt Studio project as API deployment or configure input source and output destination for ETL Pipeline<br>
-🌟 **Step 3**: Deploy Workflows as unstructured data APIs or unstructured data ETL Pipelines!
+Unstract Cloud also comes with some really awesome features that give serious accuracy boosts to agentic/LLM-powered document-centric workflows in the enterprise.
 
-![img Using Unstract](docs/assets/Using_Unstract.png)
+| Feature | Description | Documentation |
+|---------|-------------|---------------|
+| 🧪 **LLMChallenge** | Uses two Large Language Models to ensure trustworthy output. You either get the right response or no response at all. | [Docs](https://docs.unstract.com/unstract/unstract_platform/features/llm_challenge/llm_challenge_intro/) |
+|**SinglePass Extraction** | Reduces LLM token usage by up to **8x**, dramatically cutting costs. | [Docs](https://docs.unstract.com/unstract/editions/cloud_edition/#singlepass-extraction) |
+| 📉 **SummarizedExtraction** | Reduces LLM token usage by up to **6x**, saving costs while keeping accuracy. | [Docs](https://docs.unstract.com/unstract/unstract_platform/features/summarized_extraction/summarized_extraction_intro/) |
+| 👀 **Human-In-The-Loop** | Side-by-side comparison of extracted value and source document, with highlighting for human review and tweaking. | [Docs](https://docs.unstract.com/unstract/unstract_platform/human_quality_review/human_quality_review_intro/) |
+| 🔐 **SSO Support** | Enterprise-ready authentication options for seamless onboarding and off-boarding. | [Docs](https://docs.unstract.com/unstract/editions/cloud_edition/#enterprise-features) |
+
+## ⏩ Quick Start Guide
+
+Unstract comes well documented. You can get introduced to the [basics of Unstract](https://docs.unstract.com/unstract/), and [learn how to connect](https://docs.unstract.com/unstract/unstract_platform/setup_accounts/whats_needed) various systems like LLMs, Vector Databases, Embedding Models and Text Extractors to it. The easiest way to wet your feet is to go through our [Quick Start Guide](https://docs.unstract.com/unstract/unstract_platform/quick_start) where you actually get to do some prompt engineering in Prompt Studio and launch an API to structure varied credit card statements!
 
-## 🚀 Getting started
+## 🚀 Getting started (self-hosted)
 
 ### System Requirements
 
-- 8GB RAM (recommended)
+- 8GB RAM (minimum)
 
 ### Prerequisites
 
@@ -57,7 +74,6 @@ Next, either download a release or clone this repo and do the following:
 ✅ Now visit [http://frontend.unstract.localhost](http://frontend.unstract.localhost) in your browser <br>
 ✅ Use username and password `unstract` to login
 
-
 That's all there is to it!
 
 Follow [these steps](backend/README.md#authentication) to change the default username and password.
@@ -93,10 +109,6 @@ Unstract supports a wide range of file formats for document processing:
 | | TIFF | Tagged Image File Format |
 | | WEBP | Web Picture Format |
 
-## ⏩ Quick Start Guide
-
-Unstract comes well documented. You can get introduced to the [basics of Unstract](https://docs.unstract.com/unstract/), and [learn how to connect](https://docs.unstract.com/unstract/unstract_platform/setup_accounts/whats_needed) various systems like LLMs, Vector Databases, Embedding Models and Text Extractors to it. The easiest way to wet your feet is to go through our [Quick Start Guide](https://docs.unstract.com/unstract/unstract_platform/quick_start) where you actually get to do some prompt engineering in Prompt Studio and launch an API to structure varied credit card statements!
-
 ## 🤝 Ecosystem support
 
 ### LLM Providers
@@ -113,7 +125,6 @@ Unstract comes well documented. You can get introduced to the [basics of Unstrac
 | <img src="docs/assets/3rd_party/anyscale.png" width="32"/> | Anyscale | ✅ Working |
 | <img src="docs/assets/3rd_party/mistral_ai.png" width="32"/> | Mistral AI | ✅ Working |
 
-
 ### Vector Databases
 
 || Provider | Status |
@@ -124,8 +135,6 @@ Unstract comes well documented. You can get introduced to the [basics of Unstrac
 |<img src="docs/assets/3rd_party/postgres.png" width="32"/>| PostgreSQL | ✅ Working |
 |<img src="docs/assets/3rd_party/milvus.png" width="32"/>| Milvus | ✅ Working |
 
-
-
 ### Embeddings
 
 || Provider | Status |
```

backend/account_v2/templates/login.html

Lines changed: 2 additions & 4 deletions
```diff
@@ -94,8 +94,7 @@
     .logo-box{
       width: 100%;
       text-align: center;
-      margin-bottom: 20px;
-      margin-top: 20px;
+      margin: 20px 0;
     }
     .login-heading{
       font-size: 24px;
@@ -109,9 +108,8 @@
   <!-- Spinner animation -->
   <div class="lds-dual-ring"></div>
 </div>
-{% load static %}
 <div class="logo-box">
-  <img src="{% static 'logo.svg' %}" alt="My image">
+  <img src="/icons/logo.svg" alt="Unstract Logo">
 </div>
 <h2 class="login-heading">Login</h2>
 {% if error_message %}
```

backend/api_v2/api_deployment_views.py

Lines changed: 2 additions & 0 deletions
```diff
@@ -70,6 +70,7 @@ def post(
         tag_names = serializer.validated_data.get(ApiExecution.TAGS)
         llm_profile_id = serializer.validated_data.get(ApiExecution.LLM_PROFILE_ID)
         hitl_queue_name = serializer.validated_data.get(ApiExecution.HITL_QUEUE_NAME)
+        custom_data = serializer.validated_data.get(ApiExecution.CUSTOM_DATA)
 
         if presigned_urls:
             DeploymentHelper.load_presigned_files(presigned_urls, file_objs)
@@ -85,6 +86,7 @@ def post(
             tag_names=tag_names,
             llm_profile_id=llm_profile_id,
             hitl_queue_name=hitl_queue_name,
+            custom_data=custom_data,
            request_headers=dict(request.headers),
        )
        if "error" in response and response["error"]:
```

backend/api_v2/constants.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -11,3 +11,4 @@ class ApiExecution:
     LLM_PROFILE_ID: str = "llm_profile_id"
     HITL_QUEUE_NAME: str = "hitl_queue_name"
     PRESIGNED_URLS: str = "presigned_urls"
+    CUSTOM_DATA: str = "custom_data"
```

backend/api_v2/deployment_helper.py

Lines changed: 3 additions & 0 deletions
```diff
@@ -155,6 +155,7 @@ def execute_workflow(
         tag_names: list[str] = [],
         llm_profile_id: str | None = None,
         hitl_queue_name: str | None = None,
+        custom_data: dict[str, Any] | None = None,
         request_headers=None,
     ) -> ReturnDict:
         """Execute workflow by api.
@@ -168,6 +169,7 @@ def execute_workflow(
             tag_names (list(str)): list of tag names
             llm_profile_id (str, optional): LLM profile ID for overriding tool settings
             hitl_queue_name (str, optional): Custom queue name for manual review
+            custom_data (dict[str, Any], optional): JSON data for custom_data variable replacement in prompts
 
         Returns:
             ReturnDict: execution status/ result
@@ -234,6 +236,7 @@ def execute_workflow(
             use_file_history=use_file_history,
             llm_profile_id=llm_profile_id,
             hitl_queue_name=hitl_queue_name,
+            custom_data=custom_data,
        )
        result.status_api = DeploymentHelper.construct_status_endpoint(
            api_endpoint=api.api_endpoint, execution_id=execution_id
```
backend/api_v2/serializers.py

Lines changed: 28 additions & 15 deletions
```diff
@@ -195,21 +195,23 @@ class ExecutionRequestSerializer(TagParamsSerializer):
     """Execution request serializer.
 
     Attributes:
-        timeout (int): Timeout for the API deployment, maximum value can be 300s.
-            If -1 it corresponds to async execution. Defaults to -1
-        include_metadata (bool): Flag to include metadata in API response
-        include_metrics (bool): Flag to include metrics in API response
-        use_file_history (bool): Flag to use FileHistory to save and retrieve
-            responses quickly. This is undocumented to the user and can be
-            helpful for demos.
-        tags (str): Comma-separated List of tags to associate with the execution.
-            e.g:'tag1,tag2-name,tag3_name'
-        llm_profile_id (str): UUID of the LLM profile to override the default profile.
-            If not provided, the default profile will be used.
-        hitl_queue_name (str, optional): Document class name for manual review queue.
-            If not provided, uses API name as document class.
-        presigned_urls (list): List of presigned URLs to fetch files from.
-            URLs are validated for HTTPS and S3 endpoint requirements.
+        timeout (int): Timeout for the API deployment, maximum value can be 300s.
+            If -1 it corresponds to async execution. Defaults to -1
+        include_metadata (bool): Flag to include metadata in API response
+        include_metrics (bool): Flag to include metrics in API response
+        use_file_history (bool): Flag to use FileHistory to save and retrieve
+            responses quickly. This is undocumented to the user and can be
+            helpful for demos.
+        tags (str): Comma-separated List of tags to associate with the execution.
+            e.g:'tag1,tag2-name,tag3_name'
+        llm_profile_id (str): UUID of the LLM profile to override the default profile.
+            If not provided, the default profile will be used.
+        hitl_queue_name (str, optional): Document class name for manual review queue.
+            If not provided, uses API name as document class.
+        presigned_urls (list): List of presigned URLs to fetch files from.
+            URLs are validated for HTTPS and S3 endpoint requirements.
+        custom_data (dict, optional): User-provided data for variable replacement in prompts.
+            Can be accessed in prompts using {{custom_data.key}} syntax for dot notation traversal.
     """
 
     MAX_FILES_ALLOWED = 32
@@ -224,6 +226,7 @@ class ExecutionRequestSerializer(TagParamsSerializer):
     presigned_urls = ListField(child=URLField(), required=False)
     llm_profile_id = CharField(required=False, allow_null=True, allow_blank=True)
     hitl_queue_name = CharField(required=False, allow_null=True, allow_blank=True)
+    custom_data = JSONField(required=False, allow_null=True)
 
     def validate_hitl_queue_name(self, value: str | None) -> str | None:
         """Validate queue name format using enterprise validation if available."""
@@ -244,6 +247,16 @@ def validate_hitl_queue_name(self, value: str | None) -> str | None:
             )
         return value
 
+    def validate_custom_data(self, value):
+        """Validate custom_data is a valid JSON object."""
+        if value is None:
+            return value
+
+        if not isinstance(value, dict):
+            raise ValidationError("custom_data must be a JSON object")
+
+        return value
+
     files = ListField(
         child=FileField(),
         required=False,
```
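
To make the documented `{{custom_data.key}}` behaviour concrete, here is a toy resolver for the dot-notation idea. This is not the platform's implementation (the commit only adds the serializer field, validator, and plumbing shown above); it is a self-contained sketch of how such placeholders could be expanded against a dict that `validate_custom_data` accepts.

```python
# Toy resolver for the "{{custom_data.key}}" dot-notation idea described in
# the docstring above. NOT the platform's implementation; illustration only.
import re
from typing import Any


def resolve_custom_data(prompt: str, custom_data: dict[str, Any]) -> str:
    pattern = re.compile(r"\{\{\s*custom_data\.([A-Za-z0-9_.]+)\s*\}\}")

    def _lookup(match: re.Match) -> str:
        node: Any = custom_data
        for part in match.group(1).split("."):
            if not isinstance(node, dict) or part not in node:
                return match.group(0)  # leave unknown placeholders untouched
            node = node[part]
        return str(node)

    return pattern.sub(_lookup, prompt)


print(
    resolve_custom_data(
        "Summarize spend for {{custom_data.customer.name}} in {{custom_data.fiscal_year}}.",
        {"customer": {"name": "Acme Corp"}, "fiscal_year": 2024},
    )
)
# -> Summarize spend for Acme Corp in 2024.
```

Unknown keys are left untouched in this sketch so a typo in a prompt stays visible rather than being silently blanked; the real behaviour in Unstract may differ.
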

backend/prompt_studio/prompt_studio_registry_v2/prompt_studio_registry_helper.py

Lines changed: 2 additions & 4 deletions
```diff
@@ -125,8 +125,7 @@ def get_tool_by_prompt_registry_id(
         # Suppress all exceptions to allow processing
         except Exception as e:
             logger.warning(
-                "Error while fetching for prompt registry "
-                f"ID {prompt_registry_id}: {e} "
+                f"Error while fetching for prompt registry ID {prompt_registry_id}: {e} "
             )
             return None
         return Tool(
@@ -215,8 +214,7 @@ def update_or_create_psr_tool(
             return obj
         except IntegrityError as error:
             logger.error(
-                "Integrity Error - Error occurred while "
-                f"exporting custom tool : {error}"
+                f"Integrity Error - Error occurred while exporting custom tool : {error}"
             )
             raise ToolSaveError
 
```
backend/pyproject.toml

Lines changed: 1 addition & 1 deletion
```diff
@@ -37,7 +37,7 @@ dependencies = [
     "social-auth-core==4.4.2", # For OAuth
     # TODO: Temporarily removing the extra dependencies of aws and gcs from unstract-sdk
     # to resolve lock file. Will have to be re-looked into
-    "unstract-sdk[azure]~=0.77.1",
+    "unstract-sdk[azure]~=0.77.3",
     "gcsfs==2024.10.0",
     "s3fs==2024.10.0",
     "azure-identity==1.16.0",
```

backend/sample.env

Lines changed: 2 additions & 2 deletions
```diff
@@ -78,9 +78,9 @@ PROMPT_STUDIO_FILE_PATH=/app/prompt-studio-data
 
 # Structure Tool Image (Runs prompt studio exported tools)
 # https://hub.docker.com/r/unstract/tool-structure
-STRUCTURE_TOOL_IMAGE_URL="docker:unstract/tool-structure:0.0.86"
+STRUCTURE_TOOL_IMAGE_URL="docker:unstract/tool-structure:0.0.88"
 STRUCTURE_TOOL_IMAGE_NAME="unstract/tool-structure"
-STRUCTURE_TOOL_IMAGE_TAG="0.0.86"
+STRUCTURE_TOOL_IMAGE_TAG="0.0.88"
 
 # Feature Flags
 EVALUATION_SERVER_IP=unstract-flipt
```
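
Both variables above must point at the same tool-structure release, which is exactly what this bump keeps in sync. A trivial, purely illustrative check (assuming values from a `.env` derived from `sample.env` are loaded into the process environment):

```python
import os

# Assumes a .env derived from sample.env has been exported into the environment.
url = os.environ["STRUCTURE_TOOL_IMAGE_URL"]  # e.g. docker:unstract/tool-structure:0.0.88
tag = os.environ["STRUCTURE_TOOL_IMAGE_TAG"]  # e.g. 0.0.88

assert url.endswith(f":{tag}"), "IMAGE_URL and IMAGE_TAG should reference the same version"
```
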

backend/tool_instance_v2/views.py

Lines changed: 1 addition & 2 deletions
```diff
@@ -119,8 +119,7 @@ def create(self, request: Any) -> Response:
             self.perform_create(serializer)
         except IntegrityError:
             raise DuplicateData(
-                f"{ToolInstanceErrors.TOOL_EXISTS}, "
-                f"{ToolInstanceErrors.DUPLICATE_API}"
+                f"{ToolInstanceErrors.TOOL_EXISTS}, {ToolInstanceErrors.DUPLICATE_API}"
             )
         instance: ToolInstance = serializer.instance
         ToolInstanceHelper.update_metadata_with_default_values(
```
