@jhamon jhamon commented Nov 4, 2025

Implement fetch_by_metadata for Index and IndexAsyncio

This PR adds the fetch_by_metadata method to both synchronous and asynchronous Pinecone index clients, allowing users to retrieve vectors based on metadata filters rather than requiring explicit vector IDs.

Overview

The fetch_by_metadata operation enables querying vectors by their metadata attributes, similar to how query works but without requiring a query vector. This is particularly useful for:

  • Retrieving all vectors matching specific metadata criteria
  • Building data pipelines that filter by metadata
  • Implementing metadata-based data retrieval workflows

Usage Examples

Basic Usage (Synchronous)

from pinecone import Pinecone

pc = Pinecone(api_key='your-api-key')
index = pc.Index(host='your-index-host')

# Fetch vectors with simple metadata filter
result = index.fetch_by_metadata(
    filter={"genre": "action"},
    namespace="movies"
)

# Iterate over results
for vec_id, vector in result.vectors.items():
    print(f"ID: {vector.id}, Metadata: {vector.metadata}")

Complex Filtering

# Using multiple filter conditions
result = index.fetch_by_metadata(
    filter={
        "genre": {"$in": ["comedy", "drama"]},
        "year": {"$gte": 2020},
        "rating": {"$gt": 7.5}
    },
    namespace="movies",
    limit=100
)

Pagination

# First page
result = index.fetch_by_metadata(
    filter={"status": "active"},
    namespace="products",
    limit=50
)

# Continue to next page if available
if result.pagination and result.pagination.next:
    next_page = index.fetch_by_metadata(
        filter={"status": "active"},
        namespace="products",
        limit=50,
        pagination_token=result.pagination.next
    )

Asynchronous Usage

import asyncio
from pinecone import Pinecone

async def main():
    pc = Pinecone(api_key='your-api-key')
    async with pc.IndexAsyncio(host='your-index-host') as index:
        result = await index.fetch_by_metadata(
            filter={"category": "electronics", "in_stock": True},
            namespace="inventory",
            limit=100
        )
        
        for vec_id, vector in result.vectors.items():
            print(f"Product {vector.id}: {vector.metadata}")

asyncio.run(main())
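
Because the asynchronous client returns awaitables, several independent metadata fetches can be issued concurrently. A minimal sketch, assuming `index` is any object whose `fetch_by_metadata` coroutine accepts the `filter`, `namespace`, and `limit` arguments shown above; `fetch_many` is a hypothetical helper, not part of the SDK:

```python
import asyncio


async def fetch_many(index, filters, namespace, limit=100):
    """Run one fetch_by_metadata call per filter concurrently.

    Returns the responses in the same order as `filters`.
    """
    pending = [
        index.fetch_by_metadata(filter=f, namespace=namespace, limit=limit)
        for f in filters
    ]
    # asyncio.gather preserves the order of its arguments.
    return await asyncio.gather(*pending)
```

Inside the `async with pc.IndexAsyncio(...)` block above, this could be called as `await fetch_many(index, [{"category": "electronics"}, {"category": "computers"}], "inventory")`.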

gRPC Usage

from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC(api_key='your-api-key')
index = pc.Index(host='your-index-host')

# Synchronous gRPC call
result = index.fetch_by_metadata(
    filter={"tag": "featured"},
    namespace="articles"
)

# Asynchronous gRPC call (returns future)
future = index.fetch_by_metadata(
    filter={"tag": "featured"},
    namespace="articles",
    async_req=True
)

# Wait for result
result = future.result()

Filter Operators

The fetch_by_metadata method supports all standard Pinecone metadata filter operators:

# Equality
filter={"status": "active"}

# Comparison operators
filter={"price": {"$gt": 100}}
filter={"age": {"$gte": 18}}
filter={"score": {"$lt": 0.5}}
filter={"count": {"$lte": 10}}

# Array operators
filter={"tags": {"$in": ["red", "blue", "green"]}}
filter={"categories": {"$nin": ["deprecated"]}}

# Existence check
filter={"description": {"$exists": True}}

# Logical operators
filter={
    "$and": [
        {"status": "active"},
        {"price": {"$lt": 50}}
    ]
}

filter={
    "$or": [
        {"category": "electronics"},
        {"category": "computers"}
    ]
}
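
A misspelled operator, such as `$get` typed for `$gte`, is easy to miss in a hand-written dictionary. The sketch below is a hypothetical local check, not part of the SDK; its operator set is an assumption limited to the operators listed above plus `$eq` and `$ne`:

```python
KNOWN_OPERATORS = {
    "$eq", "$ne", "$gt", "$gte", "$lt", "$lte",
    "$in", "$nin", "$exists", "$and", "$or",
}


def unknown_operators(filter_expr):
    """Return the set of `$`-prefixed keys that are not known operators."""
    found = set()
    if isinstance(filter_expr, dict):
        for key, value in filter_expr.items():
            if isinstance(key, str) and key.startswith("$") and key not in KNOWN_OPERATORS:
                found.add(key)
            # Recurse into nested conditions and logical groups.
            found |= unknown_operators(value)
    elif isinstance(filter_expr, list):
        for item in filter_expr:
            found |= unknown_operators(item)
    return found
```

An empty result means every operator in the filter is in the known set; it does not guarantee that the API will accept the filter.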

Response Structure

The method returns a FetchByMetadataResponse object containing:

class FetchByMetadataResponse:
    namespace: str                    # The namespace queried
    vectors: Dict[str, Vector]        # Dictionary of vector ID to Vector objects
    usage: Usage                      # API usage information
    pagination: Optional[Pagination]  # Pagination token for next page (if available)

Technical Changes

Core Implementation

  • Added fetch_by_metadata method to Index (sync) and _IndexAsyncio (async) classes
  • Added fetch_by_metadata method to GRPCIndex with support for async_req
  • Created FetchByMetadataResponse dataclass with pagination support
  • Added request factory method IndexRequestFactory.fetch_by_metadata_request
  • Added gRPC response parser parse_fetch_by_metadata_response

Protobuf Migration

  • Migrated from db_data_2025_04 protobuf stubs to db_data_2025_10 stubs
  • Updated all gRPC-related imports and references
  • Removed deprecated 2025-04 stub files

Testing

  • Added comprehensive integration tests for sync (test_fetch_by_metadata.py)
  • Added comprehensive integration tests for async (test_fetch_by_metadata.py)
  • Added gRPC futures tests (test_fetch_by_metadata_future.py)
  • Added unit tests for request factory (test_request_factory.py)
  • Added unit tests for Index class (test_index.py)
  • Updated all unit test files to use 2025-10 protobuf stubs

Documentation

  • Added usage examples to docs/db_data/index-usage-byov.md
  • Updated interface docstrings with examples

Breaking Changes

None. This is a new feature addition.

Migration Notes

No migration required. This is a new feature that doesn't affect existing functionality.

@jhamon jhamon marked this pull request as ready for review November 4, 2025 09:49
@jhamon jhamon merged commit b3267a5 into release-candidate/2025-10 Nov 4, 2025
34 checks passed
@jhamon jhamon deleted the jhamon/fetch-by-metadata branch November 4, 2025 09:49
jhamon added a commit that referenced this pull request Nov 18, 2025
⚠️ **Python 3.9 is no longer supported.** The SDK now requires Python 3.10 or later. Python 3.9 reached end-of-life in October 2025. Users must upgrade to Python 3.10+ to continue using the SDK.

⚠️ **Namespace parameter default behavior changed.** The SDK no longer applies default values for the `namespace` parameter in gRPC methods. When `namespace=None`, the parameter is omitted from requests, allowing the API to handle namespace defaults appropriately. This change affects `upsert_from_dataframe` methods in gRPC clients. The API is moving toward `"__default__"` as the default namespace value, and this change ensures the SDK doesn't override API defaults.
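
The behavior described above amounts to leaving the `namespace` key out of the request when the caller passes `None`, rather than substituting a client-side default. A simplified illustration, not the SDK's actual request-building code (`build_request_args` is a hypothetical name):

```python
def build_request_args(vectors, namespace=None):
    """Build request arguments, omitting `namespace` when it is None."""
    args = {"vectors": vectors}
    if namespace is not None:
        # Only send the namespace when the caller set one explicitly;
        # otherwise the API applies its own default.
        args["namespace"] = namespace
    return args
```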

Note: The official SDK package was renamed last year from `pinecone-client` to `pinecone` beginning in version 5.1.0. Please remove `pinecone-client` from your project dependencies and add `pinecone` instead to get the latest updates if upgrading from earlier versions.

You can now configure dedicated read nodes for your serverless indexes, giving you more control over query performance and capacity planning. By default, serverless indexes use OnDemand read capacity, which automatically scales based on demand. With dedicated read capacity, you can allocate specific read nodes with manual scaling control.

**Create an index with dedicated read capacity:**

```python
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric
)

pc = Pinecone()

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        read_capacity={
            "mode": "Dedicated",
            "dedicated": {
                "node_type": "t1",
                "scaling": "Manual",
                "manual": {
                    "shards": 2,
                    "replicas": 2
                }
            }
        }
    )
)
```

**Configure read capacity on an existing index:**

You can switch between OnDemand and Dedicated modes, or adjust the number of shards and replicas for dedicated read capacity:

```python
from pinecone import Pinecone

pc = Pinecone()

pc.configure_index(
    name='my-index',
    read_capacity={"mode": "OnDemand"}
)

pc.configure_index(
    name='my-index',
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {
                "shards": 3,
                "replicas": 2
            }
        }
    }
)

pc.configure_index(
    name='my-index',
    read_capacity={
        "mode": "Dedicated",
        "dedicated": {
            "node_type": "t1",
            "scaling": "Manual",
            "manual": {
                "shards": 4,
                "replicas": 3
            }
        }
    }
)
```

When you change read capacity configuration, the index will transition to the new configuration. You can use `describe_index` to check the status of the transition.
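
A script that depends on the new configuration may want to poll until the transition has finished. The following is a generic polling sketch; `wait_until` is a hypothetical helper, and the predicate passed to it is left to the caller, because the field of the `describe_index` response that reports read capacity status is not shown here:

```python
import time


def wait_until(predicate, timeout=300.0, interval=5.0, sleep=time.sleep):
    """Call `predicate` repeatedly until it returns True or `timeout` elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

For example, `wait_until(lambda: is_ready(pc.describe_index(name='my-index')))`, where `is_ready` is a function you write to inspect the returned description.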

See [PR #528](#528) for details.

You can now fetch vectors using metadata filters instead of vector IDs. This is especially useful when you need to retrieve vectors based on their metadata properties.

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.fetch_by_metadata(
    filter={'genre': {'$in': ['comedy', 'drama']}, 'year': {'$eq': 2019}},
    namespace='my_namespace',
    limit=50
)
print(f"Found {len(response.vectors)} vectors")

for vec_id, vector in response.vectors.items():
    print(f"ID: {vec_id}, Metadata: {vector.metadata}")
```

**Pagination support:**

When fetching large numbers of vectors, you can use pagination tokens to retrieve results in batches:

```python
response = index.fetch_by_metadata(
    filter={'status': 'active'},
    limit=100
)

if response.pagination and response.pagination.next:
    next_response = index.fetch_by_metadata(
        filter={'status': 'active'},
        pagination_token=response.pagination.next,
        limit=100
    )
```
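
To walk every page rather than just the second one, the same calls can be wrapped in a loop. A minimal sketch, assuming only the `filter`, `namespace`, `limit`, and `pagination_token` arguments and the `pagination.next` attribute shown above; `iter_vectors_by_metadata` is a hypothetical helper, not an SDK method:

```python
def iter_vectors_by_metadata(index, filter, namespace=None, limit=100):
    """Yield (vector_id, vector) pairs across all pages of results."""
    kwargs = {"filter": filter, "limit": limit}
    if namespace is not None:
        kwargs["namespace"] = namespace
    token = None
    while True:
        if token is not None:
            kwargs["pagination_token"] = token
        response = index.fetch_by_metadata(**kwargs)
        yield from response.vectors.items()
        # Stop when the response carries no token for a further page.
        pagination = response.pagination
        token = pagination.next if pagination else None
        if not token:
            break
```

Because it is a generator, pages are requested lazily as the caller iterates, for example `for vec_id, vector in iter_vectors_by_metadata(index, {'status': 'active'}): ...`.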

The `update` method used to require a vector ID, but you can now pass a metadata filter instead. This is useful for bulk metadata updates across many vectors.

There is also a `dry_run` option that lets you preview the number of vectors an update would change before performing the operation.

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.update(
    set_metadata={'status': 'active'},
    filter={'genre': {'$eq': 'drama'}},
    dry_run=True
)
print(f"Would update {response.matched_records} vectors")

response = index.update(
    set_metadata={'status': 'active'},
    filter={'genre': {'$eq': 'drama'}}
)
```
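
The two calls above can be combined into a guard that applies a bulk update only when the dry run matches an acceptable number of records. A sketch, assuming the `dry_run` flag and the `matched_records` field behave as shown above; `guarded_update` and `max_matches` are hypothetical names:

```python
def guarded_update(index, set_metadata, filter, max_matches):
    """Apply a filtered metadata update only if the dry run stays within a limit.

    Returns the number of records the dry run reported as matching.
    Raises ValueError (and changes nothing) if that number exceeds `max_matches`.
    """
    preview = index.update(set_metadata=set_metadata, filter=filter, dry_run=True)
    matched = preview.matched_records
    if matched > max_matches:
        raise ValueError(
            f"update would touch {matched} records, limit is {max_matches}"
        )
    index.update(set_metadata=set_metadata, filter=filter)
    return matched
```

The count can change between the two calls if other writers are active, so treat the guard as a sanity check rather than a guarantee.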

A new `FilterBuilder` utility class provides a type-safe, fluent interface for constructing metadata filters. While perhaps a bit verbose, it can help prevent common errors like misspelled operator names and provides better IDE support.

When you chain `.build()` onto the `FilterBuilder`, it emits a Python dictionary representing the filter. Methods that take metadata filters as arguments continue to accept dictionaries as before.

```python
from pinecone import Pinecone, FilterBuilder

pc = Pinecone()
index = pc.Index(host="your-index-host")

filter1 = FilterBuilder().eq("genre", "drama").build()

filter2 = (FilterBuilder().eq("genre", "drama") &
           FilterBuilder().gt("year", 2020)).build()

filter3 = (FilterBuilder().eq("genre", "comedy") |
           FilterBuilder().eq("genre", "drama")).build()

filter4 = ((FilterBuilder().eq("genre", "drama") &
            FilterBuilder().gte("year", 2020)) |
           (FilterBuilder().eq("genre", "comedy") &
            FilterBuilder().lt("year", 2000))).build()

response = index.fetch_by_metadata(filter=filter2, limit=50)

index.update(
    set_metadata={'status': 'archived'},
    filter=filter3
)
```

The `FilterBuilder` supports all Pinecone filter operators: `eq`, `ne`, `gt`, `gte`, `lt`, `lte`, `in_`, `nin`, and `exists`. Compound expressions are built with `&` (and) and `|` (or).
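
To make the `&` and `|` composition concrete, here is a toy stand-in showing how such operators can be mapped onto `$and` and `$or` dictionaries through Python's `__and__` and `__or__` hooks. It is an illustration only: `ToyFilter` is a hypothetical class, and the exact dictionaries produced by the SDK's `FilterBuilder` may differ.

```python
class ToyFilter:
    """Minimal fluent filter builder (illustrative, not the SDK class)."""

    def __init__(self, expr=None):
        self._expr = expr or {}

    def _op(self, field, operator, value):
        return ToyFilter({field: {operator: value}})

    def eq(self, field, value):
        return self._op(field, "$eq", value)

    def gt(self, field, value):
        return self._op(field, "$gt", value)

    def __and__(self, other):
        # `a & b` combines both expressions under a logical AND.
        return ToyFilter({"$and": [self._expr, other._expr]})

    def __or__(self, other):
        # `a | b` combines both expressions under a logical OR.
        return ToyFilter({"$or": [self._expr, other._expr]})

    def build(self):
        return self._expr
```

The parentheses around each compound expression in the examples above matter because `.build()` must be called on the combined object, not on the last operand.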

See [PR #529](#529) for `fetch_by_metadata`, [PR #544](#544) for `update()` with filter, and [PR #531](#531) for FilterBuilder.

You can now create namespaces in serverless indexes directly from the SDK:

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

namespace = index.create_namespace(name="my-namespace")
print(f"Created namespace: {namespace.name}, Vector count: {namespace.vector_count}")

namespace = index.create_namespace(
    name="my-namespace",
    schema={
        "fields": {
            "genre": {"filterable": True},
            "year": {"filterable": True}
        }
    }
)
```

**Note:** This operation is not supported for pod-based indexes.

See [PR #532](#532) for details.

For sparse indexes with integrated embedding configured to use the `pinecone-sparse-english-v0` model, you can now specify which terms must be present in search results:

```python
from pinecone import Pinecone, SearchQuery

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.search(
    namespace="my-namespace",
    query=SearchQuery(
        inputs={"text": "Apple corporation"},
        top_k=10,
        match_terms={
            "strategy": "all",
            "terms": ["apple", "corporation"]
        }
    )
)
```

The `match_terms` parameter requires every specified term to be present in the text of each search hit. Terms are normalized and tokenized before matching, and order does not matter.
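
The matching rule can be approximated locally to reason about which hits would qualify. The sketch below lowercases text and splits on non-alphanumeric characters, which is an assumption; the service's actual normalization and tokenization are not specified here and may differ. `tokenize` and `contains_all_terms` are hypothetical helpers.

```python
import re


def tokenize(text):
    """Lowercase `text` and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_all_terms(text, terms):
    """Return True if every term appears among the tokens of `text`, in any order."""
    tokens = set(tokenize(text))
    required = {tok for term in terms for tok in tokenize(term)}
    return required <= tokens
```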

See [PR #530](#530) for details.

**Update API keys, projects, and organizations:**

```python
from pinecone import Admin

admin = Admin() # Auth with PINECONE_CLIENT_ID and PINECONE_CLIENT_SECRET

api_key = admin.api_key.update(
    api_key_id='my-api-key-id',
    name='updated-api-key-name',
    roles=['ProjectEditor', 'DataPlaneEditor']
)

project = admin.project.update(
    project_id='my-project-id',
    name='updated-project-name',
    max_pods=10,
    force_encryption_with_cmek=True
)

organization = admin.organization.update(
    organization_id='my-org-id',
    name='updated-organization-name'
)
```

**Delete organizations:**

```python
from pinecone import Admin

admin = Admin()

admin.organization.delete(organization_id='my-org-id')
```

See [PR #527](#527) and [PR #543](#543) for details.

You can now configure which metadata fields are filterable when creating serverless indexes. This helps optimize performance by only indexing metadata fields that you plan to use for filtering:

```python
from pinecone import (
    Pinecone,
    ServerlessSpec,
    CloudProvider,
    AwsRegion,
    Metric
)

pc = Pinecone()

pc.create_index(
    name='my-index',
    dimension=1536,
    metric=Metric.COSINE,
    spec=ServerlessSpec(
        cloud=CloudProvider.AWS,
        region=AwsRegion.US_EAST_1,
        schema={
            "genre": {"filterable": True},
            "year": {"filterable": True},
            "rating": {"filterable": True}
        }
    )
)
```

When using schemas, only fields marked as `filterable: True` in the schema can be used in metadata filters.
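
Given that restriction, it can help to check a filter against the schema before sending it. A sketch, assuming the schema has the shape passed to `ServerlessSpec` above (field name mapped to a dictionary with a `filterable` flag); `filter_fields` and `non_filterable_fields` are hypothetical helpers:

```python
def filter_fields(filter_expr):
    """Collect the metadata field names referenced by a filter dictionary."""
    fields = set()
    if isinstance(filter_expr, dict):
        for key, value in filter_expr.items():
            if key.startswith("$"):
                # Logical operators such as $and / $or wrap nested filters.
                fields |= filter_fields(value)
            else:
                fields.add(key)
    elif isinstance(filter_expr, list):
        for item in filter_expr:
            fields |= filter_fields(item)
    return fields


def non_filterable_fields(filter_expr, schema):
    """Return fields used in the filter that the schema does not mark filterable."""
    allowed = {name for name, cfg in schema.items() if cfg.get("filterable")}
    return filter_fields(filter_expr) - allowed
```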

See [PR #528](#528) for details.

The SDK now exposes header information from API responses. This information is available in response objects via the `_response_info` attribute and can be useful for debugging and monitoring.

```python
from pinecone import Pinecone

pc = Pinecone()
index = pc.Index(host="your-index-host")

response = index.query(
    vector=[0.1, 0.2, 0.3, ...],
    top_k=10,
    namespace='my_namespace'
)

for k, v in response._response_info.get('raw_headers', {}).items():
    print(f"{k}: {v}")
```

See [PR #539](#539) for details.

We've replaced Python's standard library `json` module with `orjson`, a fast JSON library written in Rust. This provides significant performance improvements for both serialization and deserialization of JSON payloads:

- **Serialization (dumps)**: 10-23x faster depending on payload size
- **Deserialization (loads)**: 4-7x faster depending on payload size

These improvements are especially beneficial for:
- High-throughput applications making many API calls
- Applications handling large vector payloads
- Real-time applications where latency matters

No code changes are required - the API remains the same, and you'll automatically benefit from these performance improvements.
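
The figures above depend on payload shape and hardware, so it can be worth measuring with data resembling your own. A rough timing sketch using `timeit`; the payload is made up for illustration, and `orjson` is treated as optional so the script still runs where it is not installed:

```python
import json
import timeit

try:
    import orjson
except ImportError:  # fall back to timing only the standard library
    orjson = None

# Synthetic payload: 200 vectors of 128 floats each (an arbitrary choice).
payload = {"vectors": [{"id": f"v{i}", "values": [0.1] * 128} for i in range(200)]}


def time_dumps(number=20):
    """Return seconds spent serializing `payload` `number` times per library."""
    timings = {"json": timeit.timeit(lambda: json.dumps(payload), number=number)}
    if orjson is not None:
        timings["orjson"] = timeit.timeit(lambda: orjson.dumps(payload), number=number)
    return timings


if __name__ == "__main__":
    for name, seconds in time_dumps().items():
        print(f"{name}: {seconds:.4f}s")
```

Note that `orjson.dumps` returns `bytes` whereas `json.dumps` returns `str`, which matters only if you call either library directly.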

See [PR #556](#556) for details.

We've optimized gRPC response parsing by replacing `json_format.MessageToDict` with direct protobuf field access. This optimization provides approximately 2x faster response parsing for gRPC operations.

Special thanks to [@yorickvP](https://github.com/yorickvP) for surfacing the `json_format.MessageToDict` refactor opportunity. While we didn't merge the specific PR, yorick's insight led us to implement a similar optimization that significantly improves gRPC performance.

See [PR #553](#553) for details.

- **Type hints and IDE support**: Comprehensive type hints throughout the SDK improve IDE autocomplete and type checking. The SDK now uses Python 3.10+ type syntax throughout.
- **Documentation**: Updated docstrings with RST formatting and code examples for better developer experience.
- **Dependency updates**: Updated protobuf to 5.29.5 to address security vulnerabilities. Updated `pinecone-plugin-assistant` to version 3.0.1.
- **Build system**: Migrated from poetry to uv for faster dependency management.

- [@yorickvP](https://github.com/yorickvP) - Thanks for surfacing the gRPC response parsing optimization opportunity!