
[Feature Request]: Support for remaining filters attribute in pinecone #9665

Closed
dheerajiiitv opened this issue Dec 22, 2023 · 5 comments
Labels
enhancement New feature or request triage Issue needs to be triaged/prioritized

Comments

@dheerajiiitv
Contributor

dheerajiiitv commented Dec 22, 2023

Feature Description

In the llama-index Pinecone implementation, some metadata filters are missing.

Reason

As a result, we cannot fully use llama-index's MetadataFilters functionality with the Pinecone vector store.

Value of Feature

Yes, this will make it easy to use complex filters with the Pinecone vector store.

@dheerajiiitv dheerajiiitv added enhancement New feature or request triage Issue needs to be triaged/prioritized labels Dec 22, 2023

dosubot bot commented Dec 22, 2023

🤖

Hello @dheerajiiitv!

I'm Dosu, a friendly bot here to assist you while you wait for a real human maintainer. I can help you tackle bugs, answer your questions, and guide you on your contributor journey. Let's make LlamaIndex better together! 😊

Thank you for your feature request. I understand that you're looking for additional metadata filters in the Pinecone implementation of LlamaIndex. However, based on the current implementation of the PineconeVectorStore in LlamaIndex, it already supports the full functionality of the MetadataFilters.

The _to_pinecone_filter function, defined alongside the PineconeVectorStore class, converts the condition and operators of MetadataFilters into Pinecone's filter format. It supports the conditions "and" and "or", and the operators "!=", "==", ">", "<", ">=", "<=", and "in". The query method of PineconeVectorStore calls this function whenever query.filters is not None, so the converted filters are applied to the query.

Here is the relevant code:

def _to_pinecone_filter(standard_filters: MetadataFilters) -> dict:
    """Convert from standard dataclass to pinecone filter dict."""
    filters = {}
    filters_list = []
    condition = standard_filters.condition or "and"
    condition = _transform_pinecone_filter_condition(condition)
    if standard_filters.filters:
        for filter in standard_filters.filters:
            if filter.operator:
                filters_list.append(
                    {
                        filter.key: {
                            _transform_pinecone_filter_operator(
                                filter.operator
                            ): filter.value
                        }
                    }
                )
            else:
                filters_list.append({filter.key: filter.value})

    if len(filters_list) == 1:
        # If there is only one filter, return it directly
        return filters_list[0]
    elif len(filters_list) > 1:
        filters[condition] = filters_list
    return filters

# In the query method
if query.filters is not None:
    if "filter" in kwargs:
        raise ValueError(
            "Cannot specify filter via both query and kwargs. "
            "Use kwargs only for pinecone specific items that are "
            "not supported via the generic query interface."
        )
    filter = _to_pinecone_filter(query.filters)
else:
    filter = kwargs.pop("filter", {})

You can find this code in the llama_index/vector_stores/pinecone.py file.
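To make the conversion above concrete, here is a minimal, self-contained sketch of the same logic. It does not import llama-index: Filter and Filters are hypothetical stand-ins for the library's filter classes, and OPERATOR_MAP and CONDITION_MAP are assumed mappings onto Pinecone's "$"-prefixed filter operators, standing in for the _transform_pinecone_filter_operator and _transform_pinecone_filter_condition helpers, which are not shown in this thread.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional

# Assumption: generic operators map onto Pinecone's "$"-prefixed operators.
OPERATOR_MAP = {
    "==": "$eq",
    "!=": "$ne",
    ">": "$gt",
    "<": "$lt",
    ">=": "$gte",
    "<=": "$lte",
    "in": "$in",
}
# Assumption: conditions map the same way.
CONDITION_MAP = {"and": "$and", "or": "$or"}


@dataclass
class Filter:
    """Hypothetical stand-in for a single metadata filter."""
    key: str
    value: Any
    operator: Optional[str] = None


@dataclass
class Filters:
    """Hypothetical stand-in for MetadataFilters."""
    filters: List[Filter] = field(default_factory=list)
    condition: Optional[str] = "and"


def to_pinecone_filter(standard_filters: Filters) -> dict:
    """Mirror the quoted conversion: one dict per filter, joined by the condition."""
    condition = CONDITION_MAP[standard_filters.condition or "and"]
    filters_list = []
    for f in standard_filters.filters:
        if f.operator:
            filters_list.append({f.key: {OPERATOR_MAP[f.operator]: f.value}})
        else:
            # No operator given: pass the value through as a plain match.
            filters_list.append({f.key: f.value})

    if len(filters_list) == 1:
        # A single filter is returned directly, without a condition wrapper.
        return filters_list[0]
    if len(filters_list) > 1:
        return {condition: filters_list}
    return {}


single = to_pinecone_filter(Filters([Filter("author", "alice", "==")]))
combined = to_pinecone_filter(
    Filters(
        [Filter("year", 2020, ">="), Filter("tag", ["a", "b"], "in")],
        condition="or",
    )
)
print(single)
print(combined)
```

In this sketch, an operator missing from OPERATOR_MAP raises a KeyError, which is one way an unsupported filter would surface; how the real helper handles that case is not stated in this thread.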

If you're referring to the unimplemented metadata features mentioned in the comments of the llama_index/extractors/metadata_extractors.py file, such as "Position of node in subsection hierarchy (and associated subtitles)" and "Hierarchically organized summary", these are not specific to the Pinecone implementation; they are general features that have yet to be implemented in LlamaIndex.

If you have any other specific filters in mind that you believe are missing, could you please provide more details? This will help in understanding your request better and providing a more accurate response.


@hatianzhang
Contributor

hatianzhang commented Dec 22, 2023

@dheerajiiitv may I ask which metadata filters are missing for Pinecone?
Please feel free to open a PR to add the missing ones. Thanks!

@dheerajiiitv
Contributor Author

Sure @hatianzhang, I'll pick this up.

@dheerajiiitv
Contributor Author

@hatianzhang I created this PR: #9674
I tried to follow the instructions, but I might have gotten something wrong. PTAL.

@hatianzhang
Contributor

@dheerajiiitv thanks! It works.
I tried the approach in codespell-project/codespell#1212 (comment) but somehow could not get it to work.
