
[Feature]: Allow Redis Semantic caching with custom Embedding models #4001

@anandsriraman

Description

The Feature

Currently, the schema for the Redis semantic cache hardcodes a vector dimension of 1536. This works for OpenAI's text-embedding-ada-002 model but fails for any embedding model with a different output dimension.

schema = {
    "index": {
        "name": "litellm_semantic_cache_index",
        "prefix": "litellm",
        "storage_type": "hash",
    },
    "fields": {
        # both text fields belong in one list; a repeated "text" key
        # in a dict literal would silently drop the first entry
        "text": [{"name": "response"}, {"name": "prompt"}],
        "vector": [
            {
                "name": "litellm_embedding",
                "dims": 1536,  # hardcoded: only 1536-dim embeddings fit
                "distance_metric": "cosine",
                "algorithm": "flat",
                "datatype": "float32",
            }
        ],
    },
}
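
One way this could be implemented (a minimal sketch, not LiteLLM's actual code; the get_semantic_cache_schema helper and its parameter are hypothetical names) is to build the schema from an argument instead of a literal, defaulting to 1536 for backward compatibility:

    def get_semantic_cache_schema(embedding_dims: int = 1536) -> dict:
        """Build the redisvl index schema with a configurable vector size."""
        return {
            "index": {
                "name": "litellm_semantic_cache_index",
                "prefix": "litellm",
                "storage_type": "hash",
            },
            "fields": {
                "text": [{"name": "response"}, {"name": "prompt"}],
                "vector": [
                    {
                        "name": "litellm_embedding",
                        # caller supplies the dims, e.g. 384 for a small
                        # sentence-transformer like all-MiniLM-L6-v2
                        "dims": embedding_dims,
                        "distance_metric": "cosine",
                        "algorithm": "flat",
                        "datatype": "float32",
                    }
                ],
            },
        }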

Please make the embedding dims configurable from the model config.yaml file so that a wider range of embedding models deployed with LiteLLM can be used for the cache.
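
For illustration, the cache config might look something like this (a sketch only: the embedding-model key mirrors LiteLLM's documented redis-semantic cache params, while redis_semantic_cache_embedding_dims is a hypothetical name for the proposed setting):

    litellm_settings:
      cache: true
      cache_params:
        type: redis-semantic
        similarity_threshold: 0.8
        # existing setting: which deployed model generates cache embeddings
        redis_semantic_cache_embedding_model: my-local-embedding-model
        # proposed setting (hypothetical name): must match that model's output size
        redis_semantic_cache_embedding_dims: 384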

Motivation, pitch

For caching, much smaller embedding models may be preferred for their cost and speed. Making the dims configurable opens up more interesting modes of caching and cache optimization. It should also reduce costs significantly, since it avoids a call to OpenAI every time a new user query is received.

Twitter / LinkedIn details

No response
