feat: create dynamic model registration for Anthropic remote inference provider #2879
r3v5 wants to merge 1 commit into llamastack:main
Conversation
```toml
]
dependencies = [
    "aiohttp",
    "anthropic>=0.58.2",
```
Why is this needed? It's a provider dep, not a server one.
This should go in `llama_stack/providers/registry/inference.py` instead.
```python
from .config import AnthropicConfig
from .models import MODEL_ENTRIES

logger = logging.getLogger(__name__)
```
wrong logger — use the project logger instead:

```python
from llama_stack.log import get_logger

logger = get_logger(name=__name__, category="anthropic")
```
```python
def client(self) -> AsyncAnthropic:
    if self._client is None:
        api_key = self.config.api_key if self.config.api_key else "no-key"
        self._client = AsyncAnthropic(api_key=api_key)
    return self._client
```
This will only work if the API key for Anthropic is provided in the config. However, Llama Stack users can also provide their own API key in each request for this (and many other) providers. Our general pattern for providers that extend LiteLLM, and for any that support per-request credential passthrough via the `x-llamastack-provider-data` header, is that we never cache the clients, as caching could lead to subsequent requests that do not send proper auth reusing credentials from a different client.
I wonder more generally whether the scope of this PR should be adjusted now that #2835 has landed? It provides a way to fetch clients and check model availability that should work for any of our LiteLLM-based providers, I believe.
Thanks @bbrowning for the clarification. However, Anthropic is not fully OpenAI-compatible for the retrieve-model and list-models API endpoints: OpenAI requires a Bearer token, while Anthropic uses a slightly different API-key structure when calling these endpoints. That's why I used AsyncAnthropic.
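For illustration, the auth difference can be sketched as a small helper; `auth_headers` is hypothetical, but the header names come from the public OpenAI and Anthropic HTTP APIs:

```python
def auth_headers(provider: str, api_key: str) -> dict[str, str]:
    """Build auth headers for an OpenAI-style vs Anthropic-style endpoint."""
    if provider == "openai":
        # OpenAI-compatible endpoints authenticate with a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    # Anthropic expects the key in x-api-key plus an API version header.
    return {"x-api-key": api_key, "anthropic-version": "2023-06-01"}
```

This is why a generic Bearer-token code path cannot list Anthropic models directly, and an Anthropic-aware client is needed.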
@mattf your PR is nice! Great job 👏 Should I cherry-pick from your PR again and use your infrastructure? I didn't realise LiteLLM is so powerful 😁
Hey, issue #2864 can be closed in favor of #2886. Using the new infrastructure from #2886, I registered a model that is not in the static list for Anthropic. All works as expected:
CC: @mattf
What does this PR do?
Allows models to be registered dynamically for the Anthropic remote inference provider, so that Llama Stack automatically detects when Anthropic adds, deprecates, or removes models and updates its list of supported models accordingly.
This issue is part of #2504
Closes #2864
Test Plan
Create venv at the root llamastack directory:

```shell
uv venv .venv --python 3.12 --seed
```

Activate venv:

```shell
source .venv/bin/activate
```

Install dependencies:

```shell
uv pip install -e .
```

Create the Anthropic distro by modifying run.yaml. Navigate to the templates/starter folder and build the distro:

```shell
llama stack build --template starter --image-type venv
```

Run the server:

```shell
ANTHROPIC_API_KEY="YOUR_KEY" ENABLE_ANTHROPIC=anthropic llama stack run run.yaml --image-type venv
```

Requesting an unsupported model returns:

```
{"detail":"Invalid value: 'claude-3-5-dummy' model is not supported. Supported models are: anthropic/claude-3-5-sonnet-latest, anthropic/claude-3-7-sonnet-latest, anthropic/claude-3-5-haiku-latest, anthropic/voyage-3, anthropic/voyage-3-lite, anthropic/voyage-code-3, claude-3-5-sonnet-test, ..."}
```

In the server logs you would want to see this log:

```
{"identifier":"claude-3-5-sonnet-test","provider_resource_id":"claude-3-5-sonnet-20241022","provider_id":"anthropic","type":"model","owner":{"principal":"","attributes":{}},"metadata":{},"model_type":"llm"}
```
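The registration record in that log line can be sanity-checked by parsing it as JSON; a minimal sketch using the exact record from the test run above:

```python
import json

# The model-registration record emitted in the server log above.
log_line = (
    '{"identifier":"claude-3-5-sonnet-test",'
    '"provider_resource_id":"claude-3-5-sonnet-20241022",'
    '"provider_id":"anthropic","type":"model",'
    '"owner":{"principal":"","attributes":{}},'
    '"metadata":{},"model_type":"llm"}'
)
record = json.loads(log_line)
# The dynamically registered alias maps to a concrete Anthropic model id.
print(record["identifier"], "->", record["provider_resource_id"])
```

This confirms the dynamically registered identifier is backed by a concrete Anthropic model rather than an entry from the static list.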