
feat: create dynamic model registration for Anthropic remote inference provider #2879

Closed
r3v5 wants to merge 1 commit into llamastack:main from r3v5:dynamic-model-registration-anthropic

Conversation

@r3v5 (Contributor) commented Jul 23, 2025


What does this PR do?

Allows models to be dynamically registered for the Anthropic remote inference provider, so that Llama Stack automatically detects when models are added, deprecated, or removed by Anthropic and updates its list of supported models accordingly.

This issue is part of #2504

Closes #2864
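The intended behavior boils down to a membership check against the provider's live model list at registration time. A minimal sketch (the helper names here are illustrative, not the PR's actual code; in the real provider the list would be fetched from Anthropic's models endpoint rather than hard-coded):

```python
# Minimal sketch of dynamic model validation: registration succeeds only if
# the requested provider_model_id appears in the provider's live model list.
# Names are illustrative, not the PR's actual code.
from typing import Iterable


def validate_provider_model(requested: str, available: Iterable[str]) -> bool:
    """Return True when the requested provider_model_id is currently served."""
    return requested in set(available)


# In the real provider, `available` would come from Anthropic's API rather
# than a hard-coded list:
live_models = ["claude-3-5-sonnet-20241022", "claude-3-5-haiku-latest"]

print(validate_provider_model("claude-3-5-sonnet-20241022", live_models))  # True
print(validate_provider_model("claude-3-5-dummy", live_models))            # False
```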

Test Plan

  1. Create venv at the root llamastack directory:

     uv venv .venv --python 3.12 --seed

  2. Activate venv:

     source .venv/bin/activate

  3. Install dependencies:

     uv pip install -e .

  4. Create the Anthropic distro by modifying run.yaml.

  5. Navigate to the templates/starter folder and build the distro:

     llama stack build --template starter --image-type venv

  6. Run the Llama Stack server:

     ANTHROPIC_API_KEY="YOUR_KEY" ENABLE_ANTHROPIC=anthropic llama stack run run.yaml --image-type venv

  7. Try to register a dummy model:
```shell
curl -X POST "http://localhost:8321/v1/models" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "claude-7-5-sonnet-test",
    "provider_id": "anthropic",
    "provider_model_id": "claude-3-5-dummy",
    "model_type": "llm",
    "metadata": {}
  }'
```

```json
{"detail":"Invalid value: 'claude-3-5-dummy' model is not supported. Supported models are: anthropic/claude-3-5-sonnet-latest, anthropic/claude-3-7-sonnet-latest, anthropic/claude-3-5-haiku-latest, anthropic/voyage-3, anthropic/voyage-3-lite, anthropic/voyage-code-3, claude-3-5-sonnet-test, ..."}
```

In the server logs you should see this log:

INFO     2025-07-23 15:57:49,612 llama_stack.providers.remote.inference.anthropic.anthropic:51 uncategorized: Model claude-3-5-dummy was not found on Anthropic

  8. Try to register an available Anthropic Claude model:
```shell
curl -X POST "http://localhost:8321/v1/models" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "claude-3-5-sonnet-test",
    "provider_id": "anthropic",
    "provider_model_id": "claude-3-5-sonnet-20241022",
    "model_type": "llm",
    "metadata": {}
  }'
```

```json
{"identifier":"claude-3-5-sonnet-test","provider_resource_id":"claude-3-5-sonnet-20241022","provider_id":"anthropic","type":"model","owner":{"principal":"","attributes":{}},"metadata":{},"model_type":"llm"}
```

In the server logs you should see this log:

INFO     2025-07-23 16:00:22,141 llama_stack.providers.remote.inference.anthropic.anthropic:47 uncategorized: Model claude-3-5-sonnet-20241022 is available on Anthropic

@facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) on Jul 23, 2025.
@leseb (Collaborator) left a comment:
I believe this should go after #2862?

```
]
dependencies = [
    "aiohttp",
    "anthropic>=0.58.2",
```
A Collaborator commented:

Why is this needed? It's a provider dep, not a server one.

A Collaborator commented:

This should go in llama_stack/providers/registry/inference.py

```python
from .config import AnthropicConfig
from .models import MODEL_ENTRIES

logger = logging.getLogger(__name__)
```
A Collaborator commented:

wrong logger

```python
from llama_stack.log import get_logger

logger = get_logger(name=__name__, category="anthropic")
```

Comment on lines +41 to +45:

```python
def client(self) -> AsyncAnthropic:
    if self._client is None:
        api_key = self.config.api_key if self.config.api_key else "no-key"
        self._client = AsyncAnthropic(api_key=api_key)
    return self._client
```
@bbrowning (Collaborator) commented:

This will only work if the API key for Anthropic is provided in the config. However, Llama Stack users can also provide their own API key in each request for this (and many other) providers. Our general pattern for providers that extend LiteLLM, and for any that support per-request credential passthrough via the x-llamastack-provider-data header, is that we never cache the clients, as that could lead to subsequent requests that do not send proper auth using previously sent auth from a different client.

I wonder more generally if the scope of this PR should be adjusted since #2835 landed? It provides a way to fetch clients and check model availability that should work for any of our LiteLLM based providers, I believe?

@r3v5 (Contributor, Author) commented Jul 24, 2025:

Thanks @bbrowning for the clarification. However, Anthropic is not fully OpenAI-compatible for its retrieve-model and list-models API endpoints: OpenAI requires a Bearer token, while Anthropic uses a slightly different API-key scheme when calling these endpoints. That's why I used AsyncAnthropic.

@mattf (Collaborator) commented:

@r3v5 what about the pattern in #2886 ?

@r3v5 (Contributor, Author) commented:

Thanks, @mattf! I will look into that.

@r3v5 (Contributor, Author) commented Jul 24, 2025:

@mattf your PR is nice! Great job 👏 Should I cherry-pick from your PR again and use your infrastructure? I didn't realise litellm is so powerful 😁

@mattf (Collaborator) commented:

that's a good idea

@r3v5 (Contributor, Author) commented Aug 14, 2025:

Hey, issue #2864 can be closed in favor of #2886.

The new infrastructure from #2886 uses the LiteLLMOpenAIMixin pattern, dynamically checking model availability via litellm.
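That mixin-style availability check can be sketched roughly as follows. Method and class names here are illustrative; the real implementation lives in LiteLLMOpenAIMixin and queries litellm for the provider's models:

```python
# Rough sketch of a mixin-style availability check in the spirit of #2886.
# Names are illustrative; the real code lives in LiteLLMOpenAIMixin and
# queries litellm for the provider's model list.
class AvailabilityCheckMixin:
    def list_provider_model_ids(self) -> list[str]:
        raise NotImplementedError  # each provider fetches its live list

    def check_model_availability(self, provider_model_id: str) -> bool:
        # Registration is allowed only for models the provider reports.
        return provider_model_id in self.list_provider_model_ids()


class FakeAnthropicAdapter(AvailabilityCheckMixin):
    def list_provider_model_ids(self) -> list[str]:
        return ["claude-opus-4-1-20250805", "claude-3-5-sonnet-20241022"]


adapter = FakeAnthropicAdapter()
print(adapter.check_model_availability("claude-opus-4-1-20250805"))  # True
print(adapter.check_model_availability("claude-7-5-sonnet-test"))    # False
```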

I registered a model that is not in the static list for Anthropic. Everything works as expected:

```shell
curl -X POST "http://localhost:8321/v1/models" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "claude-opus",
    "provider_id": "anthropic",
    "provider_model_id": "claude-opus-4-1-20250805",
    "model_type": "llm",
    "metadata": {}
  }'
```

```json
{"identifier":"claude-opus","provider_resource_id":"claude-opus-4-1-20250805","provider_id":"anthropic","type":"model","owner":null,"source":"via_register_api","metadata":{},"model_type":"llm"}
```

CC: @mattf

@mattf closed this on Aug 17, 2025.

Labels

CLA Signed (managed by the Meta Open Source bot)

Successfully merging this pull request may close these issues:

feat: create dynamic model registration for Anthropic remote inference provider

5 participants