feat(registry): make the Stack query providers for model listing by ashwinb · Pull Request #2862 · llamastack/llama-stack

ashwinb · 2025-07-22T21:22:56Z

This flips #2823 and #2805 by making the Stack periodically query the providers for models, rather than having the providers go behind the Stack's back and call "register" on the registry themselves. This also adds support for model listing for all other providers via `ModelRegistryHelper`. Once this is done, we no longer need to manually list or register models via `run.yaml`, which removes both noise and annoyance (setting `INFERENCE_MODEL` environment variables, for example) from the new-user experience.
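To make the direction of control concrete, here is a minimal sketch of a registry that pulls model lists from providers on a timer, and whose refresh task can be cancelled on shutdown. It is not the code from this PR: `Registry`, `StaticProvider`, `list_models`, and `interval_s` are hypothetical names chosen for illustration only.

```python
from __future__ import annotations

import asyncio
from typing import Protocol


class ModelLister(Protocol):
    """Hypothetical provider interface: anything that can report its models."""

    async def list_models(self) -> list[str]: ...


class StaticProvider:
    """Hypothetical provider that reports a fixed set of model IDs."""

    def __init__(self, models: list[str]) -> None:
        self._models = models

    async def list_models(self) -> list[str]:
        return list(self._models)


class Registry:
    """Hypothetical registry that polls providers instead of being pushed to."""

    def __init__(self, providers: dict[str, ModelLister], interval_s: float) -> None:
        self.providers = providers
        self.interval_s = interval_s
        self.models: dict[str, list[str]] = {}
        self._task: asyncio.Task | None = None

    async def refresh_once(self) -> None:
        # Ask every provider for its current models and overwrite our view.
        for provider_id, provider in self.providers.items():
            self.models[provider_id] = await provider.list_models()

    async def _loop(self) -> None:
        while True:
            await self.refresh_once()
            await asyncio.sleep(self.interval_s)

    def start(self) -> None:
        # Must be called from inside a running event loop.
        self._task = asyncio.create_task(self._loop())

    async def shutdown(self) -> None:
        # Cancel the background task so it does not run forever.
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None


async def demo() -> dict[str, list[str]]:
    registry = Registry({"ollama": StaticProvider(["llama3.2:3b"])}, interval_s=0.01)
    registry.start()
    await asyncio.sleep(0.05)  # give the refresh loop time to run
    await registry.shutdown()
    return registry.models
```

Calling `asyncio.run(demo())` runs the refresh loop briefly and returns the registry's view of each provider's models. The point of the design is that providers only answer queries; the registry alone decides when to refresh and what to store.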

In addition, it adds a configuration variable, `allowed_models`, which can optionally be used to restrict the set of models exposed from a provider.
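As a rough illustration of how such a restriction could behave, the sketch below filters a provider's reported models against an optional allow-list. The function name `filter_allowed` is hypothetical, and the semantics shown (`None` means no restriction, an empty list exposes nothing) are assumptions rather than the behavior confirmed by this PR.

```python
from __future__ import annotations


def filter_allowed(available: list[str], allowed_models: list[str] | None) -> list[str]:
    """Return the models to expose, keeping the provider's original order.

    Assumption: allowed_models=None means "no restriction".
    """
    if allowed_models is None:
        return list(available)
    allowed = set(allowed_models)
    return [model_id for model_id in available if model_id in allowed]
```

For example, `filter_allowed(["a", "b", "c"], ["c", "a"])` keeps only `"a"` and `"c"`, in the order the provider reported them.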

ashwinb · 2025-07-22T21:25:21Z

cc @mattf particularly, since you may have thoughts given your recent changes -- the addition of query_model_availability() for example.

ashwinb · 2025-07-22T23:49:27Z

cc @cdoern the post-training tests are failing. What might I be doing wrong?

ehhuang

LG

ashwinb · 2025-07-22T23:55:26Z

I will make a PR after this which removes explicit model registration from templates like starter since it won't be necessary anymore. It will also remove the requirement for handling various kinds of disabled annotations from model IDs, etc.

llama_stack/providers/utils/inference/model_registry.py

llama_stack/providers/remote/inference/vllm/vllm.py

llama_stack/providers/utils/inference/model_registry.py

llama_stack/distribution/routing_tables/models.py

llama_stack/providers/remote/inference/ollama/ollama.py


ashwinb requested review from bbrowning, ehhuang, hardikjshah, leseb, mattf, raghotham, reluctantfuturist, terrytangyuan and yanxi0830 as code owners July 22, 2025 21:22

facebook-github-bot added the CLA Signed label (managed by the Meta Open Source bot) Jul 22, 2025

ashwinb force-pushed the regis_2 branch from f7f2f23 to 515bde9 July 22, 2025 23:18

ehhuang approved these changes Jul 22, 2025


mattf reviewed Jul 23, 2025


This was referenced Jul 23, 2025

feat: create dynamic model registration for Anthropic remote inferenc… #2879

Closed

feat: create dynamic model registration for Groq remote inference provider #2872

Closed

ashwinb added 4 commits July 23, 2025 16:18

feat(registry): make the Stack query providers for model listing

2e5ffab

add configuration to control which models are exposed

e339651

cancel refresh task on shutdown

cf629f8

add support embedding models and keeping provider models separate

8fb4fee

ashwinb force-pushed the regis_2 branch from b08d5fc to 8fb4fee July 23, 2025 23:19

ashwinb added 3 commits July 23, 2025 16:36

fix import

3cda82b

library client fix since we need a runloop for stack construction which can create forever running background threads

487e073

make refreshing happen for all routing tables, naming changes, ollama fixes

0fe110d

ashwinb merged commit 1463b79 into llamastack:main Jul 24, 2025
77 checks passed

ashwinb deleted the regis_2 branch July 24, 2025 17:39

r3v5 mentioned this pull request Jul 31, 2025

Dynamically Update SUPPORTED_MODELS list for remote providers #2504

Closed

wukaixingxp mentioned this pull request Sep 15, 2025

Llama stack can not automatically register models for Ollama #3447

Closed

2 tasks

ashwinb merged 7 commits into llamastack:main from ashwinb:regis_2


4 participants
