Listing the configured models and the currently loaded model #61
Hi again,
I'm working on a library that leverages llama-swap at the remote API level. In this context:
Is there a way for llama-swap to list the currently configured models in JSON format?
Also, I could not find a reliable way of detecting the currently loaded model in JSON format. For now I'm scraping /logs for that, which, as you may already know, is not ideal.
I think these two improvements would add significant value to llama-swap all around.
Thank you.
Comments
Hi, there is /v1/models, which is OpenAI compatible, but it won't list any model that is configured with unlisted: true. There's currently no API that lists the loaded models. Maybe adding a llama-swap proprietary API to allow programmatic control would be useful. Take a look at proxy/proxymanager.go; you may be able to add the HTTP handlers you need quickly.
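For reference, the listed models can already be enumerated programmatically through that endpoint. A minimal Go sketch, assuming llama-swap listens on localhost:8080 (the base URL is an assumption) and that the response follows the standard OpenAI models-list shape with model IDs under a `data` array:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// modelList mirrors the OpenAI-compatible /v1/models response:
// {"object": "list", "data": [{"id": "...", "object": "model", ...}]}
type modelList struct {
	Data []struct {
		ID string `json:"id"`
	} `json:"data"`
}

func main() {
	// Assumes llama-swap is listening on localhost:8080; adjust as needed.
	resp, err := http.Get("http://localhost:8080/v1/models")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var models modelList
	if err := json.NewDecoder(resp.Body).Decode(&models); err != nil {
		panic(err)
	}
	for _, m := range models.Data {
		fmt.Println(m.ID) // models configured with unlisted: true will not appear here
	}
}
```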
I've added a PR that adds an additional endpoint. The endpoint returns either an empty JSON object if no model has been loaded so far, or the last model loaded (model key) and its current state (state key). Possible state values are: stopped, starting, ready, and stopping. This covers the cases where the endpoint is called right after llama-swap has been started, while a model is still being loaded, after a model has been loaded, while a model is being unloaded due to TTL, and after a model has been unloaded due to TTL (illustrative outputs are sketched below). Note: it returns an empty JSON object if the model is marked as unlisted.
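The example outputs did not survive here; based on the description above they would look roughly like the following, where llama-3.1-8b is a placeholder model name:

```
{}                                               <- right after startup (nothing loaded), or the model is unlisted
{"model": "llama-3.1-8b", "state": "starting"}   <- a model is still being loaded
{"model": "llama-3.1-8b", "state": "ready"}      <- a model has been loaded
{"model": "llama-3.1-8b", "state": "stopping"}   <- a model is being unloaded due to TTL
{"model": "llama-3.1-8b", "state": "stopped"}    <- a model has been unloaded due to TTL
```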
* Adds an endpoint `/running` that returns either an empty JSON object if no model has been loaded so far, or the last model loaded (`model` key) and its current state (`state` key). Possible state values are: stopped, starting, ready, and stopping.
* Improves the `/running` endpoint by allowing multiple entries under the `running` key within the JSON response. Refactors the `/running` handler name (listRunningProcessesHandler). Removes the unlisted filter implementation.
* Adds tests for: no model loaded; one model loaded; multiple models loaded.
* Adds simple comments.
* Simplifies the code structure as per 250313 comments on PR #65.

Co-authored-by: FGDumitru|B <[email protected]>
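A client can then detect the loaded models by polling this endpoint. A minimal Go sketch, assuming llama-swap listens on localhost:8080 and that the multi-entry response nests `model`/`state` pairs under the `running` key as described above:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// runningResponse mirrors the /running response described above:
// one entry per loaded model under the "running" key.
type runningResponse struct {
	Running []struct {
		Model string `json:"model"`
		State string `json:"state"` // stopped, starting, ready, or stopping
	} `json:"running"`
}

func main() {
	// Assumes llama-swap is listening on localhost:8080; adjust as needed.
	resp, err := http.Get("http://localhost:8080/running")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var r runningResponse
	if err := json.NewDecoder(resp.Body).Decode(&r); err != nil {
		panic(err)
	}
	if len(r.Running) == 0 {
		fmt.Println("no model loaded")
		return
	}
	for _, m := range r.Running {
		fmt.Printf("%s: %s\n", m.Model, m.State)
	}
}
```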
Fixed in #65.