
feat: add status and ttl to /v1/models endpoint#569

Open
ghostdevv wants to merge 7 commits into mostlygeek:main from ghostdevv:models

Conversation

@ghostdevv

@ghostdevv ghostdevv commented Mar 5, 2026

Closes #541
Closes #564

Updates the /v1/models endpoint to return the model state, ttl, and ttlRemaining under the meta.llamaswap field. For example:

{
  "data": [
    {
      "created": 1775097337,
      "id": "llama-3.1-8b-instruct",
      "meta": {
        "llamaswap": {
          "state": "stopped",
          "ttl": 30,
          "ttlRemaining": null
        }
      },
      "name": "Llama 3.1 8b Instruct",
      "object": "model",
      "owned_by": "llama-swap"
    }
  ],
  "object": "list"
}

The following states are used (from proxymanager_api):

  • ready
  • starting
  • stopping
  • stopped
  • shutdown
  • unknown

I also updated the README for /api/models/unload, as it was stale.

@ghostdevv ghostdevv marked this pull request as draft March 5, 2026 23:20
@coderabbitai

coderabbitai bot commented Mar 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds a new GET /api/models endpoint (returns models with name, id, state, ttl, ttlRemaining), moves unload paths under /api/models, implements Process.GetTTL and ProxyManager helpers to surface TTLs, and adds unit tests covering listing and TTL behavior.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Docs**<br>`README.md` | Updated API docs to replace `/models/unload` with `/api/models`, adding `/api/models`, `/api/models/unload`, and `/api/models/unload/:model_id`. |
| **API implementation**<br>`proxy/proxymanager_api.go` | Added `apiModel` struct, `ProxyManager.apiGetModels` handler, route registration `apiGroup.GET("/models", pm.apiGetModels)`, and helper `getModelTTL(modelID string)`; initialized default "unknown" state in model-status formatting. |
| **Process logic**<br>`proxy/process.go` | Added `func (p *Process) GetTTL() (ttl *int, ttlRemaining *int)` to compute TTL and remaining TTL from `UnloadAfter` and the last-request timestamp. |
| **Tests**<br>`proxy/proxymanager_test.go` | Added `TestProxyManager_ApiGetModels` and `TestProxyManager_ApiGetModels_WithTTL` validating the `/api/models` response shape and TTL/ttlRemaining semantics. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

enhancement

Suggested reviewers

  • mostlygeek
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ⚠️ Warning | The PR title references the '/v1/models endpoint' but the implementation adds '/api/models'; this is a mismatch between stated intent and implementation. | Update the title to reflect the actual endpoint being added: 'feat: add /api/models endpoint with status and ttl' or similar. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | The PR fulfills both linked issues: #541 requires an API to show model loading status, and #564 requires a models endpoint with state info. The implementation provides both via `/api/models` with state, ttl, and ttlRemaining fields. |
| Out of Scope Changes check | ✅ Passed | All changes align with the linked issues: the new `/api/models` endpoint, TTL computation in `Process.GetTTL()`, and test coverage are all within scope. The README update for `/api/models/unload` is a minor related improvement. |
| Description check | ✅ Passed | The pull request description accurately relates to the changeset, mentioning the new `/api/models` endpoint, the state/ttl/ttlRemaining fields, and README updates. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
proxy/proxymanager_test.go (1)

1317-1365: Consider expanding coverage for contract edge cases.

Great baseline test. Please add sub-cases for at least one loaded model (ready) and peer-listed models so /api/models contract stays stable for name/state across all returned entries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/proxymanager_test.go` around lines 1317 - 1365, Update
TestProxyManager_ApiGetModels to include sub-cases that exercise a loaded model
and peer-listed models: keep the existing setup using New and StopProcesses but
add a subtest where you transition one model to the "ready" state (e.g., start
the model via the proxy's start/process method or set its state in the proxy's
model registry) before calling proxy.ServeHTTP, and another subtest where you
inject peer metadata into the proxy (e.g., add entries to the proxy's peer
list/peer manager) so /api/models returns entries from both local and peer
sources; assert that for all returned entries the "id", "name" and "state"
fields are present and that the loaded model shows "state" == "ready" while
others remain "stopped". Ensure you reuse CreateTestResponseRecorder, Route
through ServeHTTP, and clean up with StopProcesses in each subtest.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@proxy/proxymanager_api.go`:
- Around line 52-63: Normalize the output in ProxyManager.apiGetModels: treat
empty model names by falling back to the model Id (use m.Id when m.Name == ""
for the apiModel.Name), and restrict m.State to the documented contract by
mapping only "starting" and "ready" through; coerce any other state (including
"stopping", "shutdown", "unknown" or empty) to "stopped" when setting
apiModel.State. Update the loop that builds result (refer to apiGetModels,
apiModel and getModelStatus and fields m.Name/m.State/m.Id) to apply these
mappings before calling c.JSON.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: da787c5b-1563-4438-8cb2-29dcd02f803f

📥 Commits

Reviewing files that changed from the base of the PR and between cc77139 and 6e831a4.

📒 Files selected for processing (3)
  • README.md
  • proxy/proxymanager_api.go
  • proxy/proxymanager_test.go

@ghostdevv ghostdevv marked this pull request as ready for review March 5, 2026 23:39

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@README.md`:
- Line 40: Update the `/api/models` entry in README.md to explicitly document
the full pass-through `state` enum returned by that endpoint: list the values
`ready`, `starting`, `stopping`, `stopped`, `shutdown`, and `unknown`, and state
that the API intentionally passes through these states exactly (no
normalization) to avoid lossy mappings and preserve forward compatibility.
Ensure the description for `/api/models` mentions that clients should handle all
listed states.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ac760b86-05db-4f67-bb28-4719a0f9e7ed

📥 Commits

Reviewing files that changed from the base of the PR and between 6e831a4 and 75ff4ba.

📒 Files selected for processing (3)
  • README.md
  • proxy/proxymanager_api.go
  • proxy/proxymanager_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • proxy/proxymanager_api.go
  • proxy/proxymanager_test.go

@ghostdevv
Author

I just pushed the ability for this /models endpoint to return the ttl and ttlRemaining, which I found I needed for my llama-swap integration! I updated the PR description to include this.

This is ready for review whenever you have time @mostlygeek 🙏


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
proxy/proxymanager_api.go (1)

150-158: ⚠️ Potential issue | 🟡 Minor

Peer models may have empty state exposed through the new /api/models endpoint.

While this code segment is unchanged, the new /api/models endpoint now exposes peer models with state: "" (empty string), which is not in the documented set of valid states (ready, starting, stopping, stopped, shutdown, unknown). This could cause issues for clients like the Svelte UI that expects ModelStatus to be one of the valid string literals.

Consider initializing peer models with state: "unknown" to maintain consistency:

 models = append(models, Model{
 	Id:     modelID,
 	PeerID: peerID,
+	State:  "unknown",
 })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/proxymanager_api.go` around lines 150 - 158, When building the models
slice from peer proxies (pm.peerProxy and its ListPeers()), ensure each created
Model includes a valid state instead of an empty string; update the construction
of Model within the loop that iterates peer.Models to set State: "unknown" (or
map empty/zero values to "unknown") so peer-exposed models conform to the
documented ModelStatus literals for the /api/models endpoint.
🧹 Nitpick comments (1)
proxy/proxymanager_test.go (1)

1376-1424: TTL test coverage looks comprehensive.

The test properly validates:

  • TTL value is correctly set to the configured UnloadAfter value
  • ttlRemaining is within expected bounds after a request
  • Models without UnloadAfter return null for both TTL fields

Minor robustness suggestion: The type assertion on line 1419 could panic if the test setup fails unexpectedly. Consider using a comma-ok idiom or require for preceding assertions:

Optional: Safer type assertion
-	assert.Equal(t, float64(60), modelMap["model1"]["ttl"])
-	ttlRemaining1 := modelMap["model1"]["ttlRemaining"].(float64)
+	require.Equal(t, float64(60), modelMap["model1"]["ttl"])
+	ttlRemaining1, ok := modelMap["model1"]["ttlRemaining"].(float64)
+	require.True(t, ok, "ttlRemaining should be a float64")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/proxymanager_test.go` around lines 1376 - 1424, The direct type
assertion ttlRemaining1 := modelMap["model1"]["ttlRemaining"].(float64) in
TestProxyManager_ApiGetModels_WithTTL can panic if the field is missing or of
the wrong type; change it to a safe check using the comma-ok idiom (v, ok :=
modelMap["model1"]["ttlRemaining"].(float64)) and assert/require that ok is true
before using v, or use require.IsType/require.NotNil from testify to fail the
test cleanly if the value isn't a float64 — update the assertion around
ttlRemaining1 accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f0427bbf-ba50-4c37-bc06-9c432000bd6c

📥 Commits

Reviewing files that changed from the base of the PR and between 75ff4ba and 4bcb74d.

📒 Files selected for processing (2)
  • proxy/proxymanager_api.go
  • proxy/proxymanager_test.go

@mostlygeek
Owner

mostlygeek commented Mar 14, 2026

Hi,

There is the v1/models endpoint that is supported by most clients. Any reason not to use that one?

The handler has some metadata code already and extending that would eliminate having duplicated or similar endpoints.

For retrieving the TTL and remaining time I suggest adding a small function to proxy.Process. For example, Process.getTTL() (configured int, remaining int).

A return of 0,0 would mean no configured TTL. A return of 300,3 would mean 300s configured and 3s until shutdown.

@ghostdevv
Author

> There is the v1/models endpoint that is supported by most clients. Any reason not to use that one?
>
> The handler has some metadata code already and extending that would eliminate having duplicated or similar endpoints.
We could; it's just that this is llama-swap specific, so I opted to create another endpoint. The OpenAI OpenAPI spec doesn't give way for extra metadata, but I guess we're technically breaking it anyway since name isn't part of the official "spec". I'll change it to be v1/models. Should we add these fields (state, ttl, etc.) top level for each model (e.g. {"name": "foo", "state": "..."}) or should it be a nested object (e.g. {"name": "foo", "llamaSwap": {"state": "..."}})? @mostlygeek

> For retrieving the TTL and remaining time I suggest adding a small function to proxy.Process. For example, Process.getTTL() (configured int, remaining int).

moved!

> A return of 0,0 would mean no configured TTL. A return of 300,3 would mean 300s configured and 3s until shutdown.

The way I have it now, with *int / number | null, is so that you semantically know when a ttl is being enforced and when there isn't one. I guess it'd be more consistent to make the ttl return 0 when it's disabled instead of null, as that is how it is everywhere else. Up to you. However, I think ttlRemaining should stay *int / number | null to reduce confusion and effort, as then I don't need to check whether a ttl is in effect, and whether the state is ready, to understand what a ttlRemaining of 0 actually means. If that makes sense?


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
proxy/proxymanager_api.go (1)

63-87: Avoid repeated group scans per model when building response.

apiGetModels calls getModelTTL for each model, and getModelTTL does another findGroupByModelName lookup. This duplicates traversal already done in status collection and can become expensive as model count grows.

Consider producing state + TTL in one pass (or caching a modelID→process/group map for the request) to avoid repeated lookups.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/proxymanager_api.go` around lines 63 - 87, The current apiGetModels
loop calls getModelTTL which calls findGroupByModelName repeatedly causing
duplicate traversals; modify apiGetModels (or the loop that builds result) to
compute state and TTL in a single pass by either 1) while iterating
groups/processes collect a temporary map[string]*Process (or map[string]ttl
info) keyed by modelID and then use that map when constructing apiModel entries,
or 2) change getModelTTL to accept a precomputed model→process/group map so
findGroupByModelName is not invoked per model; update references to getModelTTL
and ProxyManager.findGroupByModelName/process.GetTTL accordingly so each model
lookup is O(1) per request instead of re-scanning groups.
proxy/process.go (1)

149-149: Use camelCase for the new method name.

Line 149 introduces GetTTL (PascalCase). Please rename to camelCase (for example getTTL) and update the caller accordingly.

Proposed rename
- func (p *Process) GetTTL() (ttl *int, ttlRemaining *int) {
+ func (p *Process) getTTL() (ttl *int, ttlRemaining *int) {
- return process.GetTTL()
+ return process.getTTL()

As per coding guidelines: **/*.go: Use camelCase for function/method naming in Go.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/process.go` at line 149, The method on type Process is named GetTTL
(PascalCase) but Go style expects camelCase; rename the method from GetTTL to
getTTL and update every caller to the new name. Locate the method declaration
"func (p *Process) GetTTL()" and change it to "func (p *Process) getTTL()", then
search for usages of "GetTTL(" and replace them with "getTTL(" to ensure
compilation and consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@proxy/process.go`:
- Around line 155-157: The current branch that checks p.CurrentState() !=
StateReady returns new(int) which produces a zero-valued pointer and leads to
ttlRemaining: 0; change that to return nil for the ttlRemaining pointer so the
API emits null when the model isn't ready (keep returning &ttlSeconds as the
first return value). Update the return expression in the p.CurrentState() !=
StateReady branch to return &ttlSeconds, nil, ensuring the second return value
is a nil *int to satisfy the function signature (symbols: p.CurrentState(),
StateReady, ttlSeconds, ttlRemaining).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9147de9a-a8aa-423d-81f9-6db728e8e773

📥 Commits

Reviewing files that changed from the base of the PR and between 4bcb74d and 38f773f.

📒 Files selected for processing (2)
  • proxy/process.go
  • proxy/proxymanager_api.go


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
proxy/proxymanager_api.go (1)

65-69: ⚠️ Potential issue | 🟠 Major

Peer models can still emit an invalid empty state value.

On Line 68, State: m.State forwards "" for peer entries that lack state in getModelStatus, which breaks the documented state contract (ready|starting|stopping|stopped|shutdown|unknown).

Proposed fix (boundary-safe default in API mapping)
 func (pm *ProxyManager) apiGetModels(c *gin.Context) {
 	models := pm.getModelStatus()
 	result := make([]apiModel, len(models))
 	for i, m := range models {
 		name := m.Name
 		if name == "" {
 			name = m.Id
 		}
+		state := m.State
+		if state == "" {
+			state = "unknown"
+		}
 
 		ttl, ttlRemaining := pm.getModelTTL(m.Id)
 
 		result[i] = apiModel{
 			Name:         name,
 			Id:           m.Id,
-			State:        m.State,
+			State:        state,
 			TTL:          ttl,
 			TTLRemaining: ttlRemaining,
 		}
 	}
 	c.JSON(http.StatusOK, result)
 }

Based on learnings: In mostlygeek/llama-swap, /api/models intentionally passes through all model states (stopped, starting, ready, stopping, shutdown, unknown) rather than coercing them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@proxy/proxymanager_api.go` around lines 65 - 69, The API mapping currently
forwards an empty peer model state (result[i] = apiModel{..., State: m.State,
...}) which can violate the documented contract; update the mapping in the code
that builds apiModel (the result[i] assignment that uses m.State) to normalize
blank or invalid values by substituting a safe default (e.g., "unknown") —
ideally check m.State and if it is "" or not one of the allowed tokens
(ready|starting|stopping|stopped|shutdown|unknown) set State to "unknown" before
assigning to apiModel so all API responses conform to the documented state set;
reference the apiModel construction and the getModelStatus/m.State source to
make the change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 943e5c59-430f-4e08-9cf3-23e8d80cf95d

📥 Commits

Reviewing files that changed from the base of the PR and between 38f773f and ba70279.

📒 Files selected for processing (2)
  • proxy/process.go
  • proxy/proxymanager_api.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • proxy/process.go

@mostlygeek
Owner

> should we add these fields (state, ttl, etc) top level for each model (e.g. {"name": "foo", "state": "..."}) or should it be a nested object (e.g. {"name": "foo", "llamaSwap": {"state": "..."}})?

Nest it under meta.llamaswap.ttl and meta.llamaswap.ttl_remaining

> However, I think the ttlRemaining should be *int/number | null to reduce confusion and effort as then I don't need to check whether ttl is in effect, and whether the state is ready, to understand what a ttlRemaining of 0 actually means. If that makes sense?

It makes sense. I prefer to use an int rather than pointers in the code. I think it's OK to check for ttl=0 to determine if it is enabled. If it is greater than 0, then a ttl_remaining of 0 (or less) would mean the process is going to be shut down or is in the process of shutting down.

@ghostdevv
Author

> I prefer to use an int rather than pointers in the code.

I'm not a fan either, but unfortunately I don't know of another way to express int-or-nil in Go other than with a pointer.

> I think it's OK to check for ttl=0 to determine if it is enabled. If it is greater than 0, then a ttl_remaining of 0 (or less) would mean the process is going to be shut down or is in the process of shutting down.

Yeah, but you also need to check whether the state is ready, because only then does a ttlRemaining of 0 actually mean it's in the process of being shut down.
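The ambiguity being discussed can be sketched in Go (an illustrative helper only, not llama-swap code):

```go
package main

import "fmt"

// describe shows why a bare 0 is ambiguous: with plain ints a
// ttlRemaining of 0 could mean "no TTL configured", "model not
// running", or "about to be unloaded", so callers must cross-check
// ttl and state; with *int, nil cleanly means "no countdown in effect".
func describe(ttlRemaining *int) string {
	if ttlRemaining == nil {
		return "no TTL in effect"
	}
	if *ttlRemaining <= 0 {
		return "about to be unloaded"
	}
	return fmt.Sprintf("%ds until unload", *ttlRemaining)
}

func main() {
	zero, three := 0, 3
	fmt.Println(describe(nil))    // no TTL in effect
	fmt.Println(describe(&zero))  // about to be unloaded
	fmt.Println(describe(&three)) // 3s until unload
}
```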

@ghostdevv ghostdevv changed the title feat: add /api/models endpoint feat: add status and ttl to /v1/models endpoint Apr 2, 2026
@ghostdevv
Author

@mostlygeek updated 🙏

