@mudler mudler commented Dec 25, 2025

Description

This PR changes the eviction behavior so that, by default, models that are busy serving other requests are no longer evicted. It also adds a flag to enable or disable this behavior, along with settings to tune it.

These changes make eviction logic more robust and configurable both via CLI/environment variables and the web UI, and allow dynamic updates through the API without requiring a restart.

(Copilot summary below)

Eviction and LRU Retry Settings Enhancements:

  • Added new configuration fields to ApplicationConfig and RuntimeSettings for:

    • Forcing eviction even when models have active API calls (ForceEvictionWhenBusy).
    • Setting the maximum number of retries for LRU eviction (LRUEvictionMaxRetries).
    • Setting the retry interval for LRU eviction attempts (LRUEvictionRetryInterval).
  • Updated CLI flags, environment variables, and startup logic to support the new eviction settings, allowing them to be set at launch.

  • Modified runtime settings loading and application logic to:

    • Read the new settings from runtime_settings.json.
    • Apply changes dynamically without requiring application restarts.
    • Validate duration formats and provide warnings or errors for invalid inputs.

Dynamic Update and Watchdog Integration:

  • Enhanced the watchdog and model loader initialization to use the new eviction settings and allow updating them at runtime through the API.

Web UI Improvements:

  • Added UI controls in settings.html for the new eviction settings, including checkboxes and input fields, and ensured these settings are loaded, displayed, and submitted correctly via the frontend.

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

…ight

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Otherwise, calls that can only succeed by shutting down other
backends would simply fail.

Instead, the request now waits and retries eviction until it
succeeds. The retry thresholds can be configured by the user.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
netlify bot commented Dec 25, 2025

Deploy Preview for localai ready!

Name Link
🔨 Latest commit 07640ea
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/694d05d933aa6200085fddb2
😎 Deploy Preview https://deploy-preview-7725--localai.netlify.app

@mudler mudler added the enhancement New feature or request label Dec 25, 2025
@mudler mudler merged commit c844b7a into master Dec 25, 2025
33 checks passed
@mudler mudler deleted the feat/force-eviction branch December 25, 2025 13:26