
Conversation


@lstein (Collaborator) commented Feb 10, 2024

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Community Node Submission

Have you discussed this change with the InvokeAI team?

  • Yes
  • No, because:

Have you updated all relevant documentation?

  • Yes
  • No

Description

This PR adds model loading and caching functionality to the model manager refactor. Full details can be found in docs/contrib/MODEL_MANAGER. Compared to the original MM version, there are a few API changes.

  1. The initialized model manager service can now be found in the invocation context's context.services.model_manager attribute. This is a container with three parts, each corresponding to a different feature set of the model manager:
    • context.services.model_manager.store - the ModelRecordService for retrieving model configurations.
    • context.services.model_manager.install - the ModelInstallService for model installation, deletion and manipulation.
    • context.services.model_manager.load - the ModelLoadService for loading models into memory, adjudicating VRAM usage, and managing the conversion cache.

  2. There are now three methods for loading models into memory: load_model_by_key(), load_model_by_attr() and load_model_by_config(). The first uses a key returned by the ModelRecordService to load the model uniquely identified by that key. The second uses the familiar base/type/name trio to fetch and load a model. The third loads a model from the model configuration record provided by ModelRecordService. Invocations have been modified to use load_model_by_key() in almost all cases. (A sketch of all three appears after this list.)

  3. The model loading methods return a LoadedModel object, which has the attributes config and model. The config attribute contains a copy of the model's configuration record, from which you can get the name, type, description, base type, etc. The model attribute retrieves the loaded model itself. As before, you use the LoadedModel as a context manager to load the model into the execution device (e.g. VRAM on a CUDA system) and lock it there while it is in use:

# Load a model by its unique record key, then move it onto the execution
# device for the duration of the `with` block.
loaded_model = context.services.model_manager.load_model_by_key('83191a81bbf9118945')
with loaded_model as vae:
    vae.decode(...)
  4. By popular request, you can now adjust the size of the conversion cache by setting the InvokeAI configuration option convert_cache to the maximum size (in gigabytes) that the on-disk conversion cache may grow to. Set it to zero to disable caching of models that are not actively in use. (A configuration sketch also follows below.)
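Here is a rough sketch of the three loading entry points from item 2, as used inside an invocation. The key, the model attributes, and the keyword-parameter names are illustrative assumptions, not verified signatures:

from invokeai.backend.model_manager import BaseModelType, ModelType

mm = context.services.model_manager

# 1. Load by the unique key assigned when the model was registered.
loaded = mm.load_model_by_key('83191a81bbf9118945')

# 2. Load by the familiar base/type/name trio (keyword names assumed).
loaded = mm.load_model_by_attr(
    model_name='stable-diffusion-v1-5',
    base_model=BaseModelType.StableDiffusion1,
    model_type=ModelType.Main,
)

# 3. Fetch the configuration record from the store, then load from it.
config = mm.store.get_model('83191a81bbf9118945')
loaded = mm.load_model_by_config(config)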
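And a minimal sketch of capping the conversion cache from item 4. Only the convert_cache option name comes from this PR; the import path and programmatic use are assumptions, and the option can equally be set in the InvokeAI configuration file:

from invokeai.app.services.config import InvokeAIAppConfig

# Let the on-disk conversion cache grow to at most 10 GB; a value of
# zero disables caching of models that are not actively in use.
config = InvokeAIAppConfig(convert_cache=10)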

Important Caveats

  • I have not made any changes to the front end at all, so the model selection popups are non-functional and generations don't work.
  • I don't have a good way of running invocations from the command line, so I haven't tested that they work properly. There were also a number of changes needed for LoRAs and textual inversions, and these haven't been tested either. I fully expect there to be breakage.
  • On the other hand, I did spend quite a lot of time tracking down type checking errors that were pre-existing in the generation and model patching code.
  • I've fixed a couple of places where there were TODOs in the model generation/patching code. This includes automatic detection and registration of the correct image encoder for IP-Adapters, which I found requested in a comment in Ryan's code.

Related Tickets & Documents

  • Related Issue #
  • Closes #

QA Instructions, Screenshots, Recordings

This will require front-end work to re-enable the model popups and to manage the new multi-threaded background model downloading and installation features. I don't expect things to work fully the first time. When enough changes have been made to the front end to reveal the backend failures, I would be pleased to track down and fix those bugs.

Merge Plan

Currently it won't merge due to conflicts, which I'll address. There will also be conflicts with #5491 , which affects all the load_model() calls. I'm not sure of the optimal merge strategy for these two PRs.

Added/updated tests?

  • Yes -- but limited to one small embedding

[optional] Are there any post deployment tasks we need to perform?

Lincoln Stein added 13 commits January 22, 2024 14:37
- Cache stat collection enabled.
- Implemented ONNX loading.
- Add ability to specify the repo version variant in installer CLI.
- If caller asks for a repo version that doesn't exist, will fall back
  to empty version rather than raising an error.
- Implement new model loader and modify invocations and embeddings

- Finish implementing loaders for all models currently supported by
  InvokeAI.

- Move lora, textual_inversion, and model patching support into
  backend/embeddings.

- Restore support for model cache statistics collection (a little ugly,
  needs work).

- Fixed up invocations that load and patch models.

- Move seamless and silencewarnings utils into a better location
- Replace legacy model manager service with the v2 manager.

- Update invocations to use new load interface.

- Fixed many but not all type checking errors in the invocations. Most
  were unrelated to the model manager.

- Updated routes. All the new routes live under the route tag
  `model_manager_v2`. To avoid confusion with the old routes,
  they have the URL prefix `/api/v2/models`. The old routes
  have been de-registered.

- Added a pytest for the loader.

- Updated documentation in contributing/MODEL_MANAGER.md
@github-actions bot added the documentation, api, python, Root, PythonDeps, invocations, backend, and services labels Feb 10, 2024
Lincoln Stein added 2 commits February 12, 2024 23:31
- Begin to add SwaggerUI documentation for AnyModelConfig and other
  discriminated Unions.
@Millu linked an issue Feb 14, 2024 that may be closed by this pull request
@psychedelicious mentioned this pull request Feb 15, 2024
@RyanJDick (Contributor) left a comment

It's a huge diff, so I didn't try to review with much rigour. I just left a few comments on a few things that I noticed as I skimmed through.

"""Block until the indicated job has reached terminal state, or when timeout limit reached."""
start = time.time()
while not job.in_terminal_state:
if self._install_completed_event.wait(timeout=5): # in case we miss an event
@RyanJDick (Contributor):
Same comment as earlier regarding the hard-coded timeout. Also applies to wait_for_installs(...), below.

@lstein (Collaborator, Author):

All modified to use an inner timeout of 0.25 seconds. Will this be OK?

@RyanJDick (Contributor):
Good enough 🙂 We could cap it based on the remaining time, but this seems good enough for now.
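For reference, the pattern settled on above looks roughly like this (a sketch only; the job interface follows the quoted fragment, and the outer-timeout bookkeeping is an assumption):

import threading
import time

class InstallerSketch:
    """Illustrative stand-in for the installer service discussed above."""

    def __init__(self) -> None:
        self._install_completed_event = threading.Event()

    def wait_for_job(self, job, timeout: float = 0):
        """Block until the job reaches a terminal state, or raise on timeout."""
        start = time.time()
        while not job.in_terminal_state:
            # A short inner wait (0.25 s) keeps the loop responsive to the
            # outer timeout while still waking promptly on install events.
            self._install_completed_event.wait(timeout=0.25)
            if timeout > 0 and (time.time() - start) > timeout:
                raise TimeoutError('timed out waiting for job completion')
        return job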

Lincoln Stein added 2 commits February 15, 2024 23:25
- ModelMetadataStoreService is now injected into ModelRecordStoreService
  (these two services are really joined at the hip, and should someday be merged)
- ModelRecordStoreService is now injected into ModelManagerService
- Reduced timeout value for the various installer and download wait*() methods
- Introduced a Mock modelmanager for testing
- Replaced a bare print() statement with _logger in the install helper backend.
- Removed unused code from model loader init file
- Made `locker` a private variable in the `LoadedModel` object.
- Fixed up model merge frontend (will be deprecated anyway!)
- Rename old "model_management" directory to "model_management_OLD" in order to catch
  dangling references to original model manager.
- Caught and fixed most dangling references (still checking)
- Rename lora, textual_inversion and model_patcher modules
- Introduce a RawModel base class to simplify the Union returned by the
  model loaders.
- Tidy up the model manager 2-related tests. Add useful fixtures, and
  a finalizer to the queue and installer fixtures that will stop the
  services and release threads.
@lstein marked this pull request as ready for review February 18, 2024 03:39
@lstein force-pushed the refactor/model-manager2/loader branch from 0bd2900 to 640afa0 Compare February 18, 2024 03:55
- Replace AnyModelLoader with ModelLoaderRegistry
- Fix type check errors in multiple files
- Remove apparently unneeded `get_model_config_enum()` method from model manager
- Remove last vestiges of old model manager
- Updated tests and documentation

resolve conflict with seamless.py
@lstein force-pushed the refactor/model-manager2/loader branch from 640afa0 to 4ffe672 Compare February 18, 2024 04:04
@lstein (Collaborator, Author) commented Feb 18, 2024

This comment records commits that will need to be cherry-picked into next:
09e7d35
ed2d9ae
4ffe672

@psychedelicious (Contributor):

Superseded by next branch


Development

Successfully merging this pull request may close these issues.

[bug]: Files being stored in unexpected places
