Skip to content

Conversation

@psychedelicious
Copy link
Contributor

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update
  • Community Node Submission

Description

Note: this PR is based on the #4758 branch

feat(nodes): WIP restricted invocation context

This is a first draft of a restricted InvocationContext, which is provided to all nodes.

Motivation

The existingInvocationContext class has two issues.

1. Too powerful

It's entirely possible that a node could wreck an installation by accidentally calling a destructive method or something like that.

Also, the application relies on nodes to be well-behaved and never do certain things (like delete images), so we should just enforce this explicitly.

It's probably not possible to prevent a node from doing something naughty if it wants to, but we can reduce the chance of unintentional footgunning to almost zero by providing a restricted API.

2. Too low-level/complicated

Many of the service methods are low-level and present an awkward API.

Consider the method to save an image:

image_dto = context.services.images.create(
    image=image,
    image_origin=ResourceOrigin.INTERNAL,
    image_category=ImageCategory.GENERAL,
    node_id=self.id,
    session_id=context.graph_execution_state_id,
    is_intermediate=self.is_intermediate,
    metadata=self.metadata.model_dump() if self.metadata else Non
    workflow=self.workflow,
)

There's a crapload of fluff here that nodes shouldn't need to worry about. It's mostly boilerplate.

Draft API

The full-fat context is now called AppInvocationContext, and the restricted one is InvocationContext. These names can be changed, I just did this so I didn't have to change a bazillion imports in all nodes while testing.

See base_invocation.py for the full implementation - it's a simple wrapper.

Example improved method:

The above create() method is reduced to this:

image_name = context.save_image(image)

The other methods don't have as drastic an improvement but they are all improved.

Discussion points

In addition to the design and implementation, there are a few points that probably warrant discussion.

  1. Core vs community nodes

This restricted API is currently set up to apply to all nodes - core, InvokeAI-managed nodes and future community nodes.

Do we want to maybe let core nodes get access to the original, full-fat context, and only give the restricted context to community nodes?

  1. Power creep

So far, I've implemented most of what our core nodes use. What if community nodes want more access - say, to star images? How do we decide what makes the cut?

  1. Exposing extra stuff in the context

There are some constants/enums/things that we want to provide to nodes. We can do that via the context, as opposed to via imports.

Example:
Images have categories, which are selected via the ImageCategory enum - see the create() method above for how it works in the current release.

In this PR, I've exposed that enum as context.categories and made this an optional arg:

context.save_image(image, category=context.categories.GENERAL)

This feels neat and tidy but I'm unsure if it's a good idea. Maybe the context should be methods and context attributes only?

Related Tickets & Documents

QA Instructions, Screenshots, Recordings

You can test this now, using a simple txt2img workflow with sd1.5 (other nodes are not migrated). You'll first need to run pip install -e ".[dev,test]" to get pydantic v2 installed.

simple t2i.json

Added/updated tests?

  • Yes
  • No : please replace this line with details on why tests
    have not been included

[optional] Are there any post deployment tasks we need to perform?

Upgrade pydantic and fastapi to latest.

- pydantic~=2.4.2
- fastapi~=103.2
- fastapi-events~=0.9.1

**Big Changes**

There are a number of logic changes needed to support pydantic v2. Most changes are very simple, like using the new methods to serialized and deserialize models, but there are a few more complex changes.

**Invocations**

The biggest change relates to invocation creation, instantiation and validation.

Because pydantic v2 moves all validation logic into the rust pydantic-core, we may no longer directly stick our fingers into the validation pie.

Previously, we (ab)used models and fields to allow invocation fields to be optional at instantiation, but required when `invoke()` is called. We directly manipulated the fields and invocation models when calling `invoke()`.

With pydantic v2, this is much more involved. Changes to the python wrapper do not propagate down to the rust validation logic - you have to rebuild the model. This causes problem with concurrent access to the invocation classes and is not a free operation.

This logic has been totally refactored and we do not need to change the model any more. The details are in `baseinvocation.py`, in the `InputField` function and `BaseInvocation.invoke_internal()` method.

In the end, this implementation is cleaner.

**Invocation Fields**

In pydantic v2, you can no longer directly add or remove fields from a model.

Previously, we did this to add the `type` field to invocations.

**Invocation Decorators**

With pydantic v2, we instead use the imperative `create_model()` API to create a new model with the additional field. This is done in `baseinvocation.py` in the `invocation()` wrapper.

A similar technique is used for `invocation_output()`.

**Minor Changes**

There are a number of minor changes around the pydantic v2 models API.

**Protected `model_` Namespace**

All models' pydantic-provided methods and attributes are prefixed with `model_` and this is considered a protected namespace. This causes some conflict, because "model" means something to us, and we have a ton of pydantic models with attributes starting with "model_".

Forunately, there are no direct conflicts. However, in any pydantic model where we define an attribute or method that starts with "model_", we must tell set the protected namespaces to an empty tuple.

```py
class IPAdapterModelField(BaseModel):
    model_name: str = Field(description="Name of the IP-Adapter model")
    base_model: BaseModelType = Field(description="Base model")

    model_config = ConfigDict(protected_namespaces=())
```

**Model Serialization**

Pydantic models no longer have `Model.dict()` or `Model.json()`.

Instead, we use `Model.model_dump()` or `Model.model_dump_json()`.

**Model Deserialization**

Pydantic models no longer have `Model.parse_obj()` or `Model.parse_raw()`, and there are no `parse_raw_as()` or `parse_obj_as()` functions.

Instead, you need to create a `TypeAdapter` object to parse python objects or JSON into a model.

```py
adapter_graph = TypeAdapter(Graph)
deserialized_graph_from_json = adapter_graph.validate_json(graph_json)
deserialized_graph_from_dict = adapter_graph.validate_python(graph_dict)
```

**Field Customisation**

Pydantic `Field`s no longer accept arbitrary args.

Now, you must put all additional arbitrary args in a `json_schema_extra` arg on the field.

**Schema Customisation**

FastAPI and pydantic schema generation now follows the OpenAPI version 3.1 spec.

This necessitates two changes:
- Our schema customization logic has been revised
- Schema parsing to build node templates has been revised

The specific aren't important, but this does present additional surface area for bugs.

**Performance Improvements**

Pydantic v2 is a full rewrite with a rust backend. This offers a substantial performance improvement (pydantic claims 5x to 50x depending on the task). We'll notice this the most during serialization and deserialization of sessions/graphs, which happens very very often - a couple times per node.

I haven't done any benchmarks, but anecdotally, graph execution is much faster. Also, very larges graphs - like with massive iterators - are much, much faster.
@psychedelicious psychedelicious force-pushed the chore/pydantic-fastapi-upgrade branch from 0aedd6d to d9a2249 Compare October 17, 2023 03:16
Base automatically changed from chore/pydantic-fastapi-upgrade to main October 17, 2023 03:59
@maryhipp
Copy link
Contributor

Has this been updated with pydantic changes?

@RyanJDick
Copy link
Contributor

Overall, I love this initiative! Thanks for putting together a draft.

It's a little hard to review right now because the diff is polluted with pydantic changes. I'll just leave a few high-level comments for now.

@RyanJDick
Copy link
Contributor

Do we want to maybe let core nodes get access to the original, full-fat context, and only give the restricted context to community nodes?

In my opinion, we should aim to have all nodes (core and community) interface with the same API. It's nice to have a clear boundary between nodes and the app backend. I think there are a few downsides to allowing special cases for core nodes:

  • If we allow it once, there will be temptation to do it again and again.
  • It adds new considerations for any general node handling (e.g. adds complexity to node versioning, automatic node migration, etc.)
  • It will become harder to write good node tests. If the node API is fixed, we can easily write a mock context implementation that is shared across all tests.

Put another way, before adding a method to the context we should ask ourselves: "Are we willing to support this method for both core and community nodes indefinitely?"

Of course, my mind could be changed if there was a compelling use case.

@RyanJDick
Copy link
Contributor

There are some constants/enums/things that we want to provide to nodes. We can do that via the context, as opposed to via imports.

What if we defined a standard import path for all node utilities? (e.g. from invokeai.invocation_api import ImageCategory)

I don't love the idea that the context would be responsible for managing types as well.

@RyanJDick
Copy link
Contributor

I think there would be value in having a single file that node developers can reference that cleanly contains the API that they are expected to interact with. (TBD whether this should just be an ABC or the actual implementation). If we went with an ABC, then in my mind it would look something like this:

class InvocationContext(ABC):

	@abstractmethod
	def get_image(self, name: str) -> ImageType:
		"""<insert a very detailed docstring>
		"""
		pass

	@abstractmethod
	def save_image(self, name: str) -> ImageType:
		"""<insert a very detailed docstring>
		"""
		pass

	...

Separately, I think there is an opportunity to simplify the BaseInvocation that nodes implement. Right now the BaseInvocation contains a bunch of methods that node developers really don't need to know anything about. This is arguably out-of-scope for this PR, but I'm bringing it up now because it could be nice to set the node API once and then change it very rarely.

If we were open to re-designing the node API, I think there are other conversations to be had around what this API should look like (@invocation decorator, declarative vs. imperative, etc.).

@JPPhoto
Copy link
Contributor

JPPhoto commented Oct 18, 2023

As somebody interested in color spaces and wider gamuts and how those could be used in the future, I'd like to see load_image and save_image methods become load_PIL_image and save_PIL_image methods. This would allow us to support other image formats and APIs going forward that utilized 16 bits/LAB/etc.

@psychedelicious
Copy link
Contributor Author

Superseded by #5491

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[enhancement]: Invocation Context is too powerful and should have a restricted API

5 participants