An AnimateDiff refactor, because I can, with significantly lower VRAM usage.
Also, infinite generation length support! yay!
PRs welcome! 😆😅
This can theoretically run on CPU, but it's not recommended. It should work fine on a GPU, NVIDIA or otherwise,
but I haven't tested on non-CUDA hardware. It uses PyTorch 2.0 Scaled-Dot-Product Attention (aka builtin xformers)
by default, but you can pass --xformers to force using xformers if you really want.
I should write some more detailed steps, but here's the gist of it:
git clone https://github.com/neggles/animatediff-cli
cd animatediff-cli
python3.10 -m venv .venv
source .venv/bin/activate
# install Torch. Use whatever your favourite torch version >= 2.0.0 is, but good luck on non-NVIDIA...
python -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# install the rest of all the things (probably! I may have missed some deps.)
python -m pip install -e '.[dev]'
# you should now be able to
animatediff --help
# There's a nice pretty help screen with a bunch of info that'll print here.
From here you'll need to put whatever checkpoint you want to use into data/models/sd, copy
one of the prompt configs in config/prompts, edit it with your choices of prompt and model (model
paths in prompt .json files are relative to data/, e.g. models/sd/vanilla.safetensors), and
off you go.
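To make the "relative to data/" rule concrete, here is a minimal sketch that resolves a model path the way the paragraph above describes. The helper name `resolve_model_path` is hypothetical and not part of the package; it only illustrates the path convention.

```python
from pathlib import Path


def resolve_model_path(model_path: str, data_dir: Path = Path("data")) -> Path:
    """Join a model path from a prompt config onto the data/ directory.

    Hypothetical helper for illustration only; the package has its own loader.
    """
    return data_dir / model_path


if __name__ == "__main__":
    resolved = resolve_model_path("models/sd/vanilla.safetensors")
    # Report whether the checkpoint is where the prompt config expects it.
    status = "found" if resolved.is_file() else "missing"
    print(f"{resolved}: {status}")
```

Running it from the repository root is a quick way to catch a mistyped checkpoint path before starting a long generation.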
Then it's something like (for an 8GB card):
animatediff generate -c 'config/prompts/waifu.json' -W 576 -H 576 -L 128 -C 16
You may have to drop -C down to 8 on cards with less than 8GB VRAM, and you can raise it to 20-24
on cards with more. 24 is the maximum.
N.B. generating 128 frames is slow...
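The -C guidance above can be written down as a small helper. This is only a sketch of the rule of thumb in this README: the function names are hypothetical, and returning 20 for any card above 8GB is my own conservative reading of "20-24", not something the tool enforces.

```python
import shlex

MAX_CONTEXT = 24  # stated maximum for -C


def suggest_context(vram_gb: float) -> int:
    """Pick a starting -C value from the rough VRAM guidance in this README."""
    if vram_gb < 8:
        return 8
    if vram_gb == 8:
        return 16
    # Assumption: start at the low end of the 20-24 range on larger cards.
    return min(20, MAX_CONTEXT)


def build_command(config: str, vram_gb: float, width: int = 576,
                  height: int = 576, length: int = 128) -> list[str]:
    """Assemble the generate command shown above with a suggested -C."""
    return [
        "animatediff", "generate",
        "-c", config,
        "-W", str(width), "-H", str(height),
        "-L", str(length),
        "-C", str(suggest_context(vram_gb)),
    ]


if __name__ == "__main__":
    print(shlex.join(build_command("config/prompts/waifu.json", vram_gb=8)))
```

Treat the result as a starting point and adjust -C up or down depending on whether you run out of memory.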
I have added experimental support for rife-ncnn-vulkan using the animatediff rife interpolate
command. It has fairly self-explanatory help, and it has been tested on Linux, but I've no idea
if it'll work on Windows.
Either way, you'll need ffmpeg installed on your system and present in PATH, and you'll need to
download the rife-ncnn-vulkan release for your OS of choice from the GitHub repo (above). Unzip it, and
place the extracted folder at data/rife/. You should have a data/rife/rife-ncnn-vulkan
executable, or data\rife\rife-ncnn-vulkan.exe on Windows.
You'll also need to reinstall the repo/package with:
python -m pip install -e '.[rife]'
or just install ffmpeg-python manually yourself.
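The three RIFE prerequisites above (ffmpeg in PATH, the rife-ncnn-vulkan executable under data/rife/, and the ffmpeg-python package) can be checked up front. The following is a minimal sketch, not part of the package; the function names are hypothetical, and it relies on ffmpeg-python being importable as the `ffmpeg` module.

```python
import importlib.util
import shutil
import sys
from pathlib import Path


def rife_binary(data_dir: Path = Path("data")) -> Path:
    """Expected location of the rife-ncnn-vulkan executable for this OS."""
    name = "rife-ncnn-vulkan.exe" if sys.platform == "win32" else "rife-ncnn-vulkan"
    return data_dir / "rife" / name


def missing_prerequisites(data_dir: Path = Path("data")) -> list[str]:
    """Return a human-readable list of whatever is not set up yet."""
    problems = []
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found in PATH")
    binary = rife_binary(data_dir)
    if not binary.is_file():
        problems.append(f"{binary.name} not found at {binary}")
    # ffmpeg-python installs a top-level module named "ffmpeg".
    if importlib.util.find_spec("ffmpeg") is None:
        problems.append("ffmpeg-python is not installed")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

An empty list means all three checks passed; it does not guarantee that the Vulkan driver itself works.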
The default is to multiply each frame by 8, turning an 8fps animation into a 64fps one, then encode that to a 60fps WebM. (If you pick GIF mode, it'll be 50fps, because GIFs are cursed and encode frame durations in 1/100ths of a second.)
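The frame-rate arithmetic above works out as follows. This is a small sketch with hypothetical function names; rounding to the nearest whole hundredth of a second is my assumption about how a 60fps target ends up at 50fps, not a description of the actual encoder settings.

```python
def interpolated_fps(source_fps: int, multiplier: int) -> int:
    """Frame rate after RIFE multiplies every source frame."""
    return source_fps * multiplier


def gif_fps(target_fps: float) -> float:
    """Closest frame rate a GIF can express.

    GIF frame delays are whole hundredths of a second, so the delay is
    rounded to the nearest integer (assumed policy) and never below 1.
    """
    delay_cs = max(1, round(100 / target_fps))
    return 100 / delay_cs


if __name__ == "__main__":
    print(interpolated_fps(8, 8))  # 64
    print(gif_fps(60))             # 50.0, i.e. a delay of 2/100 s per frame
```

A 60fps frame would need a delay of about 1.67 hundredths of a second, which GIF cannot store, so the nearest representable delay is 2 and the playback rate becomes 50fps.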
Seems to work pretty well...
In no particular order:
- Infinite generation length support
- RIFE support for motion interpolation (rife-ncnn-vulkan isn't the greatest implementation)
- Export RIFE interpolated frames to a video file (webm, mp4, animated webp, hevc mp4, gif, etc.)
- Generate infinite length animations on a 6-8GB card (at 512x512 with 8-frame context, but hey it'll do)
- Torch SDP Attention (makes xformers optional)
- Support for clip_skip in prompt config
- Experimental support for torch.compile() (upstream Diffusers bugs slow this down a little but it's still zippy)
- Batch your generations with --repeat! (e.g. --repeat 10 will repeat all your prompts 10 times)
- Call the animatediff.cli.generate() function from another Python program without reloading the model every time
- Drag remaining old Diffusers code up to latest (mostly)
- Add a webUI (maybe, there are people wrapping this already so maybe not?)
- img2img support (start from an existing image and continue)
- Stop using custom modules where possible (should be able to use Diffusers for almost all of it)
- Automatic generate-then-interpolate-with-RIFE mode
See guoyww/AnimateDiff (very little of this is my work).
N.B. the copyright notice in COPYING is missing the original authors' names, solely because
the original repo (as of this writing) has no name attached to the license. I have, however,
used the same license they did (Apache 2.0).