
Lumina-Image 2.0: A Unified and Efficient Image Generative Framework


📰 News

  • [2025-2-20] The Diffusers team released a LoRA fine-tuning script for Lumina2. Find out more here.
  • [2025-2-12] Lumina-Image 2.0 is now available in Diffusers. Check out the docs to learn more.
  • [2025-2-10] The official Hugging Face Space for Lumina-Image 2.0 is now available.
  • [2025-2-10] Preliminary explorations of video generation with Lumina-Video 1.0 have been released.
  • [2025-2-5] ComfyUI now supports Lumina-Image 2.0! 🎉 Thanks to ComfyUI! 🙌 Feel free to try it out! 🚀
  • [2025-1-31] We have released the latest .pth-format weight file on Hugging Face.
  • [2025-1-25] 🚀🚀🚀 We are excited to release Lumina-Image 2.0, including:
    • 🎯 Checkpoints, fine-tuning, and inference code.
    • 🎯 The website and demo are live now! Check out the Huiying and Gradio demos!

📑 Open-source Plan

  • Inference
  • Checkpoints
  • Web Demo (Gradio)
  • Finetuning code
  • ComfyUI
  • Diffusers
  • Technical Report
  • Unified multi-image generation
  • Control
  • PEFT (LLaMa-Adapter V2)

🎥 Demo

Demo.mp4

🎨 Qualitative Performance

Qualitative Results

📊 Quantitative Performance

Quantitative Results

🎮 Model Zoo

| Resolution | Parameters | Text Encoder | VAE           | Download URL |
|------------|------------|--------------|---------------|--------------|
| 1024       | 2.6B       | Gemma-2-2B   | FLUX-VAE-16CH | Hugging Face |

💻 Finetuning Code

1. Create a conda environment and install PyTorch

```bash
conda create -n Lumina2 -y
conda activate Lumina2
conda install python=3.11 pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia -y
```

2. Install dependencies

```bash
pip install -r requirements.txt
```

3. Install flash-attn

```bash
pip install flash-attn --no-build-isolation
```

4. Prepare data

You can place the links to your data files in ./configs/data.yaml. Each image-text pair in your training data should adhere to the following format:

```json
{
    "image_path": "path/to/your/image",
    "prompt": "a description of the image"
}
```
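As a sanity check before launching training, records in this format can be validated with a short script. This is a minimal sketch: the field names `image_path` and `prompt` come from the example above, while `validate_record` is a hypothetical helper, not part of this repository.

```python
import json

REQUIRED_KEYS = {"image_path", "prompt"}

def validate_record(record: dict) -> bool:
    """Check that one image-text record has the expected string fields."""
    return (
        REQUIRED_KEYS <= record.keys()
        and isinstance(record["image_path"], str)
        and isinstance(record["prompt"], str)
        and len(record["prompt"]) > 0
    )

sample = json.loads(
    '{"image_path": "path/to/your/image", "prompt": "a description of the image"}'
)
print(validate_record(sample))  # True for a well-formed record
```

Running a check like this over your dataset before step 5 can surface malformed entries early instead of mid-training.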

5. Start finetuning

```bash
bash scripts/run_1024_finetune.sh
```

🚀 Inference Code

We support multiple solvers including Midpoint Solver, Euler Solver, and DPM Solver for inference.
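To illustrate the difference between the Euler and midpoint solvers mentioned above, here is a minimal sketch of the two update rules for a generic ODE velocity field dx/dt = v(x, t), the kind of per-step update a flow-based sampler performs. This is illustrative only: the function names are hypothetical and this is not the repository's sampler code.

```python
import math

def euler_step(x, t, dt, v):
    """One first-order Euler step: x += v(x, t) * dt."""
    return x + dt * v(x, t)

def midpoint_step(x, t, dt, v):
    """One second-order midpoint step: evaluate v at the half step."""
    x_mid = x + 0.5 * dt * v(x, t)
    return x + dt * v(x_mid, t + 0.5 * dt)

# Toy velocity field dx/dt = x with exact solution x(1) = e from x(0) = 1.
v = lambda x, t: x
x_e, x_m, t, dt = 1.0, 1.0, 0.0, 0.01
for _ in range(100):
    x_e = euler_step(x_e, t, dt, v)
    x_m = midpoint_step(x_m, t, dt, v)
    t += dt
print(abs(x_e - math.e), abs(x_m - math.e))  # midpoint error is much smaller
```

The midpoint solver pays one extra velocity evaluation per step for second-order accuracy, which is why higher-order solvers can reach comparable quality in fewer sampling steps.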

Note

Both the Gradio demo and the direct inference method use the .pth-format weight file, which can be downloaded from Google Drive.

Note

You can also download the .pth weight files directly from Hugging Face; simply point the --ckpt argument at the download directory.

  • Gradio Demo

```bash
python demo.py \
    --ckpt /path/to/your/ckpt \
    --res 1024 \
    --port 12123
```

  • Direct Batch Inference

```bash
bash scripts/sample.sh
```

Citation

If you find the provided code or models useful for your research, please consider citing them as:

```bibtex
@misc{lumina2,
    author={Qi Qin and Le Zhuo and Yi Xin and Ruoyi Du and Zhen Li and Bin Fu and Yiting Lu and Xinyue Li and Dongyang Liu and Xiangyang Zhu and Will Beddow and Erwann Millon and Victor Perez and Wenhai Wang and Yu Qiao and Bo Zhang and Xiaohong Liu and Hongsheng Li and Chang Xu and Peng Gao},
    title={Lumina-Image 2.0: A Unified and Efficient Image Generative Framework},
    year={2025},
    howpublished={\url{https://github.com/Alpha-VLLM/Lumina-Image-2.0}},
}
```
