Official implementation of StochSync: a zero-shot approach for image generation in arbitrary spaces via stochastic diffusion synchronization. (ICLR 2025)

StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces

Kyeongmin Yeo*, Jaihoon Kim*, Minhyuk Sung

KAIST 

*Equal contribution 


   



Introduction

We propose $\texttt{StochSync}$, a method for generating images in arbitrary spaces, such as 360° panoramas or textures on 3D surfaces, using a pretrained image diffusion model. The main challenge is bridging the gap between the 2D images understood by the diffusion model (instance space $\mathcal{X}$) and the target space for image generation (canonical space $\mathcal{Z}$). Previous methods either struggle without strong conditioning or produce results that lack fine details. $\texttt{StochSync}$ combines the strengths of Diffusion Synchronization and Score Distillation Sampling to perform effectively even with weak conditioning. Our experiments show that $\texttt{StochSync}$ outperforms prior finetuning-based methods, especially in 360° panorama generation.


Environment and Requirements

Tested Environment

  • Python: 3.9
  • CUDA: CUDA 12.1
  • GPU: Tested on NVIDIA RTX 3090 and RTX A6000

Installation Steps

  1. Clone the Repository with Submodules:

    git clone --recursive https://github.com/KAIST-Visual-AI-Group/StochSync.git && cd StochSync
  2. Create Conda Environment:

    conda create -n stochsync python=3.9 -y
    conda activate stochsync
  3. Install Core Dependencies:

    First, install PyTorch and xformers compatible with your CUDA environment. For example:

    pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 xformers --index-url https://download.pytorch.org/whl/cu121
  4. Install Python Dependencies:

    Install remaining dependencies using requirements.txt and additional modules in third_party/:

    pip install -r requirements.txt
    pip install third_party/gsplat/
    pip install third_party/nvdiffrast/
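After the steps above, a quick import check can confirm that the environment is usable before launching a run. The following is a minimal sketch and not part of the repository: the `check_module` helper and the `PYTHON` variable are hypothetical names introduced for illustration, and the module list assumes each package is imported under its own name.

```shell
#!/bin/sh
# Sketch: report which of the packages installed above can be imported.
# PYTHON and check_module are illustrative names, not part of StochSync.
PYTHON="${PYTHON:-python}"

check_module() {
  # $1 = Python import name; prints "ok" or "missing" and never aborts.
  if "$PYTHON" -c "import $1" >/dev/null 2>&1; then
    echo "ok       $1"
  else
    echo "missing  $1"
  fi
}

for mod in torch torchvision xformers gsplat nvdiffrast; do
  check_module "$mod"
done

# Print the PyTorch version and whether it can see a CUDA device.
"$PYTHON" -c "import torch; print(torch.__version__, torch.cuda.is_available())" \
  || echo "torch is not importable in this environment"
```

Run it inside the activated `stochsync` environment. A `missing` line usually points to the corresponding install step above.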

Usage and Examples

Running StochSync

We provide several example configurations in the config/ directory. Below are examples for different applications:

  • Format:

    python main.py --config "your_config.yaml" root_dir="root_dir_for_results" tag="run_name" text_prompt="your text prompt here" [other application-specific options]
  • 360° Panorama Generation:

    python main.py --config config/stochsync_panorama.yaml text_prompt="A vibrant urban alleyway filled with colorful graffiti, and stylized lettering on wall"
  • 3D Mesh Texturing:

    python main.py --config config/stochsync_mesh.yaml mesh_path="./data/mesh/face.obj" text_prompt="Kratos bust, God of War, god of power, hyper-realistic and extremely detailed."
  • Sphere & Torus Texture Generation:

    python main.py --config config/stochsync_sphere.yaml text_prompt="Paint splatter texture."
    python main.py --config config/stochsync_torus.yaml text_prompt="Paint splatter texture."
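The `root_dir` and `tag` options from the format line above can be used to keep several runs apart. The following is a hedged sketch of a small wrapper, not a script shipped with the repository: `run_panorama`, `ROOT_DIR`, `DRY_RUN`, the `./results` directory, and the tag names are all assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: launch the panorama config for several prompts, one tag per run.
# ROOT_DIR, DRY_RUN, run_panorama, and the tag names are illustrative assumptions.
set -eu

ROOT_DIR="${ROOT_DIR:-./results}"   # hypothetical output root
DRY_RUN="${DRY_RUN:-1}"             # 1 = print the command instead of running it

run_panorama() {
  # $1 = run tag, $2 = text prompt
  if [ "$DRY_RUN" = "1" ]; then
    printf 'python main.py --config config/stochsync_panorama.yaml root_dir="%s" tag="%s" text_prompt="%s"\n' \
      "$ROOT_DIR" "$1" "$2"
  else
    python main.py --config config/stochsync_panorama.yaml \
      root_dir="$ROOT_DIR" tag="$1" text_prompt="$2"
  fi
}

run_panorama pano_alley "A vibrant urban alleyway filled with colorful graffiti, and stylized lettering on wall"
run_panorama pano_paint "Paint splatter texture."
```

Set `DRY_RUN=0` to execute the commands from the repository root.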

Testing

We provide comprehensive tests to validate the functionality of our modules. To run the tests, execute:

python run_unit_test.py --extensive --devices {list of gpu indices to use}

Test results will be stored in the directory: unit_test_results/{application}.


Citation

If you find our work useful, please consider citing our paper:

@article{yeo2025stochsync,
  title={StochSync: Stochastic Diffusion Synchronization for Image Generation in Arbitrary Spaces},
  author={Yeo, Kyeongmin and Kim, Jaihoon and Sung, Minhyuk},
  journal={arXiv e-prints},
  pages={arXiv--2501},
  year={2025}
}

Acknowledgements

This repository builds upon several outstanding projects and libraries. We would like to express our gratitude to the developers and contributors of:

  • NVDiffrast
  • paint-it
  • gsplat
  • mvdream

Their work has been instrumental in the development of StochSync.
