Check out our paper Latent Diffusion Approaches for Conditional Generation of Aerial Imagery: A Study (2025), published at the MLBriefs 2024 workshop of the Image Processing On Line (IPOL) journal.
This repository provides code to explore a conditional latent diffusion model (LDM) for generating aerial images from an input map.
We used the public pix2pix-maps dataset, available here.
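For reference, pix2pix-style map datasets commonly store each training sample as a single image with the aerial photo and the map concatenated side by side. Below is a minimal sketch of splitting such a pair with PIL; the left/right layout and the file name are assumptions, so check the files you actually downloaded.

```python
from PIL import Image

def split_pair(path):
    """Split a pix2pix-style side-by-side image into (aerial, map) halves.

    Assumes the aerial photo is the left half and the map the right half,
    as in the original pix2pix maps dataset; swap the crops if yours differ.
    """
    pair = Image.open(path).convert("RGB")
    w, h = pair.size
    aerial = pair.crop((0, 0, w // 2, h))
    map_img = pair.crop((w // 2, 0, w, h))
    return aerial, map_img

aerial, map_img = split_pair("train/1.jpg")  # hypothetical file name
```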
Based on zyinghua/uncond-image-generation-ldm and huggingface/diffusers.
If you find this code or work helpful, please cite:
@article{mari2025latent,
  title={Latent Diffusion Approaches for Conditional Generation of Aerial Imagery: A Study},
  author={Mar{\'\i}, Roger and Redondo, Rafael},
  journal={Image Processing On Line},
  year={2025}
}
Figure 1: Left to right: real aerial image, conditional map input to the LDM, and two different synthetic output samples.
Use the script setup_ldm-mlbriefs24_venv.sh
to install the conda environment needed to train the LDM on your own dataset.
bash setup_ldm-mlbriefs24_venv.sh
We release the pre-trained weights of our model.
The script run_demo.py
can be used to run the pre-trained model. Example:
python3 run_demo.py --img_path example_data/206_map.jpg --time_steps 500
The parameter img_path
points to the input condition (the map image).
The parameter time_steps
sets the number of steps used in the reverse denoising process for image generation (a positive integer).
Fewer time_steps
result in lower-quality image synthesis but higher inference speed.
As a rough guide, set time_steps
< 50 for low quality, < 500 for medium quality, and 1000 for maximum quality.
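For intuition, time_steps is typically the value handed to the noise scheduler before the denoising loop: the 1000-step training schedule is subsampled into fewer inference steps, trading fidelity for speed. The sketch below illustrates this with the diffusers API; the DDPMScheduler, the latent shape, and the tiny stand-in UNet are assumptions for illustration, not the configuration used by run_demo.py (which also conditions on the encoded map).

```python
import torch
from diffusers import DDPMScheduler, UNet2DModel

# Tiny stand-in UNet; the real demo loads the released pre-trained weights.
unet = UNet2DModel(sample_size=32, in_channels=4, out_channels=4,
                   block_out_channels=(32, 64),
                   down_block_types=("DownBlock2D", "DownBlock2D"),
                   up_block_types=("UpBlock2D", "UpBlock2D"))
scheduler = DDPMScheduler(num_train_timesteps=1000)

# time_steps controls how finely the reverse process is discretized:
# fewer steps run faster but approximate the trajectory more coarsely.
scheduler.set_timesteps(500)

latents = torch.randn(1, 4, 32, 32)  # start from pure Gaussian noise
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = unet(latents, t).sample
    latents = scheduler.step(noise_pred, t, latents).prev_sample
# The final latents would then be decoded to an image by the VAE decoder.
```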
You can also check our online demo.
The train_maps.py
script can be used to train the diffusion model from scratch.
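For orientation, a typical conditional-LDM training step encodes the target aerial image into latents with a frozen VAE, adds noise at a random timestep, and trains the UNet to predict that noise given the encoded map. The sketch below follows that standard recipe using diffusers; the channel-concatenation conditioning, the specific VAE checkpoint, and the model sizes are assumptions for illustration, so consult train_maps.py for the exact scheme.

```python
import torch
import torch.nn.functional as F
from diffusers import AutoencoderKL, DDPMScheduler, UNet2DModel

# Frozen latent encoder; this particular checkpoint is just an example.
vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").requires_grad_(False)
scheduler = DDPMScheduler(num_train_timesteps=1000)
# 8 input channels: 4 noisy aerial latents + 4 map latents (assumed conditioning).
unet = UNet2DModel(sample_size=32, in_channels=8, out_channels=4,
                   block_out_channels=(64, 128),
                   down_block_types=("DownBlock2D", "DownBlock2D"),
                   up_block_types=("UpBlock2D", "UpBlock2D"))
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-4)

def train_step(aerial, map_img):
    """One noise-prediction step on a batch of (aerial, map) tensors in [-1, 1]."""
    with torch.no_grad():
        latents = vae.encode(aerial).latent_dist.sample() * vae.config.scaling_factor
        cond = vae.encode(map_img).latent_dist.sample() * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy = scheduler.add_noise(latents, noise, t)
    pred = unet(torch.cat([noisy, cond], dim=1), t).sample
    loss = F.mse_loss(pred, noise)  # standard epsilon-prediction objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```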
The script MLBriefs24_run_exp.sh
runs all the experiments discussed in the paper.