Tomoyuki Suzuki1
Kang-Jun Liu2
Naoto Inoue1
Kota Yamaguchi1
1CyberAgent, 2Tohoku University
This repository is the official implementation of the paper "LayerD: Decomposing Raster Graphic Designs into Layers".
We have verified reproducibility under the following environment.
- Ubuntu 22.04
- Python 3.12.3
- CUDA 12.8 (optional)
- uv 0.8.17
LayerD uses uv to manage the environment and dependencies. You can install this project with the following command:
uv sync
You can decompose an image into layers using the following minimal example:
from PIL import Image
from layerd import LayerD
image = Image.open("./data/test_image_1.png")
layerd = LayerD(matting_hf_card="cyberagent/layerd-birefnet").to("cpu")
layers = layerd.decompose(image)
The output layers is a list of PIL Image objects in RGBA format.
We provide some test images in the data/ directory.
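To write the result to disk, standard PIL calls suffice; the snippet below is a minimal illustration and the output paths are arbitrary:
import os

os.makedirs("./output", exist_ok=True)
for i, layer in enumerate(layers):
    # Each layer is an RGBA PIL image, so transparency is preserved in the saved PNGs.
    layer.save(f"./output/layer_{i:02d}.png")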
Note
We recommend PNG images as input to avoid compression artifacts (especially around text edges) that may degrade the inpainting quality. You may mitigate the issue by setting a higher kernel_scale value (default: 0.015) when initializing LayerD.
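For example, a larger kernel_scale can be passed at initialization; the value below is purely illustrative:
from layerd import LayerD

# Raise kernel_scale above the default 0.015 to be more robust to artifacts around text edges.
layerd = LayerD(matting_hf_card="cyberagent/layerd-birefnet", kernel_scale=0.03).to("cpu")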
Note
Building LayerD involves downloading two pre-trained models: the top-layer matting module from the Hugging Face repository cyberagent/layerd-birefnet (~1GB) and the inpainting model from eneshahin/simple-lama-inpainting (~200MB). Please ensure you have a stable internet connection during the first run.
We provide a script to run inference on a dataset.
uv run python ./tools/infer.py \
--input </path/to/input> \
--output-dir </path/to/output> \
--device <device> # e.g., cuda or cpu
--input can be a file, a directory, or a glob pattern. You can also specify multiple input files like --input img1.png img2.png ...
--matting-weight-path can be used to specify the path to the trained weights of the top-layer matting module. If not specified, it uses the model from cyberagent/layerd-birefnet by default.
We provide code for fine-tuning the top-layer matting part of LayerD on the Crello dataset.
You can convert the Crello dataset for top-layer matting training.
uv run python ./tools/generate_crello_matting.py --output-dir </path/to/dataset> --inpainting --save-layers
Note
This script downloads the Crello dataset (<20GB) from Hugging Face. Please ensure you have a stable internet connection and sufficient disk space for the first run.
This will create a dataset with the following structure:
</path/to/dataset>
├── train
│ ├── im/ # Input images (full composite or intermediate composite images)
│ ├── gt/ # Ground-truth (top-layer alpha mattes)
│ ├── composite/ # Full composite images (not used for training, but for evaluation)
│ └── layers/ # Ground-truth layers (RGBA) (not used for training, but for evaluation)
├── validation
└── test
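As a rough sketch of how the generated splits can be consumed from Python, the loop below pairs files in im/ with mattes in gt/; it assumes the two directories share filenames, which this README does not explicitly guarantee:
from pathlib import Path
from PIL import Image

def iter_matting_pairs(split_dir):
    # Yield (input image, top-layer alpha matte) pairs from one split (train/validation/test).
    split_dir = Path(split_dir)
    for im_path in sorted((split_dir / "im").iterdir()):
        gt_path = split_dir / "gt" / im_path.name
        if gt_path.exists():
            yield Image.open(im_path).convert("RGB"), Image.open(gt_path).convert("L")

for image, matte in iter_matting_pairs("</path/to/dataset>/train"):
    pass  # e.g., inspect or visualize a few samples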
You can fine-tune the top-layer matting module on the generated dataset.
We reorganized the training code for this study based on the original BiRefNet, which is the backbone of the top-layer matting module. The training configuration is managed with Hydra, as training involves many hyperparameters.
Below is an example command to start training with a specified configuration file.
uv run python ./tools/train.py \
config_path=./src/layerd/configs/train.yaml \
data_root=</path/to/dataset> \
out_dir=</path/to/output> \
device=<device> # e.g., cuda or cpu
data_root is the dataset root path (like </path/to/dataset> above) and out_dir is the output directory path; both are mandatory fields in the configuration file that must be specified at runtime.
You can also override other hyperparameters in the configuration file by specifying them as command-line arguments.
Training supports distributed mode using both torch.distributed and Hugging Face Accelerate.
To use torch.distributed, launch the training script with torchrun as follows:
CUDA_VISIBLE_DEVICES=0,1 uv run torchrun --standalone --nproc_per_node 2 \
./tools/train.py \
config_path=./src/layerd/configs/train.yaml \
data_root=</path/to/dataset> \
out_dir=</path/to/output> \
dist=true
For Hugging Face Accelerate, set use_accelerate=true in the command line arguments. You can also set the mixed_precision parameter (options: no, fp16, bf16).
CUDA_VISIBLE_DEVICES=0,1 uv run torchrun --standalone --nproc_per_node 2 \
./tools/train.py \
config_path=./src/layerd/configs/train.yaml \
data_root=</path/to/dataset> \
out_dir=</path/to/output> \
use_accelerate=true \
mixed_precision=bf16
Note
We observe that training takes about 40 hours on 4x A100 40GB GPUs with use_accelerate=true, mixed_precision=bf16, and the default configuration.
We thank the authors of BiRefNet for releasing their code, which we used as a basis for our matting backbone.
You can calculate the proposed evaluation metrics using the following minimal example:
from layerd.evaluation import LayersEditDist
metric = LayersEditDist()
# Both layers_pred and layers_gt are lists of PIL.Image (RGBA)
result = metric(layers_pred, layers_gt)
We also provide a script to run dataset-level evaluation.
uv run python ./tools/evaluate.py \
--pred-dir </path/to/predictions> \
--gt-dir </path/to/groundtruth> \
--output-dir </path/to/output> \
--max-edits 5
--pred-dir and --gt-dir need to follow the structure below. The dataset prepared by the script in Dataset preparation has a layers/ directory (not gt/) that follows this structure and is ready for evaluation (e.g., --gt-dir </path/to/crello-matting>/layers).
</path/to/predictions or groundtruth>
├── {sample_id}
│ ├── 0000.png
│ ├── 0001.png
│ └── ...
└── {sample_id}
├── 0000.png
├── 0001.png
└── ...
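If you would rather drive the metric from Python than through the script, a minimal sketch like the following should work, assuming each sample directory holds the numbered RGBA layer PNGs shown above:
from pathlib import Path
from PIL import Image
from layerd.evaluation import LayersEditDist

def load_layers(sample_dir):
    # Load 0000.png, 0001.png, ... in order as RGBA layers.
    return [Image.open(p).convert("RGBA") for p in sorted(Path(sample_dir).glob("*.png"))]

metric = LayersEditDist()
pred_root = Path("</path/to/predictions>")
gt_root = Path("</path/to/groundtruth>")
results = {}
for gt_dir in sorted(gt_root.iterdir()):
    pred_dir = pred_root / gt_dir.name
    if pred_dir.is_dir():
        results[gt_dir.name] = metric(load_layers(pred_dir), load_layers(gt_dir))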
This project is licensed under the Apache-2.0 License. See the LICENSE file for details.
This project makes use of several third-party libraries, each of which has its own license.
If you find this project useful in your work, please cite our paper.
@inproceedings{suzuki2025layerd,
  title={LayerD: Decomposing Raster Graphic Designs into Layers},
  author={Suzuki, Tomoyuki and Liu, Kang-Jun and Inoue, Naoto and Yamaguchi, Kota},
  booktitle={ICCV},
  year={2025}
}