This repository contains the official implementation of MatteFormer, presented in our paper: MatteFormer: Transformer-Based Image Matting via Prior-Tokens (CVPR 2022)
In this paper, we propose a transformer-based image matting model called MatteFormer, which takes full advantage of trimap information in the transformer block. Our method first introduces a prior-token, a global representation of each trimap region (e.g. foreground, background and unknown). These prior-tokens are used as global priors and participate in the self-attention mechanism of each block. Each stage of the encoder is composed of PAST (Prior-Attentive Swin Transformer) blocks, which are based on the Swin Transformer block but differ in two aspects: 1) a PA-WSA (Prior-Attentive Window Self-Attention) layer, which performs self-attention not only over spatial-tokens but also over prior-tokens; 2) a prior-memory, which accumulates prior-tokens from the previous blocks and passes them to the next block. We evaluate MatteFormer on the commonly used image matting datasets Composition-1k and Distinctions-646. Experimental results show that our proposed method achieves state-of-the-art performance by a large margin.
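The prior-token idea above can be sketched in a few lines. This is not the authors' code: it shows, under simplified assumptions (single-head attention, no relative position bias, one window), how each trimap region is average-pooled into one global token and how window self-attention then attends to spatial-tokens plus these prior-tokens. All names and shapes are illustrative only.

```python
import torch

def make_prior_tokens(feat, trimap):
    # feat: (B, N, C) spatial tokens; trimap: (B, N) with values {0, 1, 2}
    tokens = []
    for region in (0, 1, 2):  # bg / unknown / fg (an assumed encoding)
        mask = (trimap == region).float().unsqueeze(-1)   # (B, N, 1)
        denom = mask.sum(dim=1).clamp(min=1.0)            # avoid div-by-zero
        tokens.append((feat * mask).sum(dim=1) / denom)   # region-wise average pool
    return torch.stack(tokens, dim=1)                      # (B, 3, C)

def prior_attentive_self_attention(x, prior_tokens):
    # x: (B, N, C) spatial tokens inside one window; queries stay spatial,
    # while keys/values are the spatial tokens concatenated with prior-tokens
    kv = torch.cat([x, prior_tokens], dim=1)               # (B, N+3, C)
    attn = torch.softmax(x @ kv.transpose(1, 2) / x.shape[-1] ** 0.5, dim=-1)
    return attn @ kv                                       # (B, N, C)

B, N, C = 2, 49, 32                    # e.g. one 7x7 window
x = torch.randn(B, N, C)
trimap = torch.randint(0, 3, (B, N))
out = prior_attentive_self_attention(x, make_prior_tokens(x, trimap))
print(out.shape)                       # torch.Size([2, 49, 32])
```

Note how the output keeps the spatial-token shape: the prior-tokens only enlarge the key/value set, so the layer is a drop-in replacement for plain window self-attention.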
The code is tested in the following environment:
- python 3.8
- pytorch 1.10.1
- CUDA 10.2 & CuDNN 8
Models | SAD | MSE (×10^-3) | Grad | Conn | Link |
---|---|---|---|---|---|
MatteFormer | 23.80 | 4.03 | 8.68 | 18.90 | model |
1] Get the DIM dataset from Deep Image Matting.
2] For DIM dataset preparation, please refer to GCA-Matting.
- For training, merge the 'Adobe-licensed images' and 'Other' folders to use all 431 foregrounds and alphas.
- For testing, use 'Composition_code.py' and 'copy_testing_alpha.sh' from GCA-Matting.
3] For background images, download the PASCAL and COCO datasets.
* If you want to download the prepared test set directly: download link
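The composition step in the dataset preparation above follows the standard DIM recipe: each foreground is alpha-blended over a background image via I = αF + (1 − α)B. Below is a minimal sketch of that blend, not the official `Composition_code.py`; the inputs are synthetic placeholders.

```python
import numpy as np

def composite(fg, bg, alpha):
    # fg, bg: (H, W, 3) uint8 images; alpha: (H, W) matte in [0, 255]
    a = alpha.astype(np.float32)[..., None] / 255.0
    out = a * fg.astype(np.float32) + (1.0 - a) * bg.astype(np.float32)
    return out.clip(0, 255).astype(np.uint8)

fg = np.full((4, 4, 3), 200, dtype=np.uint8)    # uniform gray foreground
bg = np.zeros((4, 4, 3), dtype=np.uint8)        # black background
alpha = np.full((4, 4), 128, dtype=np.uint8)    # ~50% opacity everywhere
print(composite(fg, bg, alpha)[0, 0, 0])        # 100 (half-blended)
```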
pip3 install -r requirements.txt
1] Run the inference code (the predicted alpha mattes will be saved to ./predDIM/pred_alpha by default)
python3 infer.py
2] Evaluate the results with the official MATLAB evaluation code ./DIM_evaluation_code/evaluate.m (provided by Deep Image Matting)
3] You can also quickly check the evaluation results with the (unofficial) Python code
python3 evaluate.py
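For reference, SAD and MSE for matting are conventionally computed only over the unknown trimap region, with SAD reported in thousands of pixels. The sketch below illustrates that convention on synthetic mattes; it is not the repository's `evaluate.py`, and the official numbers come from the MATLAB script above.

```python
import numpy as np

def sad_mse(pred, gt, trimap):
    # pred, gt: (H, W) alpha mattes in [0, 1]; trimap: (H, W), 128 = unknown
    unknown = (trimap == 128)
    diff = (pred - gt)[unknown]
    sad = np.abs(diff).sum() / 1000.0               # SAD reported in thousands
    mse = (diff ** 2).sum() / max(unknown.sum(), 1)  # mean over unknown pixels
    return sad, mse

pred = np.full((10, 10), 0.6)      # constant prediction, off by 0.1
gt = np.full((10, 10), 0.5)
trimap = np.full((10, 10), 128)    # everything is "unknown"
sad, mse = sad_mse(pred, gt, trimap)
print(round(sad, 3), round(mse, 4))   # 0.01 0.01
```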
1] Get the ImageNet-pretrained Swin-Tiny model ('swin_tiny_patch4_window7_224.pth') from Swin Transformer.
2] Modify "config/MatteFormer_Composition1k.toml"
3] Run main.py
CUDA_VISIBLE_DEVICES=0,1 python3 -m torch.distributed.launch --nproc_per_node=2 main.py
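When loading the pretrained Swin-Tiny checkpoint into the encoder, `strict=False` is the usual way to skip keys that do not match (e.g. the ImageNet classification head, or new parameters such as prior-tokens that the pretrained model never had). Below is a hedged sketch of that pattern; `TinyEncoder` is a stand-in for the real encoder, and the `torch.load` line is commented out because it assumes the checkpoint file is present.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    # Placeholder encoder: only the 4x4 patch embedding resembles Swin-Tiny.
    def __init__(self):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, 96, kernel_size=4, stride=4)

    def forward(self, x):
        return self.patch_embed(x)

model = TinyEncoder()
# ckpt = torch.load('swin_tiny_patch4_window7_224.pth', map_location='cpu')
# strict=False tolerates missing / unexpected keys and returns both lists,
# which is worth logging to confirm what was actually loaded:
# missing, unexpected = model.load_state_dict(ckpt['model'], strict=False)
out = model(torch.randn(1, 3, 224, 224))
print(out.shape)                       # torch.Size([1, 96, 56, 56])
```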
If you find this work or code useful for your research, please use the following BibTeX entry:
@article{park2022matteformer,
title={MatteFormer: Transformer-Based Image Matting via Prior-Tokens},
author={Park, GyuTae and Son, SungJoon and Yoo, JaeYoung and Kim, SeHo and Kwak, Nojun},
journal={arXiv preprint arXiv:2203.15662},
year={2022}
}
- Our code mainly originates from MG-Matting
- We also built our code with reference to GCA-Matting and Swin Transformer for Semantic Segmentation
MatteFormer is licensed under Apache-2.0, except utils/logger.py, which is adapted from https://github.com/JiahuiYu/generative_inpainting under CC BY-NC 4.0. See LICENSE for the full license text.
MatteFormer
Copyright 2022-present NAVER WEBTOON
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.