Multimodal Industrial Anomaly Detection via Hybrid Fusion

pipeline

  • The pipeline of Multi-3D-Memory (M3DM). M3DM contains three important parts: (1) Point Feature Alignment (PFA) converts point group features to plane features with interpolation and projection operations, where $\text{FPS}$ is farthest point sampling and $\mathcal{F_{pt}}$ is a pretrained Point Transformer; (2) Unsupervised Feature Fusion (UFF) fuses point features and image features with a patch-wise contrastive loss $\mathcal{L_{con}}$, where $\mathcal{F_{rgb}}$ is a Vision Transformer, $\chi_{rgb},\chi_{pt}$ are MLP layers and $\sigma_r, \sigma_p$ are single fully connected layers; (3) Decision Layer Fusion (DLF) combines multimodal information from multiple memory banks and makes the final decision with two learnable modules $\mathcal{D_a}, \mathcal{D_s}$ for anomaly detection and segmentation, where $\mathcal{M_{rgb}}$, $\mathcal{M_{fs}}$, $\mathcal{M_{pt}}$ are memory banks, $\phi, \psi$ are score functions for single-memory-bank detection and segmentation, and $\mathcal{P}$ is the memory bank building algorithm.
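The farthest point sampling (FPS) step mentioned in part (1) can be illustrated with a small, dependency-free sketch. This is the generic greedy FPS algorithm, not the repository's CUDA implementation; the function name `farthest_point_sampling` and the toy points are assumptions made for illustration only.

```python
import math


def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedy FPS: return indices of `num_samples` points that are far apart.

    Hypothetical helper for illustration; the repo relies on pointnet2_ops.
    """
    if not 0 < num_samples <= len(points):
        raise ValueError("num_samples must be between 1 and len(points)")
    selected = [start_index]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start_index]) for p in points]
    while len(selected) < num_samples:
        # Pick the point that is farthest from everything selected so far.
        next_index = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(next_index)
        for i, p in enumerate(points):
            d = math.dist(p, points[next_index])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy point cloud (made-up coordinates).
points = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.1, 0.0, 0.0),
    (0.0, 5.0, 0.0),
    (0.9, 0.1, 0.0),
]
centers = farthest_point_sampling(points, 3)
print(centers)  # [0, 3, 1]
```

In a pipeline like the one described, the sampled indices would serve as group centers around which neighboring points are gathered before being fed to the point backbone.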

Paper

Setup

We implement this repo with the following environment:

  • Python 3.8
  • PyTorch 1.9.0
  • CUDA 11.3
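As a quick sanity check before installing the remaining packages, a short script along these lines can report the local Python and PyTorch versions so they can be compared against the list above. It is only a sketch: the helper names `installed_version` and `environment_report` are hypothetical, and the CUDA version is not checked here because that would require importing `torch` itself.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Collect the versions relevant to the environment listed in this README."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": installed_version("torch"),  # None means PyTorch is missing
    }


print(environment_report())
```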

Install the other packages via:

pip install -r requirement.txt
# install knn_cuda
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
# install pointnet2_ops_lib
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"

Data Download and Preprocess

Dataset

After downloading, put the dataset in the datasets folder.
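To confirm the dataset landed in the expected place, a sketch like the following counts the files under each top-level folder. It assumes the data sits at `datasets/mvtec3d` (the path used by the preprocessing command) with one subfolder per object category; the helper name `summarize_dataset` is hypothetical.

```python
from pathlib import Path


def summarize_dataset(root):
    """Map each category folder under `root` to its recursive file count."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} does not exist; place the dataset there first")
    summary = {}
    for category in sorted(p for p in root.iterdir() if p.is_dir()):
        summary[category.name] = sum(1 for f in category.rglob("*") if f.is_file())
    return summary


if __name__ == "__main__":
    dataset_root = Path("datasets/mvtec3d")  # assumed location
    if dataset_root.is_dir():
        for name, count in summarize_dataset(dataset_root).items():
            print(f"{name}: {count} files")
    else:
        print(f"{dataset_root} not found")
```

An empty or missing category in the output is a hint that the archive was extracted to the wrong level.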

Data preprocessing

To run the preprocessing:

python utils/preprocessing.py datasets/mvtec3d/

It may take a few hours to run the preprocessing.

Checkpoints

The following table lists the pretrained models used in M3DM:

| Backbone | Pretrain Method |
| --- | --- |
| Point Transformer | Point-MAE |
| Point Transformer | Point-BERT |
| ViT-b/8 | DINO |
| ViT-b/8 | Supervised ImageNet 1K |
| ViT-b/8 | Supervised ImageNet 21K |
| ViT-s/8 | DINO |
| UFF | UFF Module |

Put the checkpoint files in the checkpoints folder.

Train and Test

Train and test the double-lib version and save the features for UFF training:

mkdir -p datasets/patch_lib
python3 main.py \
--method_name DINO+Point_MAE \
--memory_bank multiple \
--rgb_backbone_name vit_base_patch8_224_dino \
--xyz_backbone_name Point_MAE \
--save_feature

Train the UFF:

OMP_NUM_THREADS=1 python3 -m torch.distributed.launch --nproc_per_node=1 fusion_pretrain.py    \
--accum_iter 16 \
--lr 0.003 \
--batch_size 16 \
--data_path datasets/patch_lib \
--output_dir checkpoints
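When changing the number of GPUs or the batch size, it can help to keep track of the effective batch size implied by the flags above. The arithmetic below assumes that `--accum_iter` is the number of gradient accumulation steps and that `--batch_size` is per process, as is common in MAE-style training scripts; this has not been checked against `fusion_pretrain.py`, and the function name is hypothetical.

```python
def effective_batch_size(batch_size, accum_iter, num_processes):
    """Samples contributing to one optimizer step under the stated assumptions."""
    return batch_size * accum_iter * num_processes


# Values taken from the command above: 16 per process, 16 accumulation steps, 1 process.
print(effective_batch_size(batch_size=16, accum_iter=16, num_processes=1))  # 256
```

Under these assumptions, doubling `--nproc_per_node` while halving `--accum_iter` would leave the effective batch size unchanged.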

Train and test the full setting with the following command:

python3 main.py \
--method_name DINO+Point_MAE+Fusion \
--use_uff \
--memory_bank multiple \
--rgb_backbone_name vit_base_patch8_224_dino \
--xyz_backbone_name Point_MAE \
--fusion_module_path checkpoints/{FUSION_CHECKPOINT}.pth

Note: if you set --method_name DINO or --method_name Point_MAE, set --memory_bank single at the same time.
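The rule in the note can be encoded in a small launcher helper so the two flags never get out of sync. This is a sketch that only uses flags shown in this README; the helper `build_main_command` and the set `SINGLE_MODALITY_METHODS` are hypothetical, and whether single-modality runs need both backbone flags is not verified here.

```python
import shlex

# Method names that the note above pairs with --memory_bank single.
SINGLE_MODALITY_METHODS = {"DINO", "Point_MAE"}


def build_main_command(
    method_name,
    rgb_backbone="vit_base_patch8_224_dino",
    xyz_backbone="Point_MAE",
    fusion_module_path=None,
):
    """Return the argument list for main.py with a consistent --memory_bank."""
    memory_bank = "single" if method_name in SINGLE_MODALITY_METHODS else "multiple"
    args = [
        "python3", "main.py",
        "--method_name", method_name,
        "--memory_bank", memory_bank,
        "--rgb_backbone_name", rgb_backbone,
        "--xyz_backbone_name", xyz_backbone,
    ]
    if fusion_module_path is not None:
        # The full setting additionally enables the trained UFF module.
        args += ["--use_uff", "--fusion_module_path", fusion_module_path]
    return args


print(shlex.join(build_main_command("DINO")))
```

The returned list can be passed directly to `subprocess.run` from the repository root.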

If you find this repository useful for your research, please cite the following:

@inproceedings{wang2023multimodal,
  title={Multimodal Industrial Anomaly Detection via Hybrid Fusion},
  author={Wang, Yue and Peng, Jinlong and Zhang, Jiangning and Yi, Ran and Wang, Yabiao and Wang, Chengjie},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8032--8041},
  year={2023}
}

Thanks

Our repo is built on 3D-ADS and MoCo-v3; thanks for their extraordinary work!