Official PyTorch implementation of "The Devil is in the Details: Boosting Guided Depth Super-Resolution via Rethinking Cross-Modal Alignment and Aggregation"

Boosting Guided Depth Super-Resolution Through Large Depth Estimation Model and Alignment-then-Fusion Strategy

Yuan-lin Zhang#, Xin-Ni Jiang#, Chun-Le Guo*, Xiong-Xin Tang*, Guo-Qing Wang, Wei Li, Xun Liu, Chong-Yi Li

[Paper] [Project Page]

[Figure: model overview]

Guided Depth Super-Resolution (GDSR) presents two primary challenges: the resolution gap between Low-Resolution (LR) depth maps and High-Resolution (HR) RGB images, and the modality gap between depth and RGB data. In this study, we leverage the powerful zero-shot capabilities of large pre-trained monocular depth estimation models to address these issues. Specifically, we utilize the output of monocular depth estimation as pseudo-depth to mitigate both gaps. The pseudo-depth map is aligned with the resolution of the RGB image, offering more detailed boundary information than the LR depth map, particularly at larger scales. Furthermore, pseudo-depth provides valuable relative positional information about objects, serving as a critical scene prior that enhances edge alignment and reduces texture over-transfer. However, effectively bridging the cross-modal differences between the guidance inputs (RGB and pseudo-depth) and LR depth remains a significant challenge. To tackle this, we analyze the modality gap from three key perspectives: distribution misalignment, geometrical misalignment, and texture inconsistency. Based on these insights, we propose an alignment-then-fusion strategy, introducing a novel and efficient Dynamic Dual-Aligned and Aggregation Network (D2A2). By leveraging large pre-trained monocular depth estimation models, our approach achieves state-of-the-art performance on multiple benchmark datasets, excelling particularly in the challenging ×16 GDSR task.
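As a purely conceptual illustration of the alignment-then-fusion idea (this sketch is not the D2A2 implementation; the offset-based warping and module names are assumptions made for exposition), guidance features could be aligned to the depth features before aggregation roughly as follows:

# Toy sketch of "align then fuse": predict per-pixel offsets from the concatenated
# features, resample the guidance features accordingly, then aggregate.
# Illustrative only; it does not reproduce the D2A2 modules.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AlignThenFuse(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.offset = nn.Conv2d(2 * channels, 2, kernel_size=3, padding=1)  # (dx, dy) per pixel
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, depth_feat, guide_feat):
        b, _, h, w = depth_feat.shape
        flow = self.offset(torch.cat([depth_feat, guide_feat], dim=1))  # B x 2 x H x W

        # Build a normalized sampling grid shifted by the predicted offsets.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=depth_feat.device),
            torch.linspace(-1, 1, w, device=depth_feat.device),
            indexing="ij",
        )
        base = torch.stack((xs, ys), dim=-1).expand(b, -1, -1, -1)  # B x H x W x 2
        grid = base + flow.permute(0, 2, 3, 1)
        aligned = F.grid_sample(guide_feat, grid, align_corners=True)

        return self.fuse(torch.cat([depth_feat, aligned], dim=1))

The snippet only conveys the ordering of the two stages (align, then aggregate); the actual network relies on its own modules (note that the repository builds Deformable Convolution V2 during setup).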

Setup

Dependencies

The conda environment with all required dependencies can be created by running:

conda env create -f environment.yml
conda activate GDSR-D2A2
cd models/Deformable_Convolution_V2
sh make.sh
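
A quick, optional sanity check after building (the extension module name below is a guess; check models/Deformable_Convolution_V2 for the actual package name):

# Verify that PyTorch sees a GPU and that the compiled DCNv2 extension imports.
import torch

print("PyTorch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())

try:
    # Hypothetical import path for the compiled extension; adjust it to the
    # package actually produced by make.sh in models/Deformable_Convolution_V2.
    from dcn_v2 import DCN  # noqa: F401
    print("Deformable Convolution V2 extension imported successfully.")
except ImportError as err:
    print("DCNv2 extension not found; re-run make.sh:", err)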

Datasets

The NYUv2 dataset can be downloaded here. Your folder structure should look like this:

NYUv2
├── Depth
│   ├── 0.npy
│   ├── 1.npy
│   ├── 2.npy
│   ├── ...
│   └── 1448.npy
├── RGB
│   ├── 0.jpg
│   ├── 1.jpg
│   ├── 2.jpg
│   ├── ...
│   └── 1448.jpg
└── MDE_relative
    ├── 0.png
    ├── 1.png
    ├── 2.png
    ├── ...
    └── 1448.png
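
For reference, a minimal PyTorch Dataset matching this layout might look like the sketch below (the class and the bicubic downsampling used to produce the LR input are illustrative assumptions, not the loader shipped with this repository):

# Sketch: pair LR depth, HR RGB, and pseudo-depth following the folder layout above.
import os
import numpy as np
import torch
import torch.nn.functional as F
from PIL import Image
from torch.utils.data import Dataset


class NYUv2Triplets(Dataset):
    def __init__(self, root, scale=16):
        self.root = root
        self.scale = scale
        self.ids = sorted(
            int(f.split(".")[0])
            for f in os.listdir(os.path.join(root, "Depth"))
            if f.endswith(".npy")
        )

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, i):
        idx = self.ids[i]
        depth = np.load(os.path.join(self.root, "Depth", f"{idx}.npy")).astype(np.float32)
        rgb = np.asarray(Image.open(os.path.join(self.root, "RGB", f"{idx}.jpg")), dtype=np.float32) / 255.0
        pseudo = np.asarray(Image.open(os.path.join(self.root, "MDE_relative", f"{idx}.png")), dtype=np.float32)

        depth = torch.from_numpy(depth)[None]         # 1 x H x W
        rgb = torch.from_numpy(rgb).permute(2, 0, 1)  # 3 x H x W
        pseudo = torch.from_numpy(pseudo)[None]       # 1 x H x W

        # LR input obtained by bicubic downsampling of the HR depth (a common GDSR protocol;
        # the actual training pipeline may differ).
        lr = F.interpolate(
            depth[None], scale_factor=1.0 / self.scale, mode="bicubic", align_corners=False
        )[0]
        return lr, rgb, pseudo, depth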

The Lu, Middlebury, and RGBDD datasets are used only for testing and can be downloaded here.

The pseudo labels are obtained from Depth-Anything-V2 using the Depth-Anything-V2-Large checkpoint. Alternatively, you can directly download the monocular depth estimation results from [Google Drive], [Baidu Cloud].
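
If you prefer to generate the pseudo-depth maps yourself, a rough sketch following the Depth-Anything-V2 repository's interface could look like this (the checkpoint path, output folder, and 16-bit PNG normalization are assumptions chosen to match the MDE_relative layout above):

# Sketch: export relative-depth pseudo labels with Depth-Anything-V2 (Large).
import os
import cv2
import numpy as np
import torch
from depth_anything_v2.dpt import DepthAnythingV2

device = "cuda" if torch.cuda.is_available() else "cpu"
# Config follows the 'vitl' setting documented in the Depth-Anything-V2 repo.
model = DepthAnythingV2(encoder="vitl", features=256, out_channels=[256, 512, 1024, 1024])
model.load_state_dict(torch.load("checkpoints/depth_anything_v2_vitl.pth", map_location="cpu"))
model = model.to(device).eval()

rgb_dir, out_dir = "NYUv2/RGB", "NYUv2/MDE_relative"
os.makedirs(out_dir, exist_ok=True)
for name in sorted(os.listdir(rgb_dir)):
    if not name.endswith(".jpg"):
        continue
    img = cv2.imread(os.path.join(rgb_dir, name))
    depth = model.infer_image(img)  # H x W float32 relative depth
    # Normalize to [0, 1] and store as 16-bit PNG (an assumed output format).
    depth = (depth - depth.min()) / (depth.max() - depth.min() + 1e-8)
    out = (depth * 65535).astype(np.uint16)
    cv2.imwrite(os.path.join(out_dir, os.path.splitext(name)[0] + ".png"), out)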

Pretrained Model

Download the pretrained models from [Google Drive] or [Baidu Cloud] and put them in the 'pretrained' folder.

Training

Please modify '--scale' and '--dataset_dir' in 'option.py'.

python train_d2a2.py  

Testing

  1. Modify '--scale' and '--dataset_dir' in 'option.py'.
  2. To resume from a checkpoint file, use the '--net_path' argument in 'option.py' to specify the checkpoint.
  3. Try D2A2 on your images!
    python test_d2a2.py
  4. Check your results in result/testresult/D2A2-dataset-******/! (A quick RMSE sanity check is sketched below.)
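
A minimal RMSE check on a saved result (the file names are placeholders; adapt them to the outputs actually written under result/testresult/):

# Sketch: RMSE between a super-resolved depth map and its ground truth.
import numpy as np

pred = np.load("result/testresult/pred_0.npy").astype(np.float64)  # hypothetical file name
gt = np.load("NYUv2/Depth/0.npy").astype(np.float64)

rmse = np.sqrt(np.mean((pred - gt) ** 2))
print(f"RMSE: {rmse:.4f}")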

Test results

Results on all test datasets are also available: [Google Drive] [Baidu Cloud]

Acknowledgements

We thank the following repository for sharing its code: Depth-Anything-V2.

Citation

