Warning: reference and use with caution. This is unfinished work with many bugs; careless use carries a real risk of things blowing up.
- `comparisons`: comparison experiments
- `doc`: documentation and references
- `evision_model`: the model proposed in this work (new version)
- `evision_net`: the model proposed in this work (old version)
- `utils`: miscellaneous utilities
- Windows 10 or Ubuntu 18.04
- Python 3.6 or 3.7, Anaconda3
- CUDA 10.2
- TensorFlow 1.14.0 (with cuDNN 7.6.5, for dfv)
- PyTorch 1.5.0
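A quick way to verify the environment is to print the installed versions and compare them with the list above. The script below is a hypothetical helper, not part of this repo:

```python
# check_env.py -- hypothetical helper, not part of this repo.
# Prints the installed versions so they can be compared with the list above.
import sys

print("Python:", sys.version.split()[0])              # expect 3.6 or 3.7

try:
    import torch
    print("PyTorch:", torch.__version__)              # expect 1.5.0
    print("CUDA (torch build):", torch.version.cuda)  # expect 10.2
    print("CUDA available:", torch.cuda.is_available())
except ImportError:
    print("PyTorch not installed")

try:
    import tensorflow as tf
    print("TensorFlow:", tf.__version__)              # expect 1.14.0 (for dfv)
except ImportError:
    print("TensorFlow not installed")
```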
3.1. depth_from_video_in_the_wild
3.2. SfMLearner
- Unsupervised Learning, explainability-mask
- Paper, third-party PyTorch implementation, original TensorFlow implementation
3.3. struct2depth
[1]. Pyramid Stereo Matching Network (PSMNet).
[2]. TILDE: A Temporally Invariant Learned DEtector (TILDE).
[3]. Deep Ordinal Regression Network for Monocular Depth Estimation.
[4]. Occlusion-Aware Unsupervised Learning of Monocular Depth, Optical Flow and Camera Pose with Geometric Constraints. Future Internet 10.10 (2018): 92.
[5]. Liu, Qiang, et al. Using Unsupervised Deep Learning Technique for Monocular Visual Odometry.
[6]. DeepCalib: A Deep Learning Approach for Automatic Intrinsic Calibration of Wide Field-of-View Cameras. [Keywords: camera calibration, deep learning]
[7]. Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras.
[8]. A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation.
[1]. Middlebury dataset.
[2]. KITTI dataset.
[3]. Vision SceneFlow datasets.
[4]. Notes on PSMNet.
[5]. 3D reconstruction dataset from the Institute of Automation, Chinese Academy of Sciences.
[6]. Notes on SfMLearner (Depth and Ego-Motion).
[7]. OpenMVS.
[8]. OpenMVG.
[9]. CVonline: overview of image datasets.
[10]. VisualData dataset search.
[11]. 360D-zenodo Dataset.
[12]. RGB-D Panorama Dataset.
[13]. Notes on Deep Depth Completion of a Single RGB-D Image.
[14]. Notes on Unsupervised Learning of Depth and Ego-Motion.
[15]. Visual Odometry Part II: Matching, Robustness, Optimization, and Applications.
[16]. How to obtain high-quality 3D models from photos.
[17]. tqdm.postfix.
[18]. KITTI_odometry_evaluation_tool.
- The seq 09 and seq 10 columns are ego-motion metrics (smaller is better).
- The remaining columns are monocular depth metrics (for Abs Rel, Sq Rel, rms, and log_rms, smaller is better; for A1, A2, and A3, bigger is better).
- All results are obtained using only the KITTI dataset.
ATE in seq 09 | ATE in seq 10 | Abs Rel | Sq Rel | rms | log_rms | A1 | A2 | A3 | Notes |
---|---|---|---|---|---|---|---|---|---|
0.0160 ± 0.0090 | 0.0130 ± 0.0090 | 0.183 | 1.595 | 6.700 | 0.270 | 0.734 | 0.902 | 0.959 | SfMLearner GitHub (note 1) |
0.0210 ± 0.0170 | 0.0200 ± 0.0150 | 0.208 | 1.768 | 6.856 | 0.283 | 0.678 | 0.885 | 0.957 | SfMLearner paper (note 2) |
0.0179 ± 0.0110 | 0.0141 ± 0.0115 | 0.181 | 1.341 | 6.236 | 0.262 | 0.733 | 0.901 | 0.964 | SfMLearner third-party GitHub (note 3) |
0.0107 ± 0.0062 | 0.0096 ± 0.0072 | 0.2260 | 2.310 | 6.827 | 0.301 | 0.677 | 0.878 | 0.947 | Ours: SfmLearner-Pytorch (note 4) |
0.0312 ± 0.0217 | 0.0237 ± 0.0208 | 0.2330 | 2.4643 | 6.830 | 0.314 | 0.6704 | 0.869 | 0.940 | intri_pred (note 5) |
------ | ------- | 0.1417 | 1.1385 | 5.5205 | 0.2186 | 0.8203 | 0.9415 | 0.9762 | struct2depth baseline (note 6) |
0.0110 ± 0.0060 | 0.0110 ± 0.0100 | 0.1087 | 0.8250 | 4.7503 | 0.1866 | 0.8738 | 0.9577 | 0.9825 | struct2depth M+R (note 7) |
0.0090 ± 0.0150 | 0.0080 ± 0.0110 | 0.129 | 0.982 | 5.23 | 0.213 | 0.840 | 0.945 | 0.976 | DFV, given intrinsics (note 8) |
0.0120 ± 0.0160 | 0.0100 ± 0.0100 | 0.128 | 0.959 | 5.23 | 0.212 | 0.845 | 0.947 | 0.976 | DFV, learned intrinsics (note 9) |
1. Best result reported in the README of the GitHub repo accompanying the SfMLearner paper (reference [5]). The author's stated changes: added data augmentation, removed BN, some fine-tuning, KITTI data only, and no explainability regularization. These results are partly slightly better than those in the paper.
2. Best KITTI result reported in the SfMLearner paper (reference [5]).
3. Best result reported on the SfmLearner-Pytorch GitHub. Differences from the original author: the smoothness loss is applied to depth instead of disparity, and the loss is divided by 2.3 instead of 2 (see the sketch after these notes).
4. Our SfmLearner-Pytorch, trained with `-b 4 -m 0.6 -s 0.1 --epoch-size 3000 --sequence-length 3`.
5. Intrinsics are not provided; a simple intrinsics-prediction scheme is used instead. Trained with `-b 4 -m 0.6 -s 0.1 --epoch-size 3000 --sequence-length 3`.
6. From Table 1 of the struct2depth paper.
7. From Table 1 and Table 3 of the struct2depth paper.
8. From Table 1 and Table 6 of the Depth from Videos in the Wild paper.
9. From Table 1 and Table 6 of the Depth from Videos in the Wild paper.

- In addition to KITTI and similar training datasets, both struct2depth and Depth from Videos in the Wild use an object-detection model to generate an "object mask", which bounds the regions where the motion mask is generated.
- struct2depth provides pretrained models that can be tested; all download links for the Depth from Videos in the Wild models have been removed.
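As a rough illustration of the change described in note 3, here is a minimal sketch of a gradient-based smoothness loss applied to depth, assuming the 2.3 factor acts as a per-scale weight decay. The function and variable names are illustrative only; the actual SfmLearner-Pytorch loss may differ in detail (for instance, it may use higher-order gradients):

```python
# Hedged sketch of note 3: smoothness penalty on predicted depth
# (instead of disparity), with the weight divided by 2.3 per scale.
import torch

def smooth_loss(pred_depths, scale_decay=2.3):
    """pred_depths: list of depth maps [B, 1, H, W], one per pyramid scale."""
    total, weight = 0.0, 1.0
    for depth in pred_depths:
        dx = (depth[:, :, :, :-1] - depth[:, :, :, 1:]).abs()  # horizontal gradients
        dy = (depth[:, :, :-1, :] - depth[:, :, 1:, :]).abs()  # vertical gradients
        total = total + weight * (dx.mean() + dy.mean())
        weight = weight / scale_decay  # 2.3 instead of 2 between scales
    return total
```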
- Depth metrics: Abs Rel, Sq Rel, rms, log_rms, and the threshold accuracies A1/A2/A3 (see the sketch below).
- Ego-motion metrics: ATE (Absolute Trajectory Error), reported as mean ± standard deviation over the test set, and RE (Rotation Error). The RE between rotations R1 and R2 is defined as the angle of R1 * R2^-1 in axis/angle form, i.e. RE = arccos((trace(R1 @ R2^-1) - 1) / 2). While ATE is often considered sufficient for trajectory estimation, RE matters here because sequences are only seq_length frames long.
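For reference, here is a sketch of the usual definitions behind the table's depth columns (following the common Eigen et al. evaluation protocol) and of the RE formula quoted above; the evaluation code actually used may differ in masking and clipping details:

```python
# Standard monocular-depth metrics and the quoted rotation-error formula.
# gt / pred are assumed to be 1-D arrays of valid, already-scaled depths.
import numpy as np

def depth_metrics(gt, pred):
    thresh = np.maximum(gt / pred, pred / gt)
    a1, a2, a3 = [(thresh < 1.25 ** i).mean() for i in (1, 2, 3)]
    abs_rel = np.mean(np.abs(gt - pred) / gt)
    sq_rel  = np.mean((gt - pred) ** 2 / gt)
    rms     = np.sqrt(np.mean((gt - pred) ** 2))
    log_rms = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))
    return abs_rel, sq_rel, rms, log_rms, a1, a2, a3

def rotation_error(R1, R2):
    """RE = arccos((trace(R1 @ R2^-1) - 1) / 2), in radians, for 3x3 rotations."""
    cos_angle = (np.trace(R1 @ np.linalg.inv(R2)) - 1.0) / 2.0
    return np.arccos(np.clip(cos_angle, -1.0, 1.0))
```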
- On Windows, Anaconda needs four directories on the PATH: `Anaconda3`, `Anaconda3/Library/bin`, `Anaconda3/Scripts`, and `Anaconda3/condabin`.
- DFV describes a "Randomized Layer Normalization". It is hard to construct a PyTorch implementation with exactly the behaviour described in the paper; I wrote a plausible-looking but questionable version in `evision_model/_Deprecated.py`. In fact, if this method really is as effective as the paper describes, the crux must lie elsewhere. A hedged sketch of one possible reading is given below.
- `evision_model/_PlayGround.py` is used to test functions during development; none of its code is depended on by other files, so it can be freely modified or even deleted.
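For completeness, here is a minimal PyTorch sketch of one possible reading of "Randomized Layer Normalization": layer-norm statistics multiplied by Gaussian noise centred on 1 at training time only. The normalization axes, the noise level, and the per-channel affine parameters are assumptions; this is not a verified reproduction of the DFV paper nor of the code in `evision_model/_Deprecated.py`.

```python
# One possible (unverified) reading of DFV's "Randomized Layer Normalization":
# layer normalization whose statistics are perturbed by multiplicative
# Gaussian noise during training only. Axes, noise level, and affine shape
# are assumptions made for this sketch.
import torch
import torch.nn as nn

class RandomizedLayerNorm(nn.Module):
    def __init__(self, num_channels, noise_std=0.5, eps=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(torch.ones(1, num_channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.noise_std = noise_std
        self.eps = eps

    def forward(self, x):  # x: [B, C, H, W]
        mean = x.mean(dim=(1, 2, 3), keepdim=True)
        var = ((x - mean) ** 2).mean(dim=(1, 2, 3), keepdim=True)
        if self.training:
            # perturb the statistics with noise centred on 1 (training only)
            mean = mean * (1.0 + self.noise_std * torch.randn_like(mean))
            var = var * (1.0 + self.noise_std * torch.randn_like(var))
        x_hat = (x - mean) / torch.sqrt(var.clamp(min=0.0) + self.eps)
        return self.gamma * x_hat + self.beta
```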