Self-supervised monocular depth estimation via asymmetric convolution block

doi:10.1049/csy2.12051

IET Cyber-Systems and Robotics, 2022, Vol. 4, Issue 2: 131-138



Abstract: Self-supervised learning is a promising way to train monocular depth estimation without depth ground truth. It builds its own supervision signal with the help of auxiliary tools such as view synthesis and pose networks, although this may increase the number of training parameters and the training time. This paper proposes a monocular depth prediction framework that jointly learns depth and the pose transformation between images in an end-to-end manner. The depth network replaces every square-kernel layer with an asymmetric convolution block to strengthen feature extraction during training. At inference time, the asymmetric kernels are fused and converted back to the original network structure, so the more accurate depth prediction incurs no extra computation. The network is trained and tested on the KITTI monocular dataset. The evaluation results show that the depth model outperforms some state-of-the-art (SOTA) approaches and can reduce the inference time of depth prediction. Additionally, the proposed model adapts well to the Make3D dataset.
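The kernel fusion mentioned in the abstract relies on the additivity of convolution: outputs of parallel branches applied to the same input can be reproduced by a single kernel that is the sum of the branch kernels. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes a block with 3x3, 1x3, and 3x1 branches on a single channel, omits batch normalisation and bias, and uses hypothetical helper names (`conv2d_same`, `fuse_acb`, `add_maps`) and made-up kernel values.

```python
# Sketch: fusing the three branches of an asymmetric convolution block
# (3x3, 1x3, 3x1) into one 3x3 kernel. Assumption: single channel,
# zero padding, no batch normalisation or bias.

def conv2d_same(image, kernel):
    """Cross-correlate a 2D image with an odd-sized kernel, zero padded."""
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def fuse_acb(square, horizontal, vertical):
    """Add the 1x3 kernel to the middle row and the 3x1 kernel to the middle column."""
    fused = [row[:] for row in square]
    for b in range(3):
        fused[1][b] += horizontal[0][b]
    for a in range(3):
        fused[a][1] += vertical[a][0]
    return fused


def add_maps(*maps):
    """Element-wise sum of equally sized 2D feature maps."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*maps)]


# Illustrative (made-up) input and kernels.
image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 3.0, 1.0, 2.0],
         [2.0, 1.0, 0.0, 1.0],
         [1.0, 0.0, 2.0, 3.0]]
square = [[0.5, -1.0, 0.25], [1.0, 2.0, -0.5], [0.0, 0.75, 1.0]]
horizontal = [[1.0, -2.0, 0.5]]       # 1x3 branch
vertical = [[0.25], [1.5], [-1.0]]    # 3x1 branch

# Training-time form: three parallel branches, outputs summed.
branch_sum = add_maps(conv2d_same(image, square),
                      conv2d_same(image, horizontal),
                      conv2d_same(image, vertical))

# Inference-time form: one fused 3x3 kernel, a single convolution.
fused_kernel = fuse_acb(square, horizontal, vertical)
fused_out = conv2d_same(image, fused_kernel)
```

Under these assumptions `branch_sum` and `fused_out` agree element by element, which is why the fused network costs no more at inference than a plain 3x3 layer. In a real network each branch is typically followed by its own batch normalisation, whose scale and shift would have to be folded into the kernels and a bias before summing.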

Published: 2022-07-22


Cite this article:

Lingling Hu, Hao Zhang, Zhuping Wang, Chao Huang, Changzhu Zhang. Self-supervised monocular depth estimation via asymmetric convolution block. IET Cyber-Systems and Robotics, 2022, 4(2): 131-138.

Link to this article:

https://www.zjujournals.com/iet-csr/CN/10.1049/csy2.12051 or https://www.zjujournals.com/iet-csr/CN/Y2022/V4/I2/131
