Journal of Zhejiang University (Science Edition)  2021, Vol. 48 Issue (3): 282-288    DOI: 10.3785/j.issn.1008-9497.2021.03.003
Image Processing Algorithms     
Depth of field videos classification based on image depth prediction
QIAN Lihui1, WANG Bin1, ZHENG Yunfei2, ZHANG Jiajie2, LI Mading2, YU Bing2
1.School of Software, Tsinghua University, Beijing 100084, China
2.Beijing Kuaishou Technology Co., Ltd., Beijing 100085, China

Abstract  Depth-of-field videos are visually appealing and very popular. However, classifying such videos remains difficult. Many studies address the principle of depth of field and segmentation algorithms, but they are hard to apply to real video classification scenarios. This paper proposes a deep classification network that classifies a video directly, based on the observation that semantic objects at different depths of field usually differ in sharpness. Following the depth-of-field imaging principle, we propose using the depth map as guidance to reduce the false detection rate and thereby improve the performance of the network. Furthermore, we design an iterative method to collect depth-of-field videos quickly and at low cost. Experimental results show that our method outperforms previous methods, reaching an accuracy of 85.7% on the Kuaishou depth-of-field video dataset.

Key words: video classification, deep learning, depth of field, depth prediction
Received: 26 September 2020      Published: 20 May 2021
CLC:  TP 391.4  
Cite this article:

QIAN Lihui, WANG Bin, ZHENG Yunfei, ZHANG Jiajie, LI Mading, YU Bing. Depth of field videos classification based on image depth prediction. Journal of Zhejiang University (Science Edition), 2021, 48(3): 282-288.

URL:

https://www.zjujournals.com/sci/EN/Y2021/V48/I3/282


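Depth-of-field video classification algorithm based on image depth prediction

Depth-of-field videos are widely popular for their sharpness and visual appeal. However, picking such videos out of a massive video collection is very difficult. Many existing studies build on the depth-of-field imaging principle to develop pixel-level depth-of-field segmentation algorithms, but these are hard to apply directly to real video classification scenarios. Targeting depth-of-field videos, this paper designs a deep network that predicts the video type. According to the depth-of-field imaging principle, the depths of the semantic objects relative to the camera are logically related to one another. We therefore propose using image depth as guidance: a depth prediction module predicts the depth information of the image, which is then merged and fed into the classification network for training and detection, reducing the false detection rate for depth-of-field videos and improving the performance of the network model. In addition, because labeled data in this field are scarce in practice and distribution differences between datasets degrade performance, we design an iterative method for collecting a depth-of-field video dataset that gathers the required video data quickly at low labor cost, which gives it practical value. The proposed algorithm achieves a recognition accuracy of 85.7% on Kuaishou's online depth-of-field video dataset.

Key words: video classification, depth prediction, deep learning, depth of field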
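
The abstract says only that the predicted depth information is merged and fed into the classification network; it does not say how. The sketch below illustrates one plausible reading, not the authors' implementation: the depth map is appended to each RGB frame as a fourth channel, and per-frame scores are averaged into a video-level decision. The names `fuse_rgb_and_depth`, `classify_video`, `predict_depth`, and `classify_frame` are hypothetical, and both the channel-wise fusion and the score averaging are assumptions.

```python
import numpy as np


def fuse_rgb_and_depth(frame_rgb, depth_map):
    """Stack an RGB frame (H, W, 3) and a depth map (H, W) into (H, W, 4).

    Assumption: fusion is a channel-wise concatenation; the abstract
    does not specify the actual merge operation.
    """
    if frame_rgb.shape[:2] != depth_map.shape:
        raise ValueError("frame and depth map must share height and width")
    rgb = frame_rgb.astype(np.float32) / 255.0  # scale 8-bit colors to [0, 1]
    depth = depth_map.astype(np.float32)
    span = depth.max() - depth.min()
    # Min-max normalize depth to [0, 1]; a constant map becomes all zeros.
    depth = (depth - depth.min()) / span if span > 0 else np.zeros_like(depth)
    return np.concatenate([rgb, depth[..., None]], axis=-1)


def classify_video(frames, predict_depth, classify_frame, threshold=0.5):
    """Label a video as depth-of-field from its sampled frames.

    predict_depth and classify_frame are hypothetical callables standing
    in for the depth prediction module and the classification network.
    Assumption: the video score is the mean of the per-frame scores.
    """
    scores = [
        classify_frame(fuse_rgb_and_depth(frame, predict_depth(frame)))
        for frame in frames
    ]
    mean_score = float(np.mean(scores))
    return mean_score >= threshold, mean_score
```

In practice `predict_depth` would wrap a monocular depth estimation model and `classify_frame` a trained network that accepts four-channel input; both are left as plain callables here so the data flow stays visible.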