LI Yaoyu, WANG Hongmin, ZHANG Yifan, LU Hanqing. Structured Deep Learning Based Depth Estimation from a Monocular Image[J]. ROBOT, 2017, 39(6): 812-819. DOI: 10.13973/j.cnki.robot.2017.0812

Structured Deep Learning Based Depth Estimation from a Monocular Image

    Abstract: To extract rich 3D structural features from a monocular image and use them to infer scene depth, a structured deep learning model is proposed for monocular depth estimation. The model unifies a novel multi-scale convolutional neural network (CNN) and a continuous conditional random field (CCRF) in a single deep learning framework. The CNN learns relevant feature representations from the image, and the CCRF refines the CNN output according to the position and color information of the image pixels. Learning the parameters of the CNN and the CCRF jointly improves the generalization ability of the model. Experiments on the NYU Depth dataset demonstrate the effectiveness and superiority of the model: the average relative error of its predictions is 0.187, the root mean squared error is 0.074, and the average log10 error is 0.671.
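    The three error measures named in the abstract are not defined on this page. The sketch below uses the definitions commonly found in the monocular depth estimation literature, which is an assumption and may differ in detail from the paper's own formulas. The function name `depth_errors` and the rule of skipping pixels with non-positive depth are hypothetical choices made for illustration.

```python
import math


def depth_errors(pred, gt):
    """Return (rel, rms, log10) for two flat sequences of depths.

    Assumed (not taken from the paper) definitions:
      rel   = mean(|pred - gt| / gt)
      rms   = sqrt(mean((pred - gt)^2))
      log10 = mean(|log10(pred) - log10(gt)|)
    Pixels whose ground-truth or predicted depth is not positive are
    skipped, since the relative and log errors are undefined there.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)
    rel = sum(abs(p - g) / g for p, g in pairs) / n
    rms = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    log10 = sum(abs(math.log10(p) - math.log10(g)) for p, g in pairs) / n
    return rel, rms, log10


if __name__ == "__main__":
    # Toy values only; the third pixel has no ground truth and is ignored.
    rel, rms, log10 = depth_errors([2.0, 2.0, 5.0], [1.0, 4.0, 0.0])
    print(f"rel={rel:.3f} rms={rms:.3f} log10={log10:.3f}")
```

    Under these definitions the relative and log10 errors are unitless, while the root mean squared error carries the unit of the depth values.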

     
