A Coarse-to-Fine Estimation Method for Spatial Layout of Indoor Scenes
LIU Tianliang^1, GU Yanqiu^1, CAO Dandan^1, DAI Xiubin^1, LUO Jiebo^2
1. Jiangsu Provincial Key Lab of Image Processing and Image Communication, Nanjing University of Posts and Telecommunications, Nanjing 210003, China;
2. Department of Computer Science, University of Rochester, Rochester 14627, USA
Abstract: A coarse-to-fine estimation method for spatial layout is presented to effectively label the layout of indoor scenes. First, an adaptive threshold detection method tolerant of local discontinuities is exploited to extract long straight lines from the given scene, which are split into vertical and horizontal lines according to their directions. The vertical and horizontal vanishing points are estimated by a voting mechanism under the orthogonality principle, and pairs of rays emitted from the two vanishing points at equal angular intervals are used to generate layout candidates for the scene. Next, the informative edge map and geometric context of the scene are estimated with a VGG-16 fully convolutional network; a softmax classifier applied to the fc7 features determines the layout category, and global features combining the informative edge map with the layout category are used to coarsely prune the layout candidates. Then, the normal vectors and depth map of the scene are estimated with a VGG-based spatial multi-scale convolutional network to extract normal-vector and geometric-depth features. The 3D box layout model is parameterized by the angles between the rays from the vanishing points, while the line membership, geometric context, normal-vector and depth features are accumulated via geometric integral images to extract regional features of the layout candidates, and the structural model parameters are learned with the cutting-plane method. Finally, the layout candidate with the highest structural prediction score is selected as the final spatial layout. Experimental results on the Hedau and LSUN datasets demonstrate that the presented method obtains a more accurate number of divided polygons and more precise boundary positions for the spatial layout.
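To illustrate the feature-accumulation step, per-pixel cues (line membership, geometric context, normal-vector and depth maps) can be summed over a candidate layout region in constant time using an integral image (summed-area table). The following NumPy sketch shows the axis-aligned rectangular case only; it is a hypothetical illustration, not the authors' implementation, whose geometric integral image extends the idea to the slanted polygonal faces of the 3D box model.

```python
import numpy as np

def integral_image(feature_map):
    """Build a summed-area table S with a zero border:
    S[i, j] = sum of feature_map[:i, :j]."""
    h, w = feature_map.shape
    S = np.zeros((h + 1, w + 1), dtype=np.float64)
    S[1:, 1:] = feature_map.cumsum(axis=0).cumsum(axis=1)
    return S

def region_sum(S, top, left, bottom, right):
    """Sum of the feature over rows [top, bottom) and
    cols [left, right), in O(1) via four table lookups."""
    return S[bottom, right] - S[top, right] - S[bottom, left] + S[top, left]

# Example: accumulate a per-pixel cue over one candidate region.
cue = np.arange(16, dtype=np.float64).reshape(4, 4)
S = integral_image(cue)
total = region_sum(S, 0, 0, 4, 4)      # whole image: 120.0
inner = region_sum(S, 1, 1, 3, 3)      # 2x2 interior block: 30.0
```

Each candidate face then contributes a fixed-length regional feature vector (one entry per accumulated cue), so scoring many layout hypotheses stays cheap regardless of region size.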