Research Paper

Foreground object perception and location algorithm based on semantic feature propagation model in MR

  • 1. Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai University, Shanghai 200444, China
    2. Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai University, Shanghai 200444, China
    3. Shanghai Sansi Institute for System Integration, Shanghai 201100, China
ZHANG Jinyi (born 1965), male, research fellow, doctoral supervisor, Ph.D.; research interests: communication SoC design and indoor wireless positioning technology. E-mail: zhangjinyi@shu.edu.cn

Received date: 2022-05-18

  Online published: 2023-03-28

Funding

National Key Research and Development Program of China under the 13th Five-Year Plan (2017YFB0403500); Programme of Introducing Talents of Discipline to Universities (111 Project) (D20031); Key Discipline Project of the Shanghai Municipal Education Commission (J50104)

Cite this article

FANG Zhe, ZHANG Jinyi, JIANG Yuxi. Foreground object perception and location algorithm based on semantic feature propagation model in MR[J]. Journal of Shanghai University (Natural Science Edition), 0: 41-55. DOI: 10.12066/j.issn.1007-2861.2413

Abstract

Accurate location information for the mobile agent is key to building a stable mixed reality (MR) system. However, foreground objects in an MR scene significantly degrade the accuracy of traditional location algorithms. Current location algorithms based on deep learning improve accuracy by identifying foreground objects, but the deep learning models are too time-consuming, which reduces the real-time performance of these algorithms. To solve this problem, this paper proposes a foreground object-aware location algorithm that fuses a semantic feature propagation model in MR. The algorithm builds the semantic feature propagation model from a semantic segmentation network and the oriented FAST and rotated BRIEF (ORB) feature extraction algorithm to achieve high-speed semantic feature extraction. This model is fused with a geometric feature detection method to form the foreground object perception layer of the algorithm. The perception layer removes the feature points that lie on foreground objects in MR and constructs a background feature point set, which enables high-precision, real-time location. Experimental results show that, in the high-dynamic foreground object scenes of the Technical University of Munich (TUM) public dataset, the proposed algorithm reduces the relative pose error by 60.5% and improves real-time location performance by 39.5% compared with the dynamic semantic visual simultaneous localization and mapping (DS-SLAM) algorithm. The algorithm therefore has high application value in MR.
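The foreground-culling idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that a boolean foreground mask (for example from a semantic segmentation network) and a fundamental matrix `F` between two frames are already available. The function names `epipolar_distances` and `background_mask` and the threshold `max_dist` are hypothetical. A matched feature point is kept as background only if it does not lie on the mask (the semantic cue) and stays close to its epipolar line (the geometric cue).

```python
import numpy as np


def epipolar_distances(F, pts_prev, pts_curr):
    """Pixel distance from each current point to the epipolar line of its match."""
    ones = np.ones((len(pts_prev), 1))
    x1 = np.hstack([pts_prev, ones])  # homogeneous points, previous frame
    x2 = np.hstack([pts_curr, ones])  # homogeneous points, current frame
    lines = x1 @ F.T                  # row i is the line F @ x1[i] in the current frame
    num = np.abs(np.sum(lines * x2, axis=1))
    den = np.hypot(lines[:, 0], lines[:, 1])
    return num / np.maximum(den, 1e-12)


def background_mask(pts_prev, pts_curr, foreground, F, max_dist=1.0):
    """Return True for matches that pass both the semantic and the geometric check.

    pts_prev, pts_curr: (N, 2) arrays of matched (x, y) pixel coordinates.
    foreground: (H, W) boolean array, True where a foreground object was segmented.
    max_dist: assumed epipolar-distance threshold in pixels (hypothetical value).
    """
    h, w = foreground.shape
    cols = np.clip(np.round(pts_curr[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.round(pts_curr[:, 1]).astype(int), 0, h - 1)
    on_foreground = foreground.astype(bool)[rows, cols]             # semantic cue
    moving = epipolar_distances(F, pts_prev, pts_curr) > max_dist   # geometric cue
    return ~(on_foreground | moving)
```

Only the matches flagged True would then be passed to pose estimation. Combining the two cues with a logical OR is a design choice of this sketch: it discards a point if either cue marks it as foreground, trading some background points for robustness.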
