Light field image coding based on multi-feature fusion and geometry-aware networks

doi:10.12066/j.issn.1007-2861.2546

Journal of Shanghai University (Natural Science Edition), 2024, Vol. 30, Issue (4): 669-681. doi: 10.12066/j.issn.1007-2861.2546

LIU Faguo1, BAI Xiaoqi2, ZHANG Qian1, WANG Bin1, SI Wen3

1. College of Information and Electromechanical Engineering, Shanghai Normal University, Shanghai 201418, China; 2. Yangpu District Education Institute, Shanghai 200092, China; 3. Shanghai Business School, Shanghai 201400, China

Received: 2023-09-24 Online: 2024-08-30 Published: 2024-09-13
Corresponding author: ZHANG Qian (born 1983), female, associate professor, Ph.D.; research interests: video and image information processing. E-mail: qianzhang@shnu.edu.cn
Funding:
National Natural Science Foundation of China (62301320)

Abstract

Abstract: Light field image coding methods based on single-disparity view synthesis cannot recover texture details in occluded regions. To overcome this limitation, this paper proposes a light field image coding method based on multi-feature fusion and a geometry-aware network, with the aim of further improving the compression performance of light field images in occluded scenes. First, the dense light field is sparsely sampled, and the resulting sparse light field is compressed with versatile video coding (VVC). Then, at the decoder, two key branch modules, a disparity estimation module and a spatial-angular joint convolution module, capture the global geometric information of the light field image so that features in densely textured and occluded regions are recovered more fully. Finally, to exploit the structural information in the features fused from the two branches, a stack structure of bidirectional views is constructed, and a geometry-aware refinement network reconstructs the high-quality dense light field. Experimental results show that the proposed method has significant advantages over existing, internationally popular light field image coding methods.
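The encoder-side step described in the abstract (sparse sampling of the dense light field before VVC compression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' configuration: the regular every-`step`-th-view pattern, the 7 x 7 angular grid, the serpentine frame order, and the names `split_views` and `serpentine_order`. The VVC encoding itself and the decoder-side networks are not shown.

```python
from typing import List, Tuple

View = Tuple[int, int]  # (row, column) index of a sub-aperture view in the angular grid


def split_views(rows: int, cols: int, step: int) -> Tuple[List[View], List[View]]:
    """Split an angular grid into retained views and views left for synthesis.

    Assumption: a regular pattern that keeps every `step`-th view in both
    angular directions; the paper's actual sampling pattern may differ.
    """
    if rows <= 0 or cols <= 0 or step <= 0:
        raise ValueError("rows, cols and step must be positive")
    kept: List[View] = []
    dropped: List[View] = []
    for r in range(rows):
        for c in range(cols):
            if r % step == 0 and c % step == 0:
                kept.append((r, c))      # would be coded as a video frame
            else:
                dropped.append((r, c))   # would be reconstructed at the decoder
    return kept, dropped


def serpentine_order(views: List[View]) -> List[View]:
    """Order retained views row by row, reversing every other retained row,
    so that consecutive frames of the pseudo-video come from nearby views."""
    row_ids = sorted({r for r, _ in views})
    ordered: List[View] = []
    for i, r in enumerate(row_ids):
        row = sorted(v for v in views if v[0] == r)
        ordered.extend(row if i % 2 == 0 else row[::-1])
    return ordered


# Hypothetical 7 x 7 light field, keeping every third view in each direction.
kept, dropped = split_views(rows=7, cols=7, step=3)
frames = serpentine_order(kept)
```

In this hypothetical setting, 9 of the 49 views would be passed to the video encoder as a short pseudo-video sequence, and the remaining 40 would be left for the decoder-side reconstruction network to synthesize.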

Key words: light field image coding, occlusion region, multi-feature fusion, geometry-aware

CLC number: TP 751.1

LIU Faguo, BAI Xiaoqi, ZHANG Qian, WANG Bin, SI Wen. Light field image coding based on multi-feature fusion and geometry-aware networks[J]. Journal of Shanghai University (Natural Science Edition), 2024, 30(4): 669-681.
