Image matting based on deep learning

Journal of Shanghai University (Natural Science Edition), 2022, Vol. 28, Issue 2: 261-269. doi: 10.12066/j.issn.1007-2861.2287

WANG Rongrong¹,², XU Shugong², HUANG Jianbo¹,³

1. Shanghai Film Academy, Shanghai University, Shanghai 200072, China
2. Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
3. Shanghai Engineering Research Center of Motion Picture Special Effects, Shanghai University, Shanghai 200072, China

Received: 2020-03-13 Online: 2022-04-30 Published: 2022-04-28
Corresponding author: HUANG Jianbo, E-mail: huangjianbo110@shu.edu.cn
About the author: HUANG Jianbo (1980-), male, associate professor, doctoral supervisor, Ph.D. Research interests include art theory and image processing. E-mail: huangjianbo110@shu.edu.cn
Funding: Research project of the Shanghai University Peak Discipline in Film Studies and the Shanghai Engineering Research Center of Motion Picture Special Effects (16dz2251300)

Abstract

Image matting underpins image editing and is widely used in film and television post-production and in everyday applications. Deep-learning-based matting networks estimate the $\alpha$ value of every pixel from an input image and its trimap. This study builds on an existing down- and up-sampling matting network and makes two changes. First, because the images in matting datasets differ greatly from one another, which tends to slow network convergence, a batch normalisation (BN) layer is added after each convolution layer. The BN layer normalises its input data, which speeds up model convergence and makes the direction of parameter updates more consistent with the overall characteristics of the dataset. Second, because the matting task must pay particular attention to object edges, the standard convolution layers are replaced with deformable convolution layers. A deformable convolution layer adaptively learns the shape of its convolution kernel from the input data, which effectively enlarges the receptive field and improves prediction in fine-detail regions.

Key words: deep learning, image matting, semantic segmentation, prediction
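
The quantities named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the network described in the paper: it only shows what a BN layer computes for one channel, how a trimap constrains a predicted $\alpha$, and how $\alpha$ is used in the standard compositing equation $I = \alpha F + (1 - \alpha) B$. The function names, the flat-list pixel layout and the 0/255 trimap encoding are assumptions made for illustration, and deformable convolution is not covered.

```python
from statistics import fmean

# Assumed trimap encoding (a common convention, not stated in the abstract):
# 0 = known background, 255 = known foreground, any other value = unknown.
BACKGROUND, FOREGROUND = 0, 255


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalise one channel over a mini-batch, then scale and shift."""
    values = list(values)
    mean = fmean(values)
    var = fmean([(v - mean) ** 2 for v in values])  # population variance
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in values]


def merge_with_trimap(pred_alpha, trimap):
    """Keep the predicted alpha only where the trimap marks a pixel unknown."""
    merged = []
    for a, t in zip(pred_alpha, trimap):
        if t == BACKGROUND:
            merged.append(0.0)
        elif t == FOREGROUND:
            merged.append(1.0)
        else:
            merged.append(min(1.0, max(0.0, a)))  # clamp to [0, 1]
    return merged


def composite(alpha, fg, bg):
    """Blend per pixel: I = alpha * F + (1 - alpha) * B."""
    return [a * f + (1.0 - a) * b for a, f, b in zip(alpha, fg, bg)]


if __name__ == "__main__":
    alpha = merge_with_trimap([0.3, 0.6, 0.9], [0, 128, 255])
    print(alpha)
    print(composite(alpha, [200.0] * 3, [100.0] * 3))
    print(batch_norm([1.0, 2.0, 3.0]))
```

In this sketch only the pixel in the unknown region keeps the predicted value. Overwriting the known regions with 0 and 1 is a common post-processing choice and is assumed here rather than taken from the paper.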

CLC number: TP391.4

Cite this article: WANG Rongrong, XU Shugong, HUANG Jianbo. Image matting based on deep learning[J]. Journal of Shanghai University (Natural Science Edition), 2022, 28(2): 261-269.

Figures and tables: 7 (Figures 1-5, Tables 1-2)

