Classification and recognition of underwater small targets based on improved YOLOv3 algorithm

Shao Huixiang, Zeng Dan

doi:10.12066/j.issn.1007-2861.2279

Journal of Shanghai University (Natural Science Edition), 2021, Vol. 27, Issue 3: 481-491


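Research Article

School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China

Zeng Dan (born 1982), female, professor, Ph.D.; research interests: computer vision and pattern recognition. E-mail: dzeng@shu.edu.cn

Received date: 2020-09-28

Online published: 2021-05-10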



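Cite this article

Shao Huixiang, Zeng Dan. Classification and recognition of underwater small targets based on improved YOLOv3 algorithm[J]. Journal of Shanghai University (Natural Science Edition), 2021, 27(3): 481-491. DOI: 10.12066/j.issn.1007-2861.2279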

Abstract

To address the low detection and recognition rate and the high false alarm rate for small targets in sonar images, this study proposes an improved YOLOv3 algorithm. The improved network builds on the original YOLOv3: the layer connections are changed so that shallower features are fused with deep features, forming a new, larger-scale detection layer that improves the network's ability to detect small underwater targets. In addition, a linearly scaled $K$-means clustering algorithm is used to optimise the number and aspect ratios of the prior (anchor) boxes, which improves the match between the prior boxes and the ground truth boxes. Compared with the original YOLOv3 algorithm, the mean average precision (mAP) is improved by 7%. Experimental results show that the improved algorithm classifies and recognises small targets effectively, with higher accuracy and a lower false alarm rate, while retaining the real-time performance of the original YOLOv3 algorithm.

Key words: YOLOv3; small target detection; deep learning
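The prior-box selection step mentioned in the abstract can be illustrated with a minimal Python sketch. It clusters ground-truth box sizes with $K$-means, assigning each box to the centroid with the highest IoU (equivalent to the 1 - IoU distance commonly used for YOLO anchors), and then stretches the resulting anchors linearly. The abstract does not state the exact scaling rule, so `linear_scale`, its `alpha` and `beta` parameters, and all function names below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def iou_wh(boxes, centroids):
    """IoU between (N, 2) boxes and (K, 2) centroids given as (width, height).

    All boxes are assumed to share the same top-left corner, so only
    their sizes matter.
    """
    inter_w = np.minimum(boxes[:, None, 0], centroids[None, :, 0])
    inter_h = np.minimum(boxes[:, None, 1], centroids[None, :, 1])
    inter = inter_w * inter_h
    box_area = boxes[:, 0:1] * boxes[:, 1:2]                   # shape (N, 1)
    cen_area = (centroids[:, 0] * centroids[:, 1])[None, :]    # shape (1, K)
    return inter / (box_area + cen_area - inter)


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster box sizes into k anchors; returns anchors sorted by area."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(iters):
        # Highest IoU is the same as smallest (1 - IoU) distance.
        assign = np.argmax(iou_wh(boxes, centroids), axis=1)
        new = centroids.copy()
        for j in range(k):
            members = boxes[assign == j]
            if len(members):                 # keep old centroid if cluster is empty
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids[np.argsort(centroids[:, 0] * centroids[:, 1])]


def linear_scale(anchors, alpha=0.5, beta=1.5):
    """Hypothetical linear scaling: map widths from [w_min, w_max] onto
    [alpha * w_min, beta * w_max], keeping each anchor's aspect ratio."""
    anchors = np.asarray(anchors, dtype=float)
    w, h = anchors[:, 0], anchors[:, 1]
    w_min, w_max = w.min(), w.max()
    if w_max == w_min:                       # nothing to stretch
        return anchors.copy()
    lo, hi = alpha * w_min, beta * w_max
    new_w = lo + (w - w_min) / (w_max - w_min) * (hi - lo)
    new_h = new_w * h / w                    # preserve height/width ratio
    return np.stack([new_w, new_h], axis=1)


def mean_best_iou(boxes, anchors):
    """Average IoU between each box and its best-matching anchor."""
    return float(iou_wh(np.asarray(boxes, dtype=float), anchors).max(axis=1).mean())


if __name__ == "__main__":
    # Synthetic (width, height) pairs standing in for labelled sonar targets.
    rng = np.random.default_rng(1)
    demo_boxes = np.abs(rng.normal(loc=[[20, 30]], scale=5, size=(200, 2))) + 1.0
    demo_anchors = kmeans_anchors(demo_boxes, k=4)
    print("anchors:", demo_anchors)
    print("scaled anchors:", linear_scale(demo_anchors))
    print("mean best IoU:", mean_best_iou(demo_boxes, demo_anchors))
```

Comparing `mean_best_iou` across several values of `k` is one way to choose the number of prior boxes; the values of `alpha` and `beta` would need tuning on real sonar annotations.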

