Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (10): 2011-2017    DOI: 10.3785/j.issn.1008-973X.2023.10.010
Computer Technology, Automation Technology
Target recognition based on gated feature fusion and center loss
Jian-wen MO, Jin LI, Xiao-dong CAI*, Jin-wei CHEN
School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, China
Abstract:

A target identification method based on gated feature fusion with center loss was proposed to address the problems of target activity, lighting and camera distance. Gated feature fusion was designed to compensate for the drop in identification accuracy that occurs when a single feature loses information. The gated structure guided the network to evaluate the contributions of the input facial and pedestrian features, and weights were assigned according to these contributions to combine the two features into a more discriminative identity feature. By adding a center loss function, the intra-class distance of the features was reduced under the guidance of the network, making the features more discriminative. The final recognition accuracy of the proposed method on the self-constructed dataset reached up to 76.35%, better than that of single-feature recognition methods and several fusion methods, and the average recognition accuracy was improved by 2.63% with the proposed fusion loss function.

Key words: identification; surveillance scene; feature fusion; gated mechanism; center distance loss
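
As summarized in the abstract, the gate evaluates the contribution of each input feature and assigns fusion weights accordingly. The PyTorch sketch below illustrates one common form of such a gate (a sigmoid-weighted convex combination of the projected face and pedestrian features, as in gated multimodal units); it is a minimal reconstruction under assumed feature dimensions, not the authors' released code, and the names GatedFusion, face_dim and ped_dim are hypothetical.

```python
# Minimal sketch of gated fusion of face and pedestrian features.
# Names and dimensions are assumptions; the gate follows the common
# gated-multimodal-unit form: a sigmoid gate weighting two projections.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, face_dim=512, ped_dim=2048, fused_dim=512):
        super().__init__()
        self.proj_face = nn.Linear(face_dim, fused_dim)  # project face feature
        self.proj_ped = nn.Linear(ped_dim, fused_dim)    # project pedestrian feature
        # The gate estimates each modality's contribution from both inputs.
        self.gate = nn.Linear(face_dim + ped_dim, fused_dim)

    def forward(self, face_feat, ped_feat):
        h_face = torch.tanh(self.proj_face(face_feat))
        h_ped = torch.tanh(self.proj_ped(ped_feat))
        z = torch.sigmoid(self.gate(torch.cat([face_feat, ped_feat], dim=1)))
        # Convex combination: z weights the face branch, (1 - z) the pedestrian branch.
        return z * h_face + (1.0 - z) * h_ped

fusion = GatedFusion()
face = torch.randn(4, 512)            # batch of face features
ped = torch.randn(4, 2048)            # batch of pedestrian features
identity_feat = fusion(face, ped)     # shape: (4, 512)
```

The sigmoid output z acts as a per-dimension contribution estimate: when the face feature is unreliable (for example, occluded or captured from a distance), a learned gate of this kind can shift weight toward the pedestrian feature.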
Received: 2022-09-22    Published: 2023-10-18
CLC: TP 391
Supported by: National Natural Science Foundation of China (62001133, 62177012); Guangxi Innovation-Driven Development Special Project (AA20302001); Guangxi Key Laboratory of Wireless Wideband Communication and Signal Processing Fund (GXKL06200114)
Corresponding author: Xiao-dong CAI    E-mail: Mo_jianwen@126.com; caixiaodong@guet.edu.cn
About the author: Jian-wen MO (b. 1972), male, associate professor, Ph.D., engaged in research on machine vision and image processing. orcid.org/0000-0002-1729-1284. E-mail: Mo_jianwen@126.com

Cite this article:

Jian-wen MO, Jin LI, Xiao-dong CAI, Jin-wei CHEN. Target recognition based on gated feature fusion and center loss. Journal of Zhejiang University (Engineering Science), 2023, 57(10): 2011-2017.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.10.010        https://www.zjujournals.com/eng/CN/Y2023/V57/I10/2011

Fig. 1  Framework of the GFFN model
Fig. 2  Input module of the GFFN
Fig. 3  Diagram of multiple feature fusion methods
Fig. 4  Connection of the gated classification loss and the center distance loss
Fig. 5  Sample images from the G-campus1392 dataset
| Dataset | Training set | Test set | $ {N_{\text{u}}} $ |
| --- | --- | --- | --- |
| Randomdata1 | 15 138 | 16 486 | 3 480 |
| Randomdata2 | 15 846 | 15 778 | 3 480 |
| Randomdata3 | 15 354 | 16 270 | 3 480 |

Table 1  Number of images in the G-campus1392 dataset
| Method | Randomdata1 ACC/% | Randomdata1 mAP/% | Randomdata2 ACC/% | Randomdata2 mAP/% | Randomdata3 ACC/% | Randomdata3 mAP/% |
| --- | --- | --- | --- | --- | --- | --- |
| Face classification | 40.659 | 35.532 | 41.615 | 36.089 | 39.447 | 34.389 |
| Pedestrian classification | 55.265 | 52.275 | 55.451 | 51.527 | 53.737 | 50.626 |
| Feature addition fusion | 59.878 | 55.585 | 60.235 | 55.961 | 57.367 | 54.146 |
| Concatenation fusion | 61.749 | 57.890 | 61.313 | 57.091 | 59.939 | 55.851 |
| Soft attention fusion | 64.582 | 59.835 | 63.519 | 58.936 | 62.698 | 56.261 |
| Gated feature fusion | 73.893 | 69.342 | 73.305 | 68.583 | 71.807 | 67.280 |

Table 2  Comparison of results of multiple recognition methods
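
Tables 2 to 4 report rank-1 accuracy (ACC) and mean average precision (mAP). The page does not spell out the evaluation protocol, so the NumPy sketch below shows only the standard retrieval computation of these two metrics, assuming a query/gallery split and cosine similarity; both assumptions are mine, not the source's.

```python
# Hedged sketch of rank-1 accuracy (ACC) and mAP for identity retrieval.
# Assumes a query/gallery protocol with cosine similarity; the paper's
# exact protocol is not specified on this page.
import numpy as np

def rank1_and_map(query_feats, query_ids, gallery_feats, gallery_ids):
    # L2-normalize so that the dot product equals cosine similarity.
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    sims = q @ g.T                              # (num_query, num_gallery)
    rank1_hits, average_precisions = [], []
    for i in range(len(query_ids)):
        order = np.argsort(-sims[i])            # gallery indices, best match first
        matches = gallery_ids[order] == query_ids[i]
        rank1_hits.append(matches[0])           # top-1 retrieval correct?
        if matches.any():
            # AP: mean of precision at each correct-match position.
            positions = np.where(matches)[0]
            precisions = (np.arange(len(positions)) + 1) / (positions + 1)
            average_precisions.append(precisions.mean())
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))
```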
| Method | Randomdata1 L1 | Randomdata1 L2 | Randomdata2 L1 | Randomdata2 L2 | Randomdata3 L1 | Randomdata3 L2 |
| --- | --- | --- | --- | --- | --- | --- |
| Face classification | 40.659 | 43.989 | 41.615 | 44.219 | 39.447 | 42.612 |
| Pedestrian classification | 55.265 | 61.197 | 55.451 | 60.698 | 53.737 | 59.213 |
| Feature addition fusion | 59.878 | 65.235 | 60.235 | 67.593 | 57.367 | 66.326 |
| Concatenation fusion | 61.749 | 71.430 | 61.313 | 70.681 | 59.939 | 69.490 |
| Soft attention fusion | 64.582 | 72.298 | 63.519 | 71.796 | 62.698 | 71.008 |
| Gated feature fusion | 73.893 | 75.798 | 73.305 | 76.347 | 71.807 | 74.714 |

Table 3  ACC values (%) of the classification network after adding the center distance loss
| Method | Randomdata1 L1 | Randomdata1 L2 | Randomdata2 L1 | Randomdata2 L2 | Randomdata3 L1 | Randomdata3 L2 |
| --- | --- | --- | --- | --- | --- | --- |
| Face classification | 35.532 | 37.925 | 36.089 | 37.993 | 34.389 | 36.665 |
| Pedestrian classification | 52.275 | 56.777 | 51.527 | 56.182 | 50.626 | 54.934 |
| Feature addition fusion | 55.585 | 61.623 | 55.961 | 61.962 | 54.146 | 59.247 |
| Concatenation fusion | 57.890 | 65.642 | 57.091 | 64.684 | 55.851 | 62.271 |
| Soft attention fusion | 59.835 | 67.234 | 58.936 | 66.039 | 56.261 | 64.915 |
| Gated feature fusion | 69.342 | 71.461 | 68.583 | 71.257 | 67.280 | 69.715 |

Table 4  mAP values (%) of the classification network after adding the center distance loss
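
The L2 columns in Tables 3 and 4 correspond to training with the center distance loss added to the gated classification loss. The sketch below shows a minimal version of such a combined objective, following the standard center-loss formulation of Wen et al.; CENTER_WEIGHT and all other names are hypothetical, since the page does not give the paper's exact weighting or implementation.

```python
# Sketch of classification loss plus center distance loss.
# Standard center-loss formulation; CENTER_WEIGHT is a hypothetical
# hyperparameter balancing the two terms.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim):
        super().__init__()
        # One learnable center per identity class.
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feats, labels):
        # Mean squared distance between each feature and its class center.
        return ((feats - self.centers[labels]) ** 2).sum(dim=1).mean()

CENTER_WEIGHT = 0.01  # hypothetical balance factor

def fusion_loss(logits, feats, labels, center_loss):
    # logits: (B, num_classes) classifier output; feats: (B, feat_dim) fused features.
    ce = F.cross_entropy(logits, labels)   # gated classification loss
    ct = center_loss(feats, labels)        # intra-class compactness term
    return ce + CENTER_WEIGHT * ct
```

In practice the class centers are usually updated with their own learning rate or a separate optimizer; that detail is omitted from this sketch.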
Fig. 6  Error samples of the proposed method