Journal of Zhejiang University (Engineering Science)  2020, Vol. 54, Issue (11): 2128-2137    DOI: 10.3785/j.issn.1008-973X.2020.11.008
Computer and Control Engineering
Deep micro-expression spotting network training based on concept of transition frame
Xiao-feng FU(),Li NIU,Zhuo-qun HU,Jian-jun LI,Qing WU
School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China
Full text: PDF (1231 KB)   HTML
Abstract:

To spot facial micro-expressions more accurately from videos, and in view of the small sample size of micro-expression databases, a deep convolutional neural network was applied to micro-expression spotting through transfer learning. A pre-trained deep convolutional neural network model was selected, its convolutional layers and pre-trained parameters were retained, and a fully connected layer and a classifier were added after these layers to construct a deep binary-classification micro-expression spotting network (MesNet). To remove the noisy labels in micro-expression databases that disturb network training, the concept of the transition frame and an adaptive transition frame recognition algorithm were proposed. Experimental results show that the AUC values of MesNet on CASME II, SMIC-E-HS and CAS(ME)2 reach 0.955 6, 0.933 8 and 0.785 3, respectively. Among the three databases, MesNet achieves state-of-the-art results on both CASME II, a short-video database, and CAS(ME)2, a long-video database, indicating that MesNet combines high accuracy with a wide application range. Comparison experiments on transition frames show that removing transition frames from the original videos when constructing the training set effectively improves the micro-expression spotting performance of MesNet.

Key words: micro-expression spotting; transfer learning; deep convolutional neural network; binary classification; transition frame
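As a rough illustration of the transfer-learning construction described in the abstract (not the authors' released code), the following minimal TensorFlow/Keras sketch keeps an ImageNet-pretrained Inception-ResNet-V2 backbone frozen and adds a new fully connected layer plus a binary classifier on top; the input size, layer width and training settings are assumptions for illustration only.

```python
import tensorflow as tf

# Minimal sketch of a MesNet-style binary micro-expression spotter.
# Assumptions: Inception-ResNet-V2 backbone, 299x299 RGB frames,
# one 256-unit fully connected layer (width chosen arbitrarily).

# Pre-trained convolutional layers with ImageNet parameters retained.
backbone = tf.keras.applications.InceptionResNetV2(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
backbone.trainable = False  # keep the pre-trained parameters fixed

# New fully connected layer and binary classifier added after the conv layers.
model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # micro-expression vs. non-expression frame
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC(name="auc")])

# model.fit(train_frames, train_labels, validation_data=(val_frames, val_labels))
```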
Received: 2019-10-19    Published: 2020-12-15
CLC:  TP 301.6  
Funding: supported by the National Natural Science Foundation of China (61672199), the Zhejiang Provincial Science and Technology Program (2018C01030), and the Zhejiang Provincial Natural Science Foundation (Y1110232)
About the author: FU Xiao-feng (born 1981), female, associate professor, Ph.D.; research interests include computer vision, pattern recognition and artificial intelligence. orcid.org/0000-0003-4903-5266. E-mail: fuxiaofeng@hdu.edu.cn
Cite this article:

Xiao-feng FU, Li NIU, Zhuo-qun HU, Jian-jun LI, Qing WU. Deep micro-expression spotting network training based on concept of transition frame. Journal of Zhejiang University (Engineering Science), 2020, 54(11): 2128-2137.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2020.11.008        http://www.zjujournals.com/eng/CN/Y2020/V54/I11/2128

Fig. 1  Flowchart of MesNet training
Fig. 2  Example video clip from CASME II
Fig. 3  Adaptive transition frame recognition algorithm
Fig. 4  Probability distribution of transition frame samples to be removed
Fig. 5  Facial landmark detection
Fig. 6  Coordinate setting for face alignment
Fig. 7  Cropping of the micro-expression region
Fig. 8  Example of image preprocessing on the CASME II database
Fig. 9  Confusion matrix for binary classification
Model | Input image size | Training iterations | Model size/MB | L | AUC
MesNet-VGG-19 | 224×224×3 | 767 | 1 096 | 0.029 0 | 0.837 7
MesNet-Inception V3 | 299×299×3 | 1 239 | 115 | 0.028 9 | 0.920 0
MesNet-Inception V4 | 299×299×3 | 1 558 | 184 | 0.000 1 | 0.887 6
MesNet-Res V2-50 | 224×224×3 | 124 | 109 | 0.008 5 | 0.787 1
MesNet-Res V2-101 | 224×224×3 | 319 | 182 | 0.001 5 | 0.844 8
MesNet-Res V2-152 | 224×224×3 | 231 | 242 | 0.006 9 | 0.844 8
MesNet-Inception-Res-V2 | 299×299×3 | 5 362 | 235 | 0.011 5 | 0.952 6
Table 1  Training details and AUC of different MesNet variants on the CASME II database
Fig. 10  ROC curves of five MesNet variants on CASME II
Method | AUC (CASME II) | AUC (SMIC-E-HS) | AUC (CAS(ME)2)
0% | 0.938 6 | 0.906 2 | 0.752 8
10% | 0.952 6 | 0.918 0 | 0.763 9
20% | 0.944 9 | 0.901 7 | 0.742 1
30% | 0.916 2 | 0.888 3 | 0.736 1
Adaptive | 0.955 6 | 0.933 8 | 0.785 3
Table 2  AUC of MesNet with different transition frame removal methods
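The rows of Table 2 correspond to removing different proportions of candidate transition frames when constructing the training set; the paper's adaptive algorithm (Fig. 3) is not reproduced on this page. Purely as a hypothetical illustration of what fixed-percentage removal around a labeled expression interval could look like, the sketch below drops a fraction of the frames closest to the onset and offset boundaries, where frame labels are most ambiguous. The onset/offset indices follow the interval annotations that CASME II-style databases provide; the function and its parameters are illustrative assumptions.

```python
def remove_transition_frames(num_frames, onset, offset, ratio):
    """Hypothetical illustration: drop a fraction of the frames adjacent to
    the labeled onset/offset boundaries of a micro-expression interval.

    num_frames: total number of frames in the clip
    onset, offset: first and last frame index of the labeled micro-expression
    ratio: fraction of the interval length removed on each side (e.g. 0.1)
    """
    width = max(1, round(ratio * (offset - onset + 1))) if ratio > 0 else 0
    transition = set()
    for k in range(width):
        transition.update({onset + k, onset - 1 - k,     # around the onset boundary
                           offset - k, offset + 1 + k})  # around the offset boundary
    kept = [i for i in range(num_frames) if i not in transition]
    removed = sorted(i for i in transition if 0 <= i < num_frames)
    return kept, removed

# Example: a 51-frame clip with a micro-expression labeled on frames 20-35.
kept, removed = remove_transition_frames(51, 20, 35, ratio=0.1)
print(len(kept), "frames kept; removed:", removed)
```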
Fig. 11  ROC curves of different transition frame removal methods on the SMIC-E-HS and CAS(ME)2 databases
Preprocessing stage | AUC (CASME II) | AUC (SMIC-E-HS) | AUC (CAS(ME)2)
Fig. 8(a) | 0.613 4 | 0.574 5 | 0.532 8
Fig. 8(b) | 0.770 5 | 0.721 1 | 0.603 1
Fig. 8(c) | 0.938 6 | 0.906 2 | 0.752 8
Table 3  AUC comparison of different preprocessing stages
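Figs. 5-8 and Table 3 point to a preprocessing pipeline of facial landmark detection, face alignment and cropping of the micro-expression region before frames are fed to the network. The exact procedure is not given on this page; as a rough stand-in for the cropping stage only, the sketch below detects the face with OpenCV's bundled Haar cascade, crops it and resizes it to the network input size (landmark-based alignment is omitted). The file name and output size are assumptions.

```python
import cv2

# Rough stand-in for the face cropping stage (not the authors' pipeline):
# detect the face with OpenCV's bundled Haar cascade, crop the largest
# detection, and resize it to the network input size.
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(frame_path, out_size=(299, 299)):
    img = cv2.imread(frame_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None                                      # no face found in this frame
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])   # keep the largest face box
    return cv2.resize(img[y:y + h, x:x + w], out_size)

# face = crop_face("casme2_sub14_ep09_frame001.jpg")     # hypothetical frame file
```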
Method | Precision | Recall | F-Measure | Accuracy | AUC
3D HOG-XT [29] | 0.534 1 | 0.623 5 | 0.575 4 | 0.735 5 | 0.726 1
Frame differences [14] | – | – | – | 0.817 5 | –
HOOF [30] | – | – | – | – | 0.649 9
LBP [30] | – | – | – | – | 0.929 8
CFD [28] | – | – | – | – | 0.941 9
MesNet | 0.939 6 | 0.947 8 | 0.943 7 | 0.914 6 | 0.955 6
Table 4  Performance comparison of MesNet with existing methods on the CASME II database
Method | Precision | Recall | F-Measure | Accuracy | AUC
HOOF [30] | – | – | – | – | 0.694 1
LBP [30] | – | – | – | – | 0.833 2
Riesz Pyramid [31] | – | – | – | – | 0.898 0
CFD [28] | – | – | – | – | 0.970 6
MesNet | 0.969 0 | 0.988 5 | 0.978 7 | 0.959 6 | 0.933 8
Table 5  Performance comparison of MesNet with existing methods on the SMIC-E-HS database
Method | Precision | Recall | F-Measure | Accuracy | AUC
MDMD [32] | 0.352 1 | 0.319 0 | 0.334 8 | 0.735 5 | 0.654 8
LBP [18] | – | – | – | – | 0.663 9
MesNet | 0.960 2 | 0.996 2 | 0.977 9 | 0.957 0 | 0.785 3
Table 6  Performance comparison of MesNet with existing methods on the CAS(ME)2 database
Video | Database | Duration/s | Total frames | Precision | Recall | F-Measure | Accuracy | AUC
14_EP09_03 | CASME II | 0.26 | 51 | 0.837 2 | 1.000 0 | 0.911 4 | 0.862 7 | 0.983 2
31_0401girlcrashing | CAS(ME)2 | 123.7 | 3 712 | 0.978 4 | 1.000 0 | 0.989 1 | 0.978 4 | 0.789 3
Table 7  Performance comparison between a long video and a short video
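The precision, recall, F-measure and accuracy values in Tables 4-7 are derived from the binary-classification confusion matrix (Fig. 9), and AUC is the area under the ROC curve (Figs. 10-11). As a quick reference for how these standard metrics relate to per-frame predictions, here is a small scikit-learn sketch; the label and score arrays are hypothetical.

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# Hypothetical per-frame ground truth (1 = micro-expression frame, 0 = non-expression frame)
y_true  = [1, 1, 0, 0, 1, 0, 0, 1]
# Hypothetical classifier scores (predicted probability of the micro-expression class)
y_score = [0.92, 0.81, 0.35, 0.10, 0.47, 0.66, 0.08, 0.73]
y_pred  = [1 if s >= 0.5 else 0 for s in y_score]      # threshold at 0.5

print("Precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))      # TP / (TP + FN)
print("F-Measure:", f1_score(y_true, y_pred))          # harmonic mean of precision and recall
print("Accuracy: ", accuracy_score(y_true, y_pred))    # (TP + TN) / all frames
print("AUC:      ", roc_auc_score(y_true, y_score))    # area under the ROC curve
```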
1 HAPPY S L, ROUTRAY A Fuzzy histogram of optical flow orientations for micro-expression recognition[J]. IEEE Transactions on Affective Computing, 2017, 10 (3): 394- 406
2 KHOR H Q, SEE J, PHAN R C W, et al. Enriched long-term recurrent convolutional network for facial micro-expression recognition [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 667-674.
3 BEN Xian-ye, YANG Ming-qiang, ZHANG Peng, et al Survey on automatic micro expression recognition methods[J]. Journal of Computer-Aided Design and Computer Graphics, 2014, 26 (9): 1385-1395 (in Chinese)
4 EKMAN P, FRIESEN W V Nonverbal leakage and clues to deception[J]. Psychiatry, 1969, 32 (1): 88- 106
doi: 10.1080/00332747.1969.11023575
5 EKMAN P. The philosophy of deception [M]. Oxford: Oxford University Press, 2009: 118-133.
6 PORTER S, TEN BRINKE L Reading between the lies: identifying concealed and falsified emotions in universal facial expressions[J]. Psychological Science, 2008, 19 (5): 508-514
doi: 10.1111/j.1467-9280.2008.02116.x
7 BERNSTEIN D M, LOFTUS E F How to tell if a particular memory is true or false[J]. Perspectives on Psychological Science, 2009, 4 (4): 370- 374
doi: 10.1111/j.1745-6924.2009.01140.x
8 RUSSELL T A, CHU E, PHILLIPS M L A pilot study to investigate the effectiveness of emotion recognition remediation in schizophrenia using the micro-expression training tool[J]. British Journal of Clinical Psychology, 2006, 45 (4): 579- 583
doi: 10.1348/014466505X90866
9 SALTER F, GRAMMER K, RIKOWSKI A Sex differences in negotiating with powerful males[J]. Human Nature, 2005, 16 (3): 306- 321
doi: 10.1007/s12110-005-1013-4
10 PENG M, WU Z, ZHANG Z, et al. From macro to micro expression recognition: deep learning on small datasets using transfer learning [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 657-661.
11 FU Xiao-feng, WU Jun, NIU Li Classification of small spontaneous expression database based on deep transfer learning network[J]. Journal of Image and Graphics, 2019, 24 (5): 753-761 (in Chinese)
12 LIONG S T, SEE J, PHAN R C W, et al Spontaneous subtle expression detection and recognition based on facial strain[J]. Signal Processing: Image Communication, 2016, 47: 170- 182
doi: 10.1016/j.image.2016.06.004
13 LI X, YU J, ZHAN S. Spontaneous facial micro-expression detection based on deep learning [C] // 2016 IEEE 13th International Conference on Signal Processing. Chengdu: IEEE, 2016: 1130-1134.
14 DIANA B, RADU D, RAZVAN I, et al High-speed video system for micro-expression detection and recognition[J]. Sensors, 2017, 17 (12): 2913- 2931
doi: 10.3390/s17122913
15 ZHANG Z, CHEN T, MENG H, et al SMEConvNet: a convolutional neural network for spotting spontaneous facial micro-expression from long videos[J]. IEEE Access, 2018, 6: 71143- 71151
doi: 10.1109/ACCESS.2018.2879485
16 YAN W J, LI X, WANG S J, et al CASME II: an improved spontaneous micro-expression database and the baseline evaluation[J]. PLOS ONE, 2014, 9 (1): 1- 8
17 LI X, PFISTER T, HUANG X, et al. A spontaneous micro-expression database: inducement, collection and baseline [C] // 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition. Shanghai: IEEE, 2013: 1-6.
18 QU F, WANG S J, YAN W J, et al CAS(ME)2: a database for spontaneous macro-expression and micro-expression spotting and recognition [J]. IEEE Transactions on Affective Computing, 2018, 9 (4): 424- 436
doi: 10.1109/TAFFC.2017.2654440
19 SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1-9.
20 HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C] // 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
21 SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818-2826.
22 SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning [C] // AAAI Conference on Artificial Intelligence. San Francisco: AAAI, 2017: 4-12.
23 DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database [C] // 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Miami: IEEE, 2009: 248-255.
24 PAN S J, YANG Q A survey on transfer learning[J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22 (10): 1345- 1359
doi: 10.1109/TKDE.2009.191
25 GLOROT X, BENGIO Y Understanding the difficulty of training deep feedforward neural networks[J]. Journal of Machine Learning Research, 2010, 9: 249- 256
26 EKMAN P, FRIESEN W V. Facial action coding system: a technique for the measurement of facial movement [M]. Palo Alto: Consulting Psychologists Press, 1978.
27 FAWCETT T An introduction to ROC analysis[J]. Pattern Recognition Letters, 2006, 27 (8): 861- 874
doi: 10.1016/j.patrec.2005.10.010
28 HAN Y, LI B J, LAI Y K, et al. CFD: A collaborative feature difference method for spontaneous micro-expression spotting [C] // 2018 25th IEEE International Conference on Image Processing. Athens: IEEE, 2018: 1942-1946.
29 DAVISON A K, LANSLEY C, NG C C, et al. Objective micro-facial movement detection using facs-based regions and baseline evaluation [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 642-649.
30 LI X, HONG X, MOILANEN A, et al Towards reading hidden emotions: a comparative study of spontaneous micro-expression spotting and recognition methods[J]. IEEE Transactions on Affective Computing, 2018, 9 (4): 563- 577
doi: 10.1109/TAFFC.2017.2667642
31 DUQUE C A, ALATA O, EMONET R, et al. Micro-expression spotting using the Riesz pyramid [C] // 2018 IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE, 2018: 66-74.