To spot facial micro-expressions in videos more accurately despite the small sample sizes of micro-expression databases, a deep convolutional neural network was applied through transfer learning. A pre-trained deep convolutional neural network model was selected, its convolutional layers and pre-trained parameters were retained, and a fully connected layer and a classifier were appended after these layers to construct a deep binary-classification micro-expression spotting network (MesNet). The concept of the transition frame and an adaptive transition-frame recognition algorithm were proposed to remove the noisy labels in micro-expression databases that disturb network training. Experimental results show that the AUC values of MesNet on CASME II, SMIC-E-HS and CAS(ME)2 reach 0.955 6, 0.933 8 and 0.785 3, respectively. Among the three databases, MesNet achieves state-of-the-art results both on CASME II, a short-video database, and on CAS(ME)2, a long-video database, which shows that MesNet combines high accuracy with a wide application range. Comparison experiments on the transition frame show that removing transition frames from the original videos when constructing the training set effectively improves the micro-expression spotting performance of MesNet.
Xiao-feng FU, Li NIU, Zhuo-qun HU, Jian-jun LI, Qing WU. Deep micro-expression spotting network training based on concept of transition frame. Journal of Zhejiang University (Engineering Science), 2020, 54(11): 2128-2137.
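The transfer-learning construction described in the abstract, i.e. frozen pre-trained convolutional layers plus a newly added fully connected layer and binary classifier, can be sketched in a framework-free way. Everything below is illustrative: the random-projection "backbone" merely stands in for the real pre-trained convolutional layers, and the toy frames, feature dimension and hyperparameters are assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_backbone(frames):
    """Stand-in for the frozen pre-trained convolutional layers.

    In MesNet this would be e.g. ImageNet-pretrained Inception activations
    kept fixed during training; here we just flatten each frame and apply
    a fixed random projection that is never updated.
    """
    flat = frames.reshape(len(frames), -1)
    proj = np.random.default_rng(42).standard_normal((flat.shape[1], 64))
    return flat @ proj / np.sqrt(flat.shape[1])

def train_head(features, labels, lr=0.1, epochs=200):
    """Train only the newly added fully connected head (logistic regression)."""
    w, b = np.zeros(features.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))  # sigmoid output
        grad = p - labels                              # cross-entropy gradient
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

# Toy "frames": two well-separated intensity classes stand in for
# micro-expression vs. non-micro-expression frames.
offsets = rng.integers(0, 2, 200)
frames = rng.standard_normal((200, 8, 8)) + 3.0 * offsets[:, None, None]
labels = offsets.astype(float)

feats = frozen_backbone(frames)    # backbone stays frozen
w, b = train_head(feats, labels)   # only the head is trained
accuracy = ((feats @ w + b > 0).astype(float) == labels).mean()
```

The same pattern is what MesNet scales up: fixed pre-trained convolutional features feed a small trainable binary head, which is what makes training feasible on small micro-expression databases.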
Fig. 3 Adaptive transition-frame recognition algorithm
Fig. 4 Probability distribution of samples awaiting transition-frame removal
Fig. 5 Facial landmark detection
Fig. 6 Face alignment coordinate setting
Fig. 7 Micro-expression region cropping
Fig. 8 Image preprocessing examples of CASME II
Fig. 9 Binary classification confusion matrix
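The binary confusion matrix of Fig. 9 (TP, FP, FN, TN) determines the Precision, Recall, F-Measure and Accuracy values reported in Tab. 4-7. The standard formulas for a binary spotting task can be written as a minimal reference implementation (the example counts are made up for illustration):

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                            # true positive rate
    f_measure = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f_measure, accuracy

# Example: 90 true positives, 10 false positives, 5 false negatives, 95 true negatives.
p, r, f, a = binary_metrics(tp=90, fp=10, fn=5, tn=95)
# p = 0.9, r = 90/95, f = 12/13, a = 0.925
```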
Model | Input image size | Training iterations | Model size/MB | L | AUC
MesNet-VGG-19 | 224×224×3 | 767 | 1 096 | 0.029 0 | 0.837 7
MesNet-Inception V3 | 299×299×3 | 1 239 | 115 | 0.028 9 | 0.920 0
MesNet-Inception V4 | 299×299×3 | 1 558 | 184 | 0.000 1 | 0.887 6
MesNet-Res V2-50 | 224×224×3 | 124 | 109 | 0.008 5 | 0.787 1
MesNet-Res V2-101 | 224×224×3 | 319 | 182 | 0.001 5 | 0.844 8
MesNet-Res V2-152 | 224×224×3 | 231 | 242 | 0.006 9 | 0.844 8
MesNet-Inception-Res-V2 | 299×299×3 | 5 362 | 235 | 0.011 5 | 0.952 6
Tab. 1 Training details and AUC values of various MesNet models on CASME II
Fig. 10 ROC curves of 5 MesNet models on CASME II
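AUC, the headline metric of the tables above and the area under ROC curves such as those in Fig. 10, can be computed without plotting the curve at all: it equals the probability that a randomly chosen positive (micro-expression) frame receives a higher score than a randomly chosen negative frame, with ties counted as one half. A compact O(n²) reference implementation:

```python
def auc(scores, labels):
    """AUC as the Mann-Whitney rank statistic over all positive/negative pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfectly separated scores give AUC = 1.0; random scoring hovers around 0.5.
perfect = auc([0.9, 0.8, 0.4, 0.2], [1, 1, 0, 0])  # → 1.0
```

For long videos, where negatives vastly outnumber positives, AUC is a natural choice precisely because it is insensitive to this class imbalance.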
Method (transition frames removed) | AUC (CASME II) | AUC (SMIC-E-HS) | AUC (CAS(ME)2)
0% | 0.938 6 | 0.906 2 | 0.752 8
10% | 0.952 6 | 0.918 0 | 0.763 9
20% | 0.944 9 | 0.901 7 | 0.742 1
30% | 0.916 2 | 0.888 3 | 0.736 1
Adaptive | 0.955 6 | 0.933 8 | 0.785 3
Tab. 2 AUC values of MesNet using different methods of removing transition frames
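Tab. 2 compares fixed-percentage removal (dropping 0%-30% of frames) against the adaptive algorithm. The sketch below is only a hypothetical illustration of the difference in spirit, not the actual algorithm of Fig. 3: the fixed strategy discards a preset fraction of frames at each end of a labeled interval, while an adaptive rule can instead discard exactly those frames whose first-pass prediction probability is ambiguous (cf. the distribution of Fig. 4). The 0.3/0.7 band and the toy probabilities are assumptions made up for this example.

```python
def remove_fixed(frame_ids, ratio):
    """Drop a preset fraction of frames at each end of a labeled interval."""
    k = int(len(frame_ids) * ratio)
    return list(frame_ids[k:len(frame_ids) - k]) if k else list(frame_ids)

def remove_adaptive(frame_ids, probs, low=0.3, high=0.7):
    """Drop frames whose predicted probability falls in an ambiguous band.

    `probs` would come from a first-pass classifier; the band limits here
    are illustrative, not the thresholds of the paper's algorithm.
    """
    return [f for f, p in zip(frame_ids, probs) if not (low < p < high)]

frames = list(range(10))
kept_fixed = remove_fixed(frames, 0.2)     # drops 2 frames at each end
probs = [0.9, 0.8, 0.5, 0.6, 0.9, 0.9, 0.4, 0.8, 0.9, 0.2]
kept_adaptive = remove_adaptive(frames, probs)
```

The fixed strategy can discard confidently labeled frames (hurting the 20%-30% rows of Tab. 2) or keep ambiguous ones (the 0% row); an adaptive criterion removes only what is actually uncertain, consistent with the adaptive row performing best on all three databases.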
Fig. 11 ROC curves of different transition-frame removal methods on SMIC-E-HS and CAS(ME)2
Preprocessing stage | AUC (CASME II) | AUC (SMIC-E-HS) | AUC (CAS(ME)2)
Fig. 8(a) | 0.613 4 | 0.574 5 | 0.532 8
Fig. 8(b) | 0.770 5 | 0.721 1 | 0.603 1
Fig. 8(c) | 0.938 6 | 0.906 2 | 0.752 8
Tab. 3 Comparison of AUC values at different preprocessing stages
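Tab. 3 shows how much each preprocessing stage of Fig. 8 contributes. Under simplifying assumptions, the geometric part of that pipeline, i.e. rotating the face so the detected eye landmarks are horizontal (Fig. 5-6) and cropping the facial region (Fig. 7), can be sketched as below. Landmark detection itself (typically a dlib/OpenCV detector) is omitted, and all coordinates and margins here are illustrative.

```python
import numpy as np

def alignment_angle(left_eye, right_eye):
    """Rotation angle (radians) that makes the eye line horizontal."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return np.arctan2(dy, dx)

def rotate_points(points, center, angle):
    """Rotate landmark coordinates by -angle around `center` (alignment)."""
    c, s = np.cos(-angle), np.sin(-angle)
    rot = np.array([[c, -s], [s, c]])
    return (np.asarray(points, dtype=float) - center) @ rot.T + center

def crop_box(landmarks, margin=10):
    """Axis-aligned crop around the aligned landmarks, padded by `margin`."""
    pts = np.asarray(landmarks, dtype=float)
    x0, y0 = pts.min(axis=0) - margin
    x1, y1 = pts.max(axis=0) + margin
    return int(x0), int(y0), int(x1), int(y1)

# Hypothetical eye-landmark centers from a tilted face.
left, right = np.array([40.0, 60.0]), np.array([100.0, 40.0])
angle = alignment_angle(left, right)
center = (left + right) / 2
aligned = rotate_points([left, right], center, angle)
# After alignment both eyes share (almost exactly) the same y coordinate.
```

The large AUC gap between Fig. 8(a) and Fig. 8(c) in Tab. 3 is consistent with this kind of normalization: alignment and cropping remove pose and background variation that would otherwise swamp the tiny motion of a micro-expression.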
Method | Precision | Recall | F-Measure | Accuracy | AUC
3D HOG-XT[29] | 0.534 1 | 0.623 5 | 0.575 4 | 0.735 5 | 0.726 1
Frame differences[14] | — | — | — | 0.817 5 | —
HOOF[30] | — | — | — | — | 0.649 9
LBP[30] | — | — | — | — | 0.929 8
CFD[28] | — | — | — | — | 0.941 9
MesNet | 0.939 6 | 0.947 8 | 0.943 7 | 0.914 6 | 0.955 6
Tab. 4 Performance comparison between MesNet and existing methods on CASME II
Method | Precision | Recall | F-Measure | Accuracy | AUC
HOOF[30] | — | — | — | — | 0.694 1
LBP[30] | — | — | — | — | 0.833 2
Riesz Pyramid[31] | — | — | — | — | 0.898 0
CFD[28] | — | — | — | — | 0.970 6
MesNet | 0.969 0 | 0.988 5 | 0.978 7 | 0.959 6 | 0.933 8
Tab. 5 Performance comparison between MesNet and existing methods on SMIC-E-HS
Method | Precision | Recall | F-Measure | Accuracy | AUC
MDMD[32] | 0.352 1 | 0.319 0 | 0.334 8 | 0.735 5 | 0.654 8
LBP[18] | — | — | — | — | 0.663 9
MesNet | 0.960 2 | 0.996 2 | 0.977 9 | 0.957 0 | 0.785 3
Tab. 6 Performance comparison between MesNet and existing methods on CAS(ME)2
Video | Database | Duration/s | Total frames | Precision | Recall | F-Measure | Accuracy | AUC
14_EP09_03 | CASME II | 0.26 | 51 | 0.837 2 | 1.000 0 | 0.911 4 | 0.862 7 | 0.983 2
31_0401girlcrashing | CAS(ME)2 | 123.7 | 3 712 | 0.978 4 | 1.000 0 | 0.989 1 | 0.978 4 | 0.789 3
Tab. 7 Performance comparison between a long video and a short video
[1] HAPPY S L, ROUTRAY A. Fuzzy histogram of optical flow orientations for micro-expression recognition [J]. IEEE Transactions on Affective Computing, 2017, 10(3): 394-406.
[2] KHOR H Q, SEE J, PHAN R C W, et al. Enriched long-term recurrent convolutional network for facial micro-expression recognition [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 667-674.
[3] BEN Xian-ye, YANG Ming-qiang, ZHANG Peng, et al. Survey on automatic micro expression recognition methods [J]. Journal of Computer-Aided Design and Computer Graphics, 2014, 26(9): 1385-1395. (in Chinese)
[4] EKMAN P, FRIESEN W V. Nonverbal leakage and clues to deception [J]. Psychiatry, 1969, 32(1): 88-106. doi: 10.1080/00332747.1969.11023575
[5] EKMAN P. The philosophy of deception [M]. Oxford: Oxford University Press, 2009: 118-133.
[6] PORTER S, TEN BRINKE L. Reading between the lies: identifying concealed and falsified emotions in universal facial expressions [J]. Psychological Science, 2008, 19(5): 508-514. doi: 10.1111/j.1467-9280.2008.02116.x
[7] BERNSTEIN D M, LOFTUS E F. How to tell if a particular memory is true or false [J]. Perspectives on Psychological Science, 2009, 4(4): 370-374. doi: 10.1111/j.1745-6924.2009.01140.x
[8] RUSSELL T A, CHU E, PHILLIPS M L. A pilot study to investigate the effectiveness of emotion recognition remediation in schizophrenia using the micro-expression training tool [J]. British Journal of Clinical Psychology, 2006, 45(4): 579-583. doi: 10.1348/014466505X90866
[9] SALTER F, GRAMMER K, RIKOWSKI A. Sex differences in negotiating with powerful males [J]. Human Nature, 2005, 16(3): 306-321. doi: 10.1007/s12110-005-1013-4
[10] PENG M, WU Z, ZHANG Z, et al. From macro to micro expression recognition: deep learning on small datasets using transfer learning [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 657-661.
[11] FU Xiao-feng, WU Jun, NIU Li. Classification of small spontaneous expression database based on deep transfer learning network [J]. Journal of Image and Graphics, 2019, 24(5): 753-761. (in Chinese)
[12] LIONG S T, SEE J, PHAN R C W, et al. Spontaneous subtle expression detection and recognition based on facial strain [J]. Signal Processing: Image Communication, 2016, 47: 170-182. doi: 10.1016/j.image.2016.06.004
[13] LI X, YU J, ZHAN S. Spontaneous facial micro-expression detection based on deep learning [C] // 2016 IEEE 13th International Conference on Signal Processing. Chengdu: IEEE, 2016: 1130-1134.
[14] DIANA B, RADU D, RAZVAN I, et al. High-speed video system for micro-expression detection and recognition [J]. Sensors, 2017, 17(12): 2913-2931. doi: 10.3390/s17122913
[15] ZHANG Z, CHEN T, MENG H, et al. SMEConvNet: a convolutional neural network for spotting spontaneous facial micro-expression from long videos [J]. IEEE Access, 2018, 6: 71143-71151. doi: 10.1109/ACCESS.2018.2879485
[16] YAN W J, LI X, WANG S J, et al. CASME II: an improved spontaneous micro-expression database and the baseline evaluation [J]. PLOS ONE, 2014, 9(1): 1-8.
[17] LI X, PFISTER T, HUANG X, et al. A spontaneous micro-expression database: inducement, collection and baseline [C] // 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition. Shanghai: IEEE, 2013: 1-6.
[18] QU F, WANG S J, YAN W J, et al. CAS(ME)2: a database for spontaneous macro-expression and micro-expression spotting and recognition [J]. IEEE Transactions on Affective Computing, 2018, 9(4): 424-436. doi: 10.1109/TAFFC.2017.2654440
[19] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1-9.
[20] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C] // 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[21] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818-2826.
[22] SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning [C] // AAAI Conference on Artificial Intelligence. San Francisco: AAAI, 2017: 4-12.
[23] DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database [C] // 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Miami: IEEE, 2009: 248-255.
[24] PAN S J, YANG Q. A survey on transfer learning [J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345-1359. doi: 10.1109/TKDE.2009.191
[25] GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks [J]. Journal of Machine Learning Research, 2010, 9: 249-256.
[26] EKMAN P, FRIESEN W V. Facial action coding system: a technique for the measurement of facial movement [M]. Palo Alto: Consulting Psychologists Press, 1978.
[27] FAWCETT T. An introduction to ROC analysis [J]. Pattern Recognition Letters, 2006, 27(8): 861-874. doi: 10.1016/j.patrec.2005.10.010
[28] HAN Y, LI B J, LAI Y K, et al. CFD: a collaborative feature difference method for spontaneous micro-expression spotting [C] // 2018 25th IEEE International Conference on Image Processing. Athens: IEEE, 2018: 1942-1946.
[29] DAVISON A K, LANSLEY C, NG C C, et al. Objective micro-facial movement detection using FACS-based regions and baseline evaluation [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 642-649.
[30] LI X, HONG X, MOILANEN A, et al. Towards reading hidden emotions: a comparative study of spontaneous micro-expression spotting and recognition methods [J]. IEEE Transactions on Affective Computing, 2018, 9(4): 563-577. doi: 10.1109/TAFFC.2017.2667642
[31] DUQUE C A, ALATA O, EMONET R, et al. Micro-expression spotting using the Riesz pyramid [C] // 2018 IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE, 2018: 66-74.