Journal of ZheJiang University (Engineering Science)  2020, Vol. 54 Issue (11): 2128-2137    DOI: 10.3785/j.issn.1008-973X.2020.11.008
    
Deep micro-expression spotting network training based on concept of transition frame
Xiao-feng FU, Li NIU, Zhuo-qun HU, Jian-jun LI, Qing WU
School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China

Abstract  

To spot facial micro-expressions from videos more accurately despite the small sample sizes of micro-expression databases, a deep convolutional neural network was applied through transfer learning. A pre-trained deep convolutional neural network model was selected, its convolutional layers and pre-trained parameters were retained, and a fully connected layer and a classifier were appended to construct a deep binary-classification micro-expression spotting network (MesNet). To remove the noisy labels in micro-expression databases that disturb network training, the concept of the transition frame and an adaptive algorithm for recognizing transition frames were proposed. Experimental results show that the AUC values of MesNet reach 0.955 6, 0.933 8 and 0.785 3 on CASME II, SMIC-E-HS and CAS(ME)2, respectively. Among the three databases, MesNet achieves state-of-the-art results on both CASME II, a short-video database, and CAS(ME)2, a long-video database, indicating that MesNet combines high accuracy with a wide range of application. Comparison experiments on transition frames show that removing transition frames from the original videos when constructing the training set effectively improves the micro-expression spotting performance of MesNet.
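For readers who want a concrete picture of the transfer-learning construction described above, the sketch below shows one possible way to retain a pretrained convolutional backbone and append a fully connected layer with a binary classifier. It is a minimal Keras illustration, not the authors' implementation: the choice of Inception V3, the frozen backbone, the 256-unit layer, the dropout rate and the optimizer settings are all assumptions.

```python
# Minimal sketch (not the authors' implementation): assemble a binary
# micro-expression spotting network from a pretrained backbone.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import InceptionV3

def build_mesnet(input_shape=(299, 299, 3)):
    # Pretrained convolutional layers with ImageNet weights, top removed.
    backbone = InceptionV3(weights="imagenet", include_top=False,
                           input_shape=input_shape, pooling="avg")
    backbone.trainable = False          # assumption: keep pretrained weights fixed

    # Newly added fully connected layer and binary classifier.
    model = models.Sequential([
        backbone,
        layers.Dense(256, activation="relu"),    # assumed layer width
        layers.Dropout(0.5),                     # assumed regularization
        layers.Dense(1, activation="sigmoid"),   # frame-level micro-expression probability
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                  loss="binary_crossentropy",
                  metrics=[tf.keras.metrics.AUC(name="auc")])
    return model

model = build_mesnet()
model.summary()
```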



Key words: micro-expression spotting; transfer learning; deep convolutional neural network; binary classification; transition frame
Received: 19 October 2019      Published: 15 December 2020
CLC:  TP 301.6  
Cite this article:

Xiao-feng FU, Li NIU, Zhuo-qun HU, Jian-jun LI, Qing WU. Deep micro-expression spotting network training based on concept of transition frame. Journal of ZheJiang University (Engineering Science), 2020, 54(11): 2128-2137.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2020.11.008     OR     http://www.zjujournals.com/eng/Y2020/V54/I11/2128


Fig.1 Training flow chart of MesNet
Fig.2 Example of CASME II video clips
Fig.3 Adaptive transition frame recognition algorithm
Fig.4 Probability distribution of samples from which transition frames are to be removed
Fig.5 Facial landmark detection
Fig.6 Coordinate setting for face alignment
Fig.7 Micro-expression region cropping
Fig.8 Image preprocessing examples of CASME II
Fig.9 Binary classification confusion matrix
Model | Input image size | Training iterations | Model size/MB | L | AUC
MesNet-VGG-19 | 224×224×3 | 767 | 1 096 | 0.029 0 | 0.837 7
MesNet-Inception V3 | 299×299×3 | 1 239 | 115 | 0.028 9 | 0.920 0
MesNet-Inception V4 | 299×299×3 | 1 558 | 184 | 0.000 1 | 0.887 6
MesNet-Res V2-50 | 224×224×3 | 124 | 109 | 0.008 5 | 0.787 1
MesNet-Res V2-101 | 224×224×3 | 319 | 182 | 0.001 5 | 0.844 8
MesNet-Res V2-152 | 224×224×3 | 231 | 242 | 0.006 9 | 0.844 8
MesNet-Inception-Res-V2 | 299×299×3 | 5 362 | 235 | 0.011 5 | 0.952 6
Tab.1 Training details and AUC values of various MesNet models on CASME II
Fig.10 ROC curves of 5 MesNet models on CASME II
Method | AUC (CASME II) | AUC (SMIC-E-HS) | AUC (CAS(ME)2)
0% | 0.938 6 | 0.906 2 | 0.752 8
10% | 0.952 6 | 0.918 0 | 0.763 9
20% | 0.944 9 | 0.901 7 | 0.742 1
30% | 0.916 2 | 0.888 3 | 0.736 1
Adaptive | 0.955 6 | 0.933 8 | 0.785 3
Tab.2 AUC values of MesNet using different methods of removing transition frames
Fig.11 ROC curves of different methods to remove transition frames on SMIC-E-HS and CAS(ME)2
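To make the contrast in Tab.2 concrete: instead of discarding a fixed share of frames around the labeled onset and offset, an adaptive rule removes only those frames that behave ambiguously. The snippet below is a hypothetical illustration of such a probability-based filter and is not the paper's Fig.3 algorithm; the preliminary frame-level probabilities `probs` and the ambiguity half-width `band` are assumed quantities.

```python
# Hypothetical sketch only: it illustrates the idea of removing ambiguous
# "transition frames" adaptively, not the adaptive algorithm of Fig. 3.
import numpy as np

def adaptive_transition_filter(frames, labels, probs, band=0.25):
    """Drop frames whose preliminary micro-expression probability falls
    inside an ambiguity band around 0.5; such frames lie between neutral
    and expressive states and carry noisy labels.

    frames, labels : per-frame images and 0/1 labels of one clip
    probs          : probabilities from a preliminary model pass (assumed)
    band           : hypothetical half-width of the ambiguity band
    """
    probs = np.asarray(probs, dtype=float)
    keep = np.abs(probs - 0.5) >= band      # keep only confident frames
    kept_frames = [f for f, k in zip(frames, keep) if k]
    kept_labels = np.asarray(labels)[keep]
    return kept_frames, kept_labels

# By contrast, the fixed-percentage baselines in Tab. 2 would simply drop a
# preset share of frames adjacent to the labeled onset and offset.
```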
Preprocessing stage | AUC (CASME II) | AUC (SMIC-E-HS) | AUC (CAS(ME)2)
Fig. 8(a) | 0.613 4 | 0.574 5 | 0.532 8
Fig. 8(b) | 0.770 5 | 0.721 1 | 0.603 1
Fig. 8(c) | 0.938 6 | 0.906 2 | 0.752 8
Tab.3 Comparison of AUC values in different stages of preprocessing
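Tab.3 indicates that the landmark-based alignment and cropping stages (Fig.5–8) contribute substantially to the final AUC. The sketch below outlines this kind of preprocessing with OpenCV and dlib as assumed tools; the paper's exact alignment coordinates (Fig.6) and crop region (Fig.7) are not reproduced, and the 68-point landmark model file is an external dependency.

```python
# Illustrative preprocessing sketch (OpenCV + dlib assumed as tools); the
# paper's alignment coordinates (Fig. 6) and crop region (Fig. 7) are not
# reproduced here, and the 68-point landmark model is an external file.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def align_and_crop(frame_bgr, out_size=299):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray, 1)
    if not faces:
        return None                                    # no face detected
    shape = predictor(gray, faces[0])
    pts = np.array([[p.x, p.y] for p in shape.parts()], dtype=np.float32)

    # Rotate so the line through the outer eye corners becomes horizontal.
    left_eye, right_eye = pts[36], pts[45]
    angle = np.degrees(np.arctan2(right_eye[1] - left_eye[1],
                                  right_eye[0] - left_eye[0]))
    cx, cy = pts.mean(axis=0)
    M = cv2.getRotationMatrix2D((float(cx), float(cy)), angle, 1.0)
    rotated = cv2.warpAffine(frame_bgr, M, frame_bgr.shape[1::-1])

    # Rotate the landmarks with the same matrix and crop around the face.
    ones = np.ones((pts.shape[0], 1), dtype=np.float32)
    pts_rot = (M @ np.hstack([pts, ones]).T).T
    x, y, w, h = cv2.boundingRect(pts_rot.astype(np.int32))
    crop = rotated[max(y, 0):y + h, max(x, 0):x + w]
    return cv2.resize(crop, (out_size, out_size))
```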
Method | Precision | Recall | F-Measure | Accuracy | AUC
3D HOG–XT [29] | 0.534 1 | 0.623 5 | 0.575 4 | 0.735 5 | 0.726 1
Frame differences [14] | – | – | – | 0.817 5 | –
HOOF [30] | – | – | – | – | 0.649 9
LBP [30] | – | – | – | – | 0.929 8
CFD [28] | – | – | – | – | 0.941 9
MesNet | 0.939 6 | 0.947 8 | 0.943 7 | 0.914 6 | 0.955 6
Tab.4 Performance comparison among MesNet and existing methods on CASME II
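For reference, frame-level metrics of the kind reported in Tab.4–6 (Precision, Recall, F-Measure, Accuracy and the threshold-free AUC) can be computed from binary spotting outputs in the usual way. The short scikit-learn example below uses placeholder labels and probabilities and an assumed 0.5 decision threshold; it does not reproduce the paper's evaluation protocol.

```python
# Sketch of frame-level metric computation (scikit-learn assumed); the
# ground-truth labels and predicted probabilities below are placeholders.
import numpy as np
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])            # 1 = micro-expression frame
y_prob = np.array([0.1, 0.4, 0.8, 0.9, 0.6, 0.2, 0.7, 0.3])
y_pred = (y_prob >= 0.5).astype(int)                    # assumed decision threshold

print("Precision:", precision_score(y_true, y_pred))
print("Recall:   ", recall_score(y_true, y_pred))
print("F-Measure:", f1_score(y_true, y_pred))
print("Accuracy: ", accuracy_score(y_true, y_pred))
print("AUC:      ", roc_auc_score(y_true, y_prob))      # threshold-free
```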
Method | Precision | Recall | F-Measure | Accuracy | AUC
HOOF [30] | – | – | – | – | 0.694 1
LBP [30] | – | – | – | – | 0.833 2
Riesz Pyramid [31] | – | – | – | – | 0.898 0
CFD [28] | – | – | – | – | 0.970 6
MesNet | 0.969 0 | 0.988 5 | 0.978 7 | 0.959 6 | 0.933 8
Tab.5 Performance comparison among MesNet and existing methods on SMIC-E-HS
Method | Precision | Recall | F-Measure | Accuracy | AUC
MDMD [32] | 0.352 1 | 0.319 0 | 0.334 8 | 0.735 5 | 0.654 8
LBP [18] | – | – | – | – | 0.663 9
MesNet | 0.960 2 | 0.996 2 | 0.977 9 | 0.957 0 | 0.785 3
Tab.6 Performance comparison among MesNet and existing methods on CAS(ME)2
Video | Database | Duration/s | Total frames | Precision | Recall | F-Measure | Accuracy | AUC
14_EP09_03 | CASME II | 0.26 | 51 | 0.837 2 | 1.000 0 | 0.911 4 | 0.862 7 | 0.983 2
31_0401girlcrashing | CAS(ME)2 | 123.7 | 3 712 | 0.978 4 | 1.000 0 | 0.989 1 | 0.978 4 | 0.789 3
Tab.7 Performance comparison between a long video and a short video
[1] HAPPY S L, ROUTRAY A. Fuzzy histogram of optical flow orientations for micro-expression recognition [J]. IEEE Transactions on Affective Computing, 2017, 10(3): 394-406.
[2] KHOR H Q, SEE J, PHAN R C W, et al. Enriched long-term recurrent convolutional network for facial micro-expression recognition [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 667-674.
[3] BEN Xian-ye, YANG Ming-qiang, ZHANG Peng, et al. Survey on automatic micro expression recognition methods [J]. Journal of Computer-Aided Design and Computer Graphics, 2014, 26(9): 1385-1395. (in Chinese)
[4] EKMAN P, FRIESEN W V. Nonverbal leakage and clues to deception [J]. Psychiatry, 1969, 32(1): 88-106. doi: 10.1080/00332747.1969.11023575
[5] EKMAN P. The philosophy of deception [M]. Oxford: Oxford University Press, 2009: 118-133.
[6] PORTER S, TEN BRINKE L. Reading between the lies: identifying concealed and falsified emotions in universal facial expressions [J]. Psychological Science, 2008, 19(5): 508-514. doi: 10.1111/j.1467-9280.2008.02116.x
[7] BERNSTEIN D M, LOFTUS E F. How to tell if a particular memory is true or false [J]. Perspectives on Psychological Science, 2009, 4(4): 370-374. doi: 10.1111/j.1745-6924.2009.01140.x
[8] RUSSELL T A, CHU E, PHILLIPS M L. A pilot study to investigate the effectiveness of emotion recognition remediation in schizophrenia using the micro-expression training tool [J]. British Journal of Clinical Psychology, 2006, 45(4): 579-583. doi: 10.1348/014466505X90866
[9] SALTER F, GRAMMER K, RIKOWSKI A. Sex differences in negotiating with powerful males [J]. Human Nature, 2005, 16(3): 306-321. doi: 10.1007/s12110-005-1013-4
[10] PENG M, WU Z, ZHANG Z, et al. From macro to micro expression recognition: deep learning on small datasets using transfer learning [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 657-661.
[11] FU Xiao-feng, WU Jun, NIU Li. Classification of small spontaneous expression database based on deep transfer learning network [J]. Journal of Image and Graphics, 2019, 24(5): 753-761. (in Chinese)
[12] LIONG S T, SEE J, PHAN R C W, et al. Spontaneous subtle expression detection and recognition based on facial strain [J]. Signal Processing: Image Communication, 2016, 47: 170-182. doi: 10.1016/j.image.2016.06.004
[13] LI X, YU J, ZHAN S. Spontaneous facial micro-expression detection based on deep learning [C] // 2016 IEEE 13th International Conference on Signal Processing. Chengdu: IEEE, 2016: 1130-1134.
[14] BORZA D, DANESCU R, ITU R, et al. High-speed video system for micro-expression detection and recognition [J]. Sensors, 2017, 17(12): 2913-2931. doi: 10.3390/s17122913
[15] ZHANG Z, CHEN T, MENG H, et al. SMEConvNet: a convolutional neural network for spotting spontaneous facial micro-expression from long videos [J]. IEEE Access, 2018, 6: 71143-71151. doi: 10.1109/ACCESS.2018.2879485
[16] YAN W J, LI X, WANG S J, et al. CASME II: an improved spontaneous micro-expression database and the baseline evaluation [J]. PLOS ONE, 2014, 9(1): 1-8.
[17] LI X, PFISTER T, HUANG X, et al. A spontaneous micro-expression database: inducement, collection and baseline [C] // 2013 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition. Shanghai: IEEE, 2013: 1-6.
[18] QU F, WANG S J, YAN W J, et al. CAS(ME)2: a database for spontaneous macro-expression and micro-expression spotting and recognition [J]. IEEE Transactions on Affective Computing, 2018, 9(4): 424-436. doi: 10.1109/TAFFC.2017.2654440
[19] SZEGEDY C, LIU W, JIA Y, et al. Going deeper with convolutions [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 1-9.
[20] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition [C] // 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 770-778.
[21] SZEGEDY C, VANHOUCKE V, IOFFE S, et al. Rethinking the inception architecture for computer vision [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 2818-2826.
[22] SZEGEDY C, IOFFE S, VANHOUCKE V, et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning [C] // AAAI Conference on Artificial Intelligence. San Francisco: AAAI, 2017: 4-12.
[23] DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database [C] // 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Miami: IEEE, 2009: 248-255.
[24] PAN S J, YANG Q. A survey on transfer learning [J]. IEEE Transactions on Knowledge and Data Engineering, 2010, 22(10): 1345-1359. doi: 10.1109/TKDE.2009.191
[25] GLOROT X, BENGIO Y. Understanding the difficulty of training deep feedforward neural networks [J]. Journal of Machine Learning Research, 2010, 9: 249-256.
[26] EKMAN P, FRIESEN W V. Facial action coding system: a technique for the measurement of facial movement [M]. Palo Alto: Consulting Psychologists Press, 1978.
[27] FAWCETT T. An introduction to ROC analysis [J]. Pattern Recognition Letters, 2006, 27(8): 861-874. doi: 10.1016/j.patrec.2005.10.010
[28] HAN Y, LI B J, LAI Y K, et al. CFD: a collaborative feature difference method for spontaneous micro-expression spotting [C] // 2018 25th IEEE International Conference on Image Processing. Athens: IEEE, 2018: 1942-1946.
[29] DAVISON A K, LANSLEY C, NG C C, et al. Objective micro-facial movement detection using FACS-based regions and baseline evaluation [C] // 2018 13th IEEE International Conference on Automatic Face and Gesture Recognition. Xi'an: IEEE, 2018: 642-649.
[30] LI X, HONG X, MOILANEN A, et al. Towards reading hidden emotions: a comparative study of spontaneous micro-expression spotting and recognition methods [J]. IEEE Transactions on Affective Computing, 2018, 9(4): 563-577. doi: 10.1109/TAFFC.2017.2667642
[31] DUQUE C A, ALATA O, EMONET R, et al. Micro-expression spotting using the Riesz pyramid [C] // 2018 IEEE Winter Conference on Applications of Computer Vision. Lake Tahoe: IEEE, 2018: 66-74.