The principles and defects of full convolutional network (FCN), which was widely utilized in facial landmark localization, were studied to improve the facial landmark localization accuracy. Discuss the side effects introduced by the kernel function in the feature of FCN, that the evaluation criteria were inconsistent during training and testing. Firstly, theoretically analyze the possibility and the universality of this problem, and then design experiments to verify the existence of this problem in actual situation. To solve this problem, a hourglass network structure was proposed for facial landmark localization combining residual features; the cascaded hourglass network structure was given. The experimental results show that the two-stage cascade structure can obtain comparable accuracy compared with the four-stage stack structure, which means that the model parameter quantity and time complexity will be reduced greatly. The average normalization error of the proposed method on the difficult subset of the 300-W database was 6.84%, which is better than the previous best result.
XU Ai-dong, HUANG Wen-qi, MING Zhe, CHEN Wei-liang, HU Roland, YANG Hang. Facial landmark localization based on cascaded hourglass network with residual features. Journal of Zhejiang University(Engineering Science)[J], 2019, 53(12): 2365-2371 doi:10.3785/j.issn.1008-973X.2019.12.014
HASSNER T, HAREL S, PAZ E, et al. Effective face frontalization in unconstrained images [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 4295-4304.
XIONG X, TORRE F D L. Supervised descent method and its applications to face alignment [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Portland: IEEE, 2013: 532-539.
RAMANAN D. Face detection, pose estimation, and landmark localization in the wild [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Providence: IEEE, 2012: 2879-2886.
REN S, CAO X, WEI Y, et al. Face alignment at 3000 FPS via regressing local binary features [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Columbus: IEEE, 2014: 1685-1692.
HONARI S, YOSINSKI J, VINCENT P, et al. Recombinator networks: Learning Coarse-to-fine feature aggregation [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. LAS VEGAS: IEEE, 2016: 5743-5752.
XIAO S, FENG J, XING J, et al. Robust facial landmark detection via recurrent attentive-refinement networks [C] // European Conference on Computer Vision. Amsterdam: ECCV, 2016: 57-72.
BULAT A, TZIMIROPOULOS G. Two-stage convolutional part heatmap regression for the 1st 3D face alignment in the wild (3DFAW) challenge [C] // European Conference on Computer Vision. Amsterdam: ECCV, 2016: 616-624.
YANG J, LIU Q, ZHANG K. Stacked hourglass network for robust facial landmark localisation [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu: IEEE, 2017: 2025-2033.
LONG J, SHELHAMER E, DARRELL T. Fully convolutional networks for semantic segmentation [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 3431-3440.
SAGONAS C, TZIMIROPOULOS G, ZAFEIRIOU S, et al. 300 faces in-the-wild challenge: the first facial landmark localization challenge [C] // Proceedings of the IEEE International Conference on Computer Vision Workshops. Sydney: ICCV, 2013: 397-403.
ZAFEIRIOU S, TRIGEORGIS G, CHRYSOS G, et al. The menpo facial landmark localisation challenge: a step towards the solution [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu: IEEE, 2017: 2116-2125.
ZHU S, LI C, CHEN C L, et al. Face alignment by coarse-to-fine shape searching [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 4998-5006.
KOWALSKI M, NARUNIEC J, TRZCINSKI T. Deep alignment network: a convolutional neural network for robust face alignment [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu: IEEE, 2017: 2034-2043.
LV J, SHAO X, XING J, et al. A deep regression architecture with two-stage re-initialization for high performance facial landmark detection [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 3691-3700.
AMIR Z, TADAS B, LOUISPHILIPPE M. Convolutional experts constrained local model for facial landmark detection [C] // Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu: IEEE, 2017, pp. 2051-2059.
ZHENHUA F, JOSEF K, MUHAMMAD A, et. al. Face detection, bounding box aggregation and pose estimation for robust facial landmark localisation in the wild [C] // Proceedings of the International Conference on Computer Vision and Pattern Recognition Workshop. Honolulu: IEEE, 2017: 2106-2115.