To address the problem of tracking large head movements, a registration algorithm based on scale-invariant feature transform (SIFT) local descriptors was combined with a multiscale view-based appearance model. By matching salient SIFT features between two intensity images, the registration algorithm can estimate the pose change between two frames even when the scale of the head also changes. The multiscale view-based appearance model was employed to reduce drift accumulation when tracking over a large range. The model selected key frames online as the head underwent different motions, and the tracker bounded the drift of the current frame by using a multiple-registration approach. Experimental results show that the method is not only accurate (4° RMS error) but also robust to head movement of about 1 m along the Z axis and to the subject returning to the camera's field of view after abruptly leaving it.
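The feature-matching step of the registration can be illustrated with a minimal sketch. It assumes the SIFT descriptors of both frames have already been extracted as plain lists of floats, and it applies the nearest-neighbor distance-ratio test that Lowe proposed for SIFT matching. The abstract does not state which matching criterion or threshold the authors use, so the function name `match_descriptors` and the value `ratio=0.8` are hypothetical choices for illustration only. Estimating the pose change from the resulting correspondences is outside the scope of the sketch.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_descriptors(desc_a, desc_b, ratio=0.8):
    """Match descriptors of frame A to frame B with a distance-ratio test.

    Returns a list of (index_in_a, index_in_b) pairs. A match is kept only
    when the nearest neighbor is clearly closer than the second nearest,
    which discards ambiguous correspondences. The threshold `ratio` is an
    assumed value, not one taken from the paper.
    """
    matches = []
    if len(desc_b) < 2:
        # The ratio test needs at least two candidates in frame B.
        return matches
    for i, d in enumerate(desc_a):
        # Sort candidates in frame B by distance to descriptor d.
        dists = sorted((euclidean(d, e), j) for j, e in enumerate(desc_b))
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches
```

Called on two small descriptor sets, the function returns index pairs only for descriptors with one clearly closest counterpart; a descriptor that is about equally close to two candidates is dropped rather than matched.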