Current Issue

2013, Issue 7   Published: 2013-07-01
Data-driven digital entertainment: a computational perspective
Yue-ting Zhuang
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 475-476.   https://doi.org/10.1631/jzus.CIDE1300
Today massive collections of data can be obtained across different sources (or domains), e.g., the depth data from Kinect, the geometrical data from scanning devices, the imagery/video data from cameras, and the motion data from mocap devices. Since heterogeneous data may have different discriminative powers and are intrinsically complementary for certain tasks, it is desirable to leverage all the information available in digital entertainment. For example, the acquired 3D geometry and texture are jointly exploited to construct the colored 3D environment models; high-resolution geometry and motion-captured data are obtained to synthesize and re-target facial animations; both visual and acoustical features are contextually applied to classification. The appropriate utilization of these different varieties of heterogeneous data therefore poses a significant challenge in digital entertainment.
A review of behavior mechanisms and crowd evacuation animation in emergency exercises
Gao-qi He, Yu Yang, Zhi-hua Chen, Chun-hua Gu, Zhi-geng Pan
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 477-485.   https://doi.org/10.1631/jzus.CIDE1301
Emergency exercises are an efficient approach for preventing serious damage and harm, including loss of life and property and a wide range of adverse social effects, during various public emergencies. Among various factors affecting the value of emergency exercises, including their design, development, conduct, evaluation, and improvement planning, this paper emphasizes the focal role of evacuees and their behavior. We address two concerns: What are the intrinsic reasons behind human behavior? How do we model and exhibit human behavior? We review studies investigating the mechanisms of psychological behavior and crowd evacuation animation. A comprehensive analysis of logical patterns of behavior and crowd evacuation is presented first. The interactive effects of information (objective and subjective), psychology (panic, small groups, and conflicting roles), and six kinds of behavior contribute to a more effective understanding of an emergency scene and assist in making scientific decisions. Based on these studies, a wide range of perspectives on crowd formation and evacuation animation models is summarized. Collision avoidance is underlined as a special topic. Finally, this paper highlights some of the technical challenges and key questions to be addressed by future developments in this rapidly developing field.
Applications of structure from motion: a survey
Ying-mei Wei, Lai Kang, Bing Yang, Ling-da Wu
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 486-494.   https://doi.org/10.1631/jzus.CIDE1302
Structure from motion (SfM) has been an active research area in computer vision for decades, and numerous practical applications are benefiting from this research. While no previous work has tried to summarize the applications appearing in the literature, this paper presents a comprehensive overview of recent applications of SfM by classifying them into 10 categories, namely augmented reality, autonomous navigation/guidance, motion capture, hand-eye calibration, image/video processing, image-based 3D modeling, remote sensing, image organization/browsing, segmentation and recognition, and military applications. The goal is to provide insights for researchers to position their work more appropriately in the context of existing techniques, and to perceive both new applications and relevant research problems.
A review of object representation based on local features
Jian Cao, Dian-hui Mao, Qiang Cai, Hai-sheng Li, Jun-ping Du
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 495-504.   https://doi.org/10.1631/jzus.CIDE1303
Object representation based on local features is a topical subject in the domain of image understanding and computer vision. We discuss the defects of global features in present methods and the advantages of local features in object recognition, and briefly explore state-of-the-art recognition methods using local features, especially the main approaches of local feature extraction and object representation. To clearly explain these methods, the problem of local feature extraction is divided into feature region detection, feature region description, and feature space optimization. The main components and merits of these steps are presented. Technologies for object representation are classified into three types: vector space, sliding window, and structure relationship models. Future development trends are discussed briefly.
High-dimensional indexing technologies for large scale content-based image retrieval: a review
Lie-fu Ai, Jun-qing Yu, Yun-feng He, Tao Guan
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 505-520.   https://doi.org/10.1631/jzus.CIDE1304
The boom in Internet and multimedia technology has led to an explosion of multimedia information, especially images, creating an urgent need to quickly retrieve similar images and images of interest from huge image collections. The content-based high-dimensional indexing mechanism holds the key to achieving this goal by efficiently organizing the content of images and storing them in computer memory. In the past decades, many important developments in high-dimensional image indexing technologies have occurred to cope with the ‘curse of dimensionality’. High-dimensional indexing mechanisms can mainly be divided into three categories: tree-based indexes, hashing-based indexes, and visual-words-based inverted indexes. In this paper we review the technologies with respect to these three categories of mechanisms, and make several recommendations for future research.
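As a rough illustration of the third category named in this abstract, the sketch below builds a tiny inverted index over visual words. It is not taken from the paper: the class name `VisualWordIndex`, the integer word ids, and the histogram-intersection score are all assumptions chosen for brevity, and real systems would quantize local descriptors against a learned codebook first.

```python
from collections import Counter, defaultdict


class VisualWordIndex:
    """Hypothetical inverted index: visual word id -> {image id: occurrence count}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, image_id, words):
        # Store how often each visual word occurs in the image.
        for word, count in Counter(words).items():
            self.postings[word][image_id] = count

    def query(self, words, top_k=5):
        # Only images sharing at least one word with the query are touched.
        scores = Counter()
        for word, q_count in Counter(words).items():
            for image_id, count in self.postings.get(word, {}).items():
                # Assumed score: histogram intersection of word counts.
                scores[image_id] += min(q_count, count)
        return scores.most_common(top_k)


index = VisualWordIndex()
index.add("img_a", [3, 3, 7, 9])
index.add("img_b", [7, 8, 8])
index.add("img_c", [1, 2])
results = index.query([3, 7, 7])
print(results)
```

The point of the structure is that a query only visits the posting lists of its own words, so images with no shared visual word (here `img_c`) are never scored.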
Synthesis of 3D models by Petri net
Mo-fei Song, Zheng-xing Sun, Yan Zhang, Fei-qian Zhang
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 521-529.   https://doi.org/10.1631/jzus.CIDE1305
This paper presents a synthesis method for 3D models using Petri net. Feature structure units from the example model are extracted, along with their constraints, through structure analysis, to create a new model using an inference method based on Petri net. Our method has two main advantages: first, 3D model pieces are delineated as the feature structure units and Petri net is used to record their shape features and their constraints in order to outline the model, including extending and deforming operations; second, a construction space generating algorithm is presented to convert the curve drawn by the user into local shape controlling parameters, and the free form deformation (FFD) algorithm is used in the inference process to deform the feature structure units. Experimental results showed that the proposed method can create large-scale complex scenes or models and allow users to effectively control the model result.
Portrait drawing from corresponding range and intensity images
Lu Wang, Li-ming Lou, Cheng-lei Yang, Yue-zhu Huang, Xiang-xu Meng
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 530-541.   https://doi.org/10.1631/jzus.CIDE1306
We propose a real-time rendering system for automatically creating a caricature drawing, i.e., an exaggerated portrait, of a human face, based on simultaneous use of a range image (or 3D mesh) and a registered photograph of the same face. Combining these information sources provides complementary information. Significant geometric lines such as occluding contours and suggestive contours are extracted from the range data, while textured areas corresponding to shading features are extracted from the photograph. These are combined, and then distorted to produce the final caricature. The final output may be produced using a choice of non-photorealistic rendering styles. Our method works well for low resolution range images; for these it is fast enough to allow the viewpoint to be chosen in real time. The final output combines significant lines, textured areas, and optional shading, giving a pleasing result which preserves not only the shape cues of the geometric description, but also other essential visual characteristics of the facial image that cannot be deduced from geometry alone.
Statistical learning based facial animation
Shibiao Xu, Guanghui Ma, Weiliang Meng, Xiaopeng Zhang
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 542-550.   https://doi.org/10.1631/jzus.CIDE1307
To synthesize real-time and realistic facial animation, we present an effective algorithm which combines image- and geometry-based methods for facial animation simulation. Considering the numerous motion units in the expression coding system, we present a novel simplified motion unit based on the basic facial expression, and construct the corresponding basic action for a head model. As image features are difficult to obtain using the performance driven method, we develop an automatic image feature recognition method based on statistical learning, and an expression image semi-automatic labeling method with rotation invariant face detection, which can improve the accuracy and efficiency of expression feature identification and training. After facial animation redirection, each basic action weight needs to be computed and mapped automatically. We apply the blend shape method to construct and train the corresponding expression database according to each basic action, and adopt the least squares method to compute the corresponding control parameters for facial animation. Moreover, there is a pre-integration of diffuse light distribution and specular light distribution based on the physical method, to improve the plausibility and efficiency of facial rendering. Our work provides a simplification of the facial motion unit, an optimization of the statistical training process and recognition process for facial animation, solves the expression parameters, and simulates the subsurface scattering effect in real time. Experimental results indicate that our method is effective and efficient, and suitable for computer animation and interactive applications.
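The least-squares step mentioned in this abstract can be pictured with a minimal sketch. It assumes a linear blend shape model (target ≈ neutral + Σ wᵢ · (basisᵢ − neutral)) and solves the normal equations directly; the function names, the flat coordinate lists, and the toy "smile"/"frown" shapes are hypothetical and are not the authors' implementation.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def blend_weights(neutral, basis_shapes, target):
    """Least-squares weights for a linear blend shape model.

    Assumes the basis displacements are linearly independent,
    otherwise the normal equations are singular.
    """
    deltas = [[s - n for s, n in zip(shape, neutral)] for shape in basis_shapes]
    d = [t - n for t, n in zip(target, neutral)]
    k = len(deltas)
    # Normal equations: (B^T B) w = B^T d, where B's columns are the deltas.
    gram = [[sum(a * b for a, b in zip(deltas[i], deltas[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(a * b for a, b in zip(deltas[i], d)) for i in range(k)]
    return solve_linear(gram, rhs)


# Toy data: four coordinates per shape, two hypothetical basic actions.
neutral = [0.0, 0.0, 0.0, 0.0]
smile = [1.0, 0.0, 0.5, 0.0]
frown = [0.0, 1.0, 0.0, 0.5]
target = [0.25, 0.75, 0.125, 0.375]  # built as 0.25 * smile + 0.75 * frown
weights = blend_weights(neutral, [smile, frown], target)
print(weights)
```

In practice a library solver (and often a non-negativity or range constraint on the weights) would replace the hand-written elimination; the sketch only shows how tracked feature positions could be mapped to per-action control parameters.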
Extracting 3D model feature lines based on conditional random fields
Yao-ye Zhang, Zheng-xing Sun, Kai Liu, Mo-fei Song, Fei-qian Zhang
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 551-560.   https://doi.org/10.1631/jzus.CIDE1308
We propose a 3D model feature line extraction method using templates for guidance. The 3D model is first projected into a depth map, and a set of candidate feature points are extracted. Then, a conditional random fields (CRF) model is established to match the sketch points and the candidate feature points. Using sketch strokes, the candidate feature points can then be connected to obtain the feature lines, and using a CRF-matching model, the 2D image shape similarity features and 3D model geometric features can be effectively integrated. Finally, a relational metric based on shape and topological similarity is proposed to evaluate the matching results, and an iterative matching process is applied to obtain the globally optimized model feature lines. Experimental results showed that the proposed method can extract sound 3D model feature lines which correspond to the initial sketch template.
A fast classification scheme and its application to face recognition
Xiao-hu Ma, Yan-qi Tan, Gang-min Zheng
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 561-572.   https://doi.org/10.1631/jzus.CIDE1309
To overcome the high computational complexity in real-time classifier design, we propose a fast classification scheme. A new measure called ‘reconstruction proportion’ is exploited to reflect the discriminant information. A novel space called the ‘reconstruction space’ is constructed according to the reconstruction proportions. A point in the reconstruction space denotes the case of a sample reconstructed using training samples. This is used to search for an optimal mapping from the conventional sample space to the reconstruction space. When the projection from the sample space to the reconstruction space is obtained, a new sample after mapping to the new discriminant space would be classified quickly according to the reconstruction proportions in the reconstruction space. This projection technique results in a diversion of time-consuming calculations from the classification stage to the training stage. Though training time is prolonged, it is advantageous in that classification problems such as identification can be solved in real time. Experimental results on the ORL, Yale, YaleB, and CMU PIE face databases showed that the proposed fast classification scheme greatly outperforms conventional classifiers in classification accuracy and efficiency.
Speaker-independent speech emotion recognition by fusion of functional and accompanying paralanguage features
Qi-rong Mao, Xiao-lei Zhao, Zheng-wei Huang, Yong-zhao Zhan
Front. Inform. Technol. Electron. Eng., 2013, 14(7): 573-582.   https://doi.org/10.1631/jzus.CIDE1310
Functional paralanguage includes considerable emotion information, and it is insensitive to speaker changes. To improve the emotion recognition accuracy under the condition of speaker-independence, a fusion method combining the functional paralanguage features with the accompanying paralanguage features is proposed for speaker-independent speech emotion recognition. Using this method, functional paralanguages, such as laughter, crying, and sighing, are used to assist speech emotion recognition. The contributions of our work are threefold. First, an emotional speech database including six kinds of functional paralanguage and six typical emotions was recorded by our research group. Second, the functional paralanguage is put forward to recognize the speech emotions combined with the accompanying paralanguage features. Third, a fusion algorithm based on confidences and probabilities is proposed to combine the functional paralanguage features with the accompanying paralanguage features for speech emotion recognition. We evaluate the usefulness of the functional paralanguage features and the fusion algorithm in terms of precision, recall, and F1-measure on the emotional speech database recorded by our research group. The overall recognition accuracy achieved for six emotions is over 67% in the speaker-independent condition using the functional paralanguage features.
11 articles