Front. Inform. Technol. Electron. Eng.  2015, Vol. 16 Issue (12): 1059-1068    DOI: 10.1631/FITEE.1400398
    
An ensemble method for data stream classification in the presence of concept drift
Omid Abbaszadeh, Ali Amiri, Ali Reza Khanteymoori
Department of Computer Engineering, University of Zanjan, Zanjan 45371-38791, Iran

Abstract  One recent area of interest in computer science is data stream management and processing. By ‘data stream’, we refer to continuous and rapidly generated packages of data. Data streams are characterized by immense volume, high production rate, limited processing time, and concept drift; these features distinguish them from standard types of data. A central problem for data streams is the classification of input data. This paper proposes a novel ensemble classifier. The classifier applies two weighting functions to its base classifiers under different data input conditions. In addition, a new method is used to detect drift, which improves the precision of the algorithm. The method also removes a varying number of base classifiers according to their quality, and it weights the base classifiers at the decision-making stage. This facilitates adaptation when drift occurs, leading to more efficient classifiers. The proposed method is tested on a set of standard data sets, and the results confirm higher accuracy than available ensemble classifiers and single classifiers. In some cases the proposed classifier is also faster and needs less storage space.
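The weighting and removal steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the two weighting functions or the removal rule, so everything here is an assumption made for illustration: the weights are taken as given numbers, and `weighted_vote`, `prune`, and the `min_weight` threshold are hypothetical names, not the authors' algorithm.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting base classifiers carry the most weight.

    predictions: one predicted label per base classifier.
    weights: one non-negative weight per base classifier (assumed given).
    """
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


def prune(members, weights, min_weight):
    """Drop base classifiers whose weight is below min_weight.

    This threshold rule is a hypothetical stand-in for the quality-based
    removal mentioned in the abstract.
    """
    kept = [(m, w) for m, w in zip(members, weights) if w >= min_weight]
    return [m for m, _ in kept], [w for _, w in kept]
```

For example, with predictions `["a", "b", "b"]` and weights `[0.9, 0.3, 0.4]`, label `"a"` wins even though it has fewer votes, because its single supporter carries more weight than the other two combined.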

Key words: Data stream, Classification, Ensemble classifiers, Concept drift
Received: 19 November 2014      Published: 07 December 2015
CLC:  TP391  
Cite this article:

Omid Abbaszadeh, Ali Amiri, Ali Reza Khanteymoori. An ensemble method for data stream classification in the presence of concept drift. Front. Inform. Technol. Electron. Eng., 2015, 16(12): 1059-1068.

URL:

http://www.zjujournals.com/xueshu/fitee/10.1631/FITEE.1400398     OR     http://www.zjujournals.com/xueshu/fitee/Y2015/V16/I12/1059


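An ensemble method for data stream classification in the presence of concept drift

Objective: Data stream management and processing is an active research topic in computer science. In this paper, "data stream" refers to continuously and rapidly generated packages of data. Characteristic features of data streams include immense volume, high generation rate, limited processing time, and concept drift. These features distinguish data streams from other standard forms of data. An important problem for data streams is the classification of input data. This paper proposes a novel ensemble classifier.
Innovation: Building on data stream classifiers, a method is proposed that incorporates concept drift detection, base classifier removal, and a dynamic weighting mechanism.
Methods: (1) Two weighting functions are applied to the base classifiers under different data input conditions; (2) the Kappa coefficient is used to determine concept drift, improving the precision of the algorithm; (3) different numbers of base classifiers are removed according to their quality; (4) a weighting mechanism is applied to the base classifiers at the decision-making stage, improving the algorithm's adaptability to drift and the efficiency of the classifier.
Conclusions: Tested on standard data sets, the proposed method achieves higher accuracy than existing ensemble classifiers and single classifiers; in some cases it also reduces running time and memory usage.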
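The Kappa coefficient named in the methods summary can be computed from true and predicted labels as sketched below. The formula is the standard Cohen's kappa; how the paper turns it into a drift decision is not stated on this page, so `drift_suspected` and its `drop` threshold are hypothetical and only show one plausible use.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels, corrected for chance."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    # Observed agreement: fraction of positions where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: computed from the marginal label frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Kappa is undefined when only one class occurs; returning 1.0
        # here is an assumption of this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


def drift_suspected(kappa_prev, kappa_now, drop=0.2):
    """Hypothetical rule: flag drift when kappa falls by more than `drop`."""
    return kappa_prev - kappa_now > drop
```

Kappa can be evaluated on each incoming chunk of labeled data; a sharp fall between consecutive chunks suggests that the base classifiers no longer match the current concept.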

Keywords: Data stream, Classification, Ensemble classifier, Concept drift