Front. Inform. Technol. Electron. Eng.  2011, Vol. 12 Issue (8): 647-657    DOI: 10.1631/jzus.C1000437
Improving naive Bayes classifier by dividing its decision regions
Zhi-yong Yan, Cong-fu Xu*, Yun-he Pan
Institute of Artificial Intelligence, Zhejiang University, Hangzhou 310027, China

Abstract  Classification can be regarded as dividing the data space into decision regions separated by decision boundaries. In this paper we analyze decision tree algorithms and the NBTree algorithm from this perspective. A decision tree can thus be regarded as a classifier tree, in which each classifier on a non-root node is trained in the decision regions of the classifier on its parent node. Likewise, the NBTree algorithm, which generates a classifier tree with the C4.5 algorithm as the root classifier and naive Bayes classifiers as the leaf classifiers, can be regarded as training naive Bayes classifiers in the decision regions of the C4.5 algorithm. We propose a second division (SD) algorithm and three soft second division (SD-soft) algorithms that train classifiers in the decision regions of the naive Bayes classifier. These four novel algorithms all generate two-level classifier trees with the naive Bayes classifier as the root classifier. The SD and SD-soft algorithms make good use of the information contained both in instances near decision boundaries and in instances that may be ignored by the naive Bayes classifier. Finally, we conduct experiments on 30 data sets from the UC Irvine (UCI) repository. Experimental results show that the SD algorithm obtains better generalization ability than the NBTree and averaged one-dependence estimators (AODE) algorithms when using the C4.5 algorithm and the support vector machine (SVM) as leaf classifiers. Further experiments indicate that the three SD-soft algorithms can achieve better generalization ability than the SD algorithm when argument values are selected appropriately.
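The abstract describes the hard SD scheme only at a high level: fit a naive Bayes root, partition the training instances by the root's predicted class (its decision regions), and train one leaf classifier per impure region. The following is a rough, self-contained sketch of that idea, not the paper's implementation: it assumes a Gaussian naive Bayes root and substitutes a simple 1-nearest-neighbor leaf for the C4.5/SVM leaf classifiers used in the paper.

```python
import math
from collections import defaultdict

class GaussianNB:
    """Minimal Gaussian naive Bayes, used here as the root classifier."""
    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.priors, self.stats = {}, {}
        for c in self.classes:
            Xc = [x for x, label in zip(X, y) if label == c]
            self.priors[c] = len(Xc) / len(X)
            cols = list(zip(*Xc))
            means = [sum(col) / len(col) for col in cols]
            variances = [sum((v - m) ** 2 for v in col) / len(col) + 1e-9
                         for col, m in zip(cols, means)]
            self.stats[c] = (means, variances)
        return self

    def _log_likelihood(self, x, c):
        means, variances = self.stats[c]
        return sum(-0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
                   for xi, m, v in zip(x, means, variances))

    def predict(self, x):
        return max(self.classes,
                   key=lambda c: math.log(self.priors[c]) + self._log_likelihood(x, c))

class NearestNeighbor:
    """1-NN stand-in for the C4.5/SVM leaf classifiers used in the paper."""
    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in self.X]
        return self.y[dists.index(min(dists))]

def second_division_fit(X, y):
    """Hard SD: train the root naive Bayes, then one leaf per decision region."""
    root = GaussianNB().fit(X, y)
    regions = defaultdict(list)
    for x, label in zip(X, y):                       # hard division: each training
        regions[root.predict(x)].append((x, label))  # instance joins exactly one region
    leaves = {}
    for c, pairs in regions.items():
        Xc, yc = zip(*pairs)
        # a pure region needs no leaf: the root's answer is already unanimous there
        leaves[c] = NearestNeighbor().fit(Xc, yc) if len(set(yc)) > 1 else None
    return root, leaves

def second_division_predict(root, leaves, x):
    c = root.predict(x)                              # route to the root's region ...
    leaf = leaves.get(c)
    return c if leaf is None else leaf.predict(x)    # ... then let the leaf decide
```

On a small synthetic set with a few mislabeled-looking points near the class boundary, the leaf classifier can overturn the root's label inside an impure region, which is the intuition the abstract gives for exploiting instances near decision boundaries. The soft SD-soft variants would instead let an instance contribute to several regions; that is not sketched here.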

Key words: Naive Bayes classifier; Decision region; NBTree; C4.5 algorithm; Support vector machine (SVM)
Received: 20 December 2010      Published: 03 August 2011
CLC:  TP181  
Cite this article:

Zhi-yong Yan, Cong-fu Xu, Yun-he Pan. Improving naive Bayes classifier by dividing its decision regions. Front. Inform. Technol. Electron. Eng., 2011, 12(8): 647-657.

URL:

http://www.zjujournals.com/xueshu/fitee/10.1631/jzus.C1000437     OR     http://www.zjujournals.com/xueshu/fitee/Y2011/V12/I8/647

