Small sample learning algorithm based on novel hybrid class labeling technique

doi:10.3785/j.issn.1008-973X.2016.01.020

JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE)

Automatic Technology, Telecommunication Technology

LI Min-dan, SHEN Ye, ZHANG Dong-ping, YIN Hai-bing

Department of Signal and Information Processing, China Jiliang University, Hangzhou 310018, China

Abstract

A small sample learning algorithm based on a novel hybrid class labeling technique (HCLT) was proposed to address the learning problem caused by the underrepresented labeled training set in computer-aided diagnosis (CAD). HCLT labels the abundant unlabeled samples with three diverse class labeling schemes, based respectively on geometric similarity, probabilistic distribution, and semantic concept. Only those unlabeled samples that receive the same label from all three schemes are added to the training set, which enlarges the labeled training set. To reduce the adverse effect of the remaining labeling mistakes on learning performance, the memberships of the pseudo-labeled samples are introduced into a fuzzy support vector machine (FSVM), so that the contribution of each pseudo-labeled sample to the learning task is determined by its membership. Classification experiments on UCI datasets show that the proposed algorithm can deal with the small sample learning problem. Compared with algorithms that adopt a single labeling scheme, the proposed algorithm makes fewer labeling mistakes and achieves better classification performance.
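The unanimous-vote step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the three labeling schemes or the membership formula, so the nearest-neighbor rule (geometric view), the Gaussian naive Bayes rule (probabilistic view), the nearest-centroid rule (a simple stand-in for the semantic view), and the distance-ratio `membership` function are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def _dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def _group_by_class(X, y):
    groups = defaultdict(list)
    for xi, yi in zip(X, y):
        groups[yi].append(xi)
    return groups


def _centroids(X, y):
    """Per-class mean vector of the labeled samples."""
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in _group_by_class(X, y).items()}


def nearest_neighbor_label(x, X, y):
    """Geometric view (assumed): label of the closest labeled sample."""
    return y[min(range(len(X)), key=lambda i: _dist(x, X[i]))]


def gaussian_nb_label(x, X, y):
    """Probabilistic view (assumed): Gaussian naive Bayes on the labeled set."""
    best_label, best_score = None, -math.inf
    for label, rows in _group_by_class(X, y).items():
        score = math.log(len(rows) / len(X))  # log prior
        for j, value in enumerate(x):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def centroid_label(x, X, y):
    """Stand-in for the third view (assumed): nearest class centroid."""
    cents = _centroids(X, y)
    return min(cents, key=lambda c: _dist(x, cents[c]))


def membership(x, label, X, y):
    """Hypothetical membership in [0, 1]: larger when x is relatively
    closer to the centroid of its assigned class than to the others."""
    d = {c: _dist(x, m) for c, m in _centroids(X, y).items()}
    total = sum(d.values())
    return 1.0 if total == 0 else 1.0 - d[label] / total


def hybrid_pseudo_label(X_lab, y_lab, X_unl):
    """Keep only unlabeled samples on which all three schemes agree.
    Returns a list of (index, pseudo_label, membership) tuples."""
    schemes = (nearest_neighbor_label, gaussian_nb_label, centroid_label)
    accepted = []
    for i, x in enumerate(X_unl):
        votes = {scheme(x, X_lab, y_lab) for scheme in schemes}
        if len(votes) == 1:  # unanimous labeling result
            label = votes.pop()
            accepted.append((i, label, membership(x, label, X_lab, y_lab)))
    return accepted
```

One way to approximate the FSVM stage, again as an assumption rather than the paper's method, is to pass the memberships as per-sample weights to a standard SVM, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`, giving the original labeled samples a weight of 1.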

Published: 31 March 2016

CLC:

TP 391

Cite this article:

LI Min-dan, SHEN Ye, ZHANG Dong-ping, YIN Hai-bing. Small sample learning algorithm based on novel hybrid class labeling technique. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2016, 50(1): 137-143.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2016.01.020 OR http://www.zjujournals.com/eng/Y2016/V50/I1/137

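Chinese-language title: Small sample learning algorithm based on a novel hybrid class labeling technique

Chinese-language abstract: To address the small sample learning problem caused by the difficulty of collecting labeled case samples in computer-aided diagnosis (CAD), a small sample learning algorithm based on a novel hybrid class labeling technique (HCLT) is proposed. The algorithm labels the abundant unlabeled samples in three different ways, based on geometric distance, probabilistic distribution, and semantic concept, and adds the samples with consistent labeling results to the sample set, thereby enlarging the training set. To reduce the adverse effect of mislabeled samples on the learning process, a pseudo-label membership is proposed and introduced into fuzzy support vector machine (FSVM) learning, where the membership controls how much each sample contributes to the learning process. Experimental results on UCI datasets demonstrate that the algorithm is effective for the small sample learning problem. Compared with single class labeling techniques, the algorithm produces significantly fewer mislabeled samples and significantly better learning performance.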
