Journal of Zhejiang University-SCIENCE A (Applied Physics & Engineering), 2005, Vol. 6, Issue 1: 8-    DOI: 10.1631/jzus.2005.A0049
    
An improved TF-IDF approach for text classification
ZHANG Yun-tao, GONG Ling, WANG Yong-cheng
Network & Information Center, School of Electronic & Information Technology, Shanghai Jiaotong University, Shanghai 200030, China

Abstract  This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support, and characteristic words to enhance the recall and precision of text classification. The improved approach also processes synonyms defined by a lexicon. We discuss and analyze in detail the relationship among confidence, recall, and precision. Experiments on science and technology texts gave promising results: the new TF-IDF approach improves the precision and recall of text classification compared with the conventional TF-IDF approach.
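
To make the terms in the abstract concrete, the following is a minimal Python sketch of the conventional TF-IDF weighting, together with support and confidence for a (term, category) pair. The abstract does not give the paper's formulas, so the definitions here are assumptions: TF-IDF uses the common tf × log(N/df) form, and support and confidence follow the usual association-rule definitions. The function names `tf_idf` and `support_confidence` and the toy corpus are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Conventional TF-IDF (assumed form): tf = count / doc length, idf = log(N / df)."""
    n = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in counts.items()}
        )
    return weights


def support_confidence(docs, labels, term, category):
    """Association-rule style measures for the rule term -> category (an assumption).

    support    = share of all documents that contain the term and carry the category
    confidence = share of documents containing the term that carry the category
    """
    with_term = sum(1 for d in docs if term in d)
    with_both = sum(
        1 for d, lab in zip(docs, labels) if term in d and lab == category
    )
    support = with_both / len(docs) if docs else 0.0
    confidence = with_both / with_term if with_term else 0.0
    return support, confidence


# Toy corpus of tokenized documents (hypothetical data, for illustration only).
docs = [
    ["neural", "network", "training"],
    ["network", "router", "protocol"],
    ["neural", "model", "training"],
    ["router", "cable", "protocol"],
]
labels = ["ai", "net", "ai", "net"]

weights = tf_idf(docs)
neural_ai = support_confidence(docs, labels, "neural", "ai")
network_ai = support_confidence(docs, labels, "network", "ai")
```

In this toy corpus, "neural" appears only in documents labeled "ai", so its confidence for that category is high, whereas "network" is split evenly between the two categories. A term with high confidence and sufficient support for one category is the kind of term one might treat as a characteristic word, though how the paper combines these measures with TF-IDF is not stated in the abstract.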

Key words: Term frequency/inverse document frequency (TF-IDF), Text classification, Confidence, Support, Characteristic words
Received: 05 December 2003     
CLC:  TP31  
Cite this article:

ZHANG Yun-tao, GONG Ling, WANG Yong-cheng. An improved TF-IDF approach for text classification. Journal of Zhejiang University-SCIENCE A (Applied Physics & Engineering), 2005, 6( 1): 8-.

URL:

http://www.zjujournals.com/xueshu/zjus-a/10.1631/jzus.2005.A0049     OR     http://www.zjujournals.com/xueshu/zjus-a/Y2005/V6/I 1/8
