An improved TF-IDF approach for text classification
ZHANG Yun-tao, GONG Ling, WANG Yong-cheng
Network & Information Center, School of Electronic & Information Technology, Shanghai Jiaotong University, Shanghai 200030, China
Abstract This paper presents an improved term frequency/inverse document frequency (TF-IDF) approach that uses confidence, support and characteristic words to enhance the recall and precision of text classification. The approach also processes synonyms defined by a lexicon. We discuss and analyze in detail the relationship among confidence, recall and precision. Experiments on science and technology texts gave promising results: the improved TF-IDF approach achieves higher precision and recall in text classification than the conventional TF-IDF approach.
Received: 05 December 2003