J4  2009, Vol. 43 Issue (10): 1848-1852    DOI: 10.3785/j.issn.1008-973X.2009.10.018
    
Method of discovering similar patents based on vector space model and characteristics of patent documents
CHEN Ji-xi¹, GU Xin-jian¹, CHEN Guo-hai¹, WEI Jiang²
(1. Institute of Manufacturing Engineering, Zhejiang Province Key Laboratory of Advanced Manufacturing Technology, Zhejiang University, Hangzhou 310027, China; 2. School of Management, Zhejiang University, Hangzhou 310058, China)

Abstract  

A method for discovering similar patent documents was proposed to help enterprises with patent application, protection, and utilization. A patent model tree was built from the characteristics of patent documents, and the tree and its nodes were defined. By analyzing the attribute values of the nodes, patent documents were categorized using vector space model (VSM)-based text categorization, with the weighted similarities of patent name and patent abstract as the criterion. Within each category, similar patents were then discovered from the weighted similarities of patent characteristics. Several ways to determine the weights of patent characteristics were discussed according to the actual needs of enterprise applications. A case study showed that the method can be used for patent categorization and similar-patent search.
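The weighted-similarity idea in the abstract can be illustrated with a minimal sketch. It assumes raw term-frequency vectors, cosine similarity, and a weighted sum over two fields; the abstract does not state the paper's term weighting, tokenization, or weight values, so the field names (`name`, `abstract`), the function names, and the weights below are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (assumed tokenization: lowercase alphanumerics)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity of two sparse term vectors; 0.0 if either vector is empty."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def weighted_similarity(a, b, weights):
    """Weighted sum of per-field cosine similarities (weights assumed to sum to 1)."""
    return sum(w * cosine(term_vector(a[f]), term_vector(b[f])) for f, w in weights.items())


def most_similar(query, candidates, weights):
    """Candidates sorted from most to least similar to the query patent."""
    return sorted(candidates, key=lambda p: weighted_similarity(query, p, weights), reverse=True)


# Hypothetical weights: the abstract only says that several ways of choosing them were discussed.
WEIGHTS = {"name": 0.4, "abstract": 0.6}
```

In this sketch, categorization would compare a patent against category representatives using `weighted_similarity`, and the within-category search would call `most_similar` on the patents of the chosen category, possibly with weights over more fields than the two shown here.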



Published: 29 November 2009
CLC:  TP 391  
Cite this article:

CHEN Ji-Xi, GU Xin-Jian, CHEN Guo-Hai, et al. Method of discovering similar patents based on vector space model and characteristics of patent documents. J4, 2009, 43(10): 1848-1852.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2009.10.018     OR     http://www.zjujournals.com/eng/Y2009/V43/I10/1848


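基于向量空间模型和专利文献特征的相似专利确定方法 (Method for determining similar patents based on the vector space model and the characteristics of patent documents)

To determine the similarity of patent documents and help enterprises with patent application, protection, and utilization, a method for determining similar patents based on the vector space model (VSM) and the characteristics of patent documents is proposed. A patent model tree is constructed from the informational characteristics of patent documents, and the tree and its nodes are defined. By analyzing the attribute values of the tree's nodes and applying VSM-based text categorization, patent documents are categorized using the weighted similarity of patent name and patent abstract as the criterion. Similar patents are then determined within each category according to the similarity of patent document characteristics. Several methods for determining the weights of patent document elements are analyzed in light of enterprises' actual application needs. An application example verifies that the method can effectively perform patent categorization and similar-patent retrieval.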

