J4  2009, Vol. 43 Issue (10): 1848-1852    DOI: 10.3785/j.issn.1008-973X.2009.10.018
Industrial Engineering and Manufacturing Informatization
Method of discovering similar patents based on vector space model and characteristics of patent documents
CHEN Ji-xi (陈芨熙)1, GU Xin-jian (顾新建)1, CHEN Guo-hai (陈国海)1, WEI Jiang (魏江)2
(1. Institute of Manufacturing Engineering, Zhejiang Province Key Laboratory of Advanced Manufacturing Technology, Zhejiang University, Hangzhou 310027, China;
2. School of Management, Zhejiang University, Hangzhou 310058, China)
Abstract:

A method for identifying similar patents, based on the vector space model (VSM) and the characteristics of patent documents, was proposed to determine the similarity of patent documents and to help enterprises apply for, protect and use patents. A patent model tree was built from the information characteristics of patent documents, and the tree and its nodes were defined. After analyzing the attribute values of the tree's nodes, patent documents were categorized with a VSM-based text categorization technique, using the weighted similarity of the patent name and the patent abstract as the basis for categorization. Within each category, similar patents were then identified by the similarity of patent document characteristics. Several ways of determining the weights of patent document elements were analyzed according to the practical needs of enterprises. An application example showed that the method can effectively perform patent categorization and similar patent search.
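The weighted name-and-abstract similarity described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the term weighting, the word segmentation method, or the weight values, so the raw term-frequency vectors, the whitespace tokenization, and the weights `w_title=0.4` and `w_abstract=0.6` below are all assumptions chosen for illustration, as are the function names.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a raw term-frequency vector (assumed: whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def patent_similarity(a, b, w_title=0.4, w_abstract=0.6):
    """Weighted sum of title and abstract similarity.

    The weights are hypothetical; the paper discusses several ways
    of choosing them that are not reproduced here.
    """
    s_title = cosine(term_vector(a["title"]), term_vector(b["title"]))
    s_abstract = cosine(term_vector(a["abstract"]), term_vector(b["abstract"]))
    return w_title * s_title + w_abstract * s_abstract
```

Each patent is passed as a dictionary with `"title"` and `"abstract"` keys. For Chinese patent text, a word segmenter would have to replace the whitespace split, and a TF-IDF weighting is a common alternative to raw term counts.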

Published: 2009-11-29
CLC number: TP 391
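Funding:

Supported by the National Key Technology R&D Program of China during the 11th Five-Year Plan (2006BAF01A02) and the National High Technology Research and Development Program of China (863 Program) (2007AA04Z101).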

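About the author: CHEN Ji-xi (1966-), male, born in Tongxiang, Zhejiang; associate professor; works mainly on the research and application of manufacturing informatization and knowledge management.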
Cite this article:

CHEN Ji-xi, GU Xin-jian, CHEN Guo-hai, et al. Method of discovering similar patents based on vector space model and characteristics of patent documents. J4, 2009, 43(10): 1848-1852.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2009.10.018
http://www.zjujournals.com/eng/CN/Y2009/V43/I10/1848
