Journal of Zhejiang University (Engineering Science)
Automation Technology
Research and development of never-ending language learning
FENG Xiao-yue, LIANG Yan-chun, LIN Xi-xun, GUAN Ren-chu
1. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China;
2. Zhuhai Laboratory of Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai 519041, China

Abstract:

Tom M. Mitchell proposed never-ending language learning (NELL) in 2010 at the conference of the Association for the Advancement of Artificial Intelligence (AAAI) in order to develop an intelligent language learning model. Using semi-supervised learning and natural language processing techniques, NELL continuously acquires large amounts of text from the Internet, extracts knowledge, and enriches its knowledge base, which steadily improves its intelligence. The NELL model and its modules were introduced, and the incubation and development of NELL were depicted. Six open problems facing NELL were described: self-reflection to decide what to do next; brief daily human supervision; discovery of new predicates to learn; learning additional types of knowledge about language; entity-level (rather than string-level) modeling; and more sophisticated probabilistic modeling throughout the implementation. A new NELL model was proposed as a potential solution to these problems.
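To make the learning loop described above concrete, the following is a minimal Python sketch of one bootstrapped, semi-supervised extraction cycle of the kind NELL-style systems run: seed instances of a category yield textual contexts, sufficiently frequent contexts are trusted, and words found in trusted contexts are promoted into the knowledge base. The function name bootstrap_kb, the (previous word, next word) context representation, and the toy corpus are illustrative assumptions for this sketch only, not NELL's actual implementation.

# A minimal, illustrative sketch (not NELL's actual code) of a bootstrapped,
# semi-supervised extraction loop: seed facts yield textual contexts, trusted
# contexts yield new candidate facts, and the knowledge base grows each round.
from collections import defaultdict

def bootstrap_kb(corpus, seeds, rounds=3, min_support=2):
    """Grow category instance sets from seed examples over tokenized text.

    corpus -- list of token lists (a stand-in for crawled web pages)
    seeds  -- dict mapping a category name to a set of seed instances
    """
    kb = {cat: set(insts) for cat, insts in seeds.items()}
    for _ in range(rounds):                      # a real system never stops
        for cat, instances in kb.items():
            # 1. Count (previous word, next word) contexts of known instances.
            contexts = defaultdict(int)
            for tokens in corpus:
                for i, tok in enumerate(tokens):
                    if tok in instances and 0 < i < len(tokens) - 1:
                        contexts[(tokens[i - 1], tokens[i + 1])] += 1
            trusted = {c for c, n in contexts.items() if n >= min_support}
            # 2. Promote words occurring in trusted contexts as new instances.
            for tokens in corpus:
                for i in range(1, len(tokens) - 1):
                    if (tokens[i - 1], tokens[i + 1]) in trusted:
                        kb[cat].add(tokens[i])
    return kb

if __name__ == "__main__":
    corpus = [s.split() for s in [
        "the city of Paris is beautiful",
        "the city of London is crowded",
        "the city of Tokyo is enormous",
    ]]
    # With two seed cities, the shared "of _ is" context promotes "Tokyo".
    print(bootstrap_kb(corpus, {"city": {"Paris", "London"}}))

In the full NELL system, many such extractors and categories are coupled so that their constraints check one another, which is what limits semantic drift during a never-ending run; this sketch shows only a single uncoupled extractor.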

Publication date: 2017-01-01
CLC:  TP 181  
Funding:

Supported by the National Natural Science Foundation of China (61602207, 61572228, 61373050, 61272207, 61472158), the Science and Technology Development Program of Jilin Province (20140520070JH), and the Zhuhai Premier Discipline and Guangdong Provincial Premier Key Discipline construction programs.

Corresponding author: GUAN Ren-chu, male, associate professor. ORCID: 0000-0002-7162-7826. E-mail: guanrenchu@jlu.edu.cn
About the author: FENG Xiao-yue (1977—), female, Ph.D., whose research focuses on machine learning. ORCID: 0000-0003-3954-1333. E-mail: fengxy@jlu.edu.cn

Cite this article:

FENG Xiao-yue, LIANG Yan-chun, LIN Xi-xun, GUAN Ren-chu. Research and development of never-ending language learning [J]. Journal of Zhejiang University (Engineering Science), 2017. DOI: 10.3785/j.issn.1008-973X.2017.01.010.
