JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE)
Automation technology     
Research and development of never-ending language learning
FENG Xiao-yue, LIANG Yan-chun, LIN Xi-xun, GUAN Ren-chu
1. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China;
2. Zhuhai Laboratory of Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Zhuhai College of Jilin University, Zhuhai 519041, China

Abstract  

Tom M. Mitchell proposed never-ending language learning (NELL) at the conference of the Association for the Advancement of Artificial Intelligence (AAAI) in 2010 in order to develop an intelligent language learning model. Using semi-supervised learning and natural language processing techniques, NELL continuously acquires large volumes of text from the Internet, extracts knowledge from them, and enriches its knowledge base, thereby improving its intelligence. The NELL model and its component modules were introduced, and the incubation and development of NELL were reviewed. Six open problems of NELL were described: self-reflection to decide what to do next; the need for brief daily human interaction; discovery of new predicates to learn; learning additional types of knowledge about language; entity-level (rather than string-level) modeling; and more sophisticated probabilistic modeling throughout the implementation. A new NELL model was proposed as a potential solution to these problems.
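As a concrete illustration of the read-extract-promote cycle described above, the following Python sketch shows a minimal, self-contained bootstrapping loop in the spirit of NELL's coupled semi-supervised learning. It is a toy under stated assumptions, not NELL's actual implementation: the function name run_iteration, the regex-based patterns, and the example sentences are all illustrative.

import re
from collections import defaultdict

def run_iteration(sentences, patterns, kb, min_support=2):
    # One read-extract-promote cycle for a single category (e.g. "city").
    # 1. Read: apply each pattern (a regex with one capture group) to the
    #    text and record which patterns support each candidate instance.
    support = defaultdict(set)
    for p in patterns:
        for s in sentences:
            for m in re.finditer(p, s):
                support[m.group(1)].add(p)

    # 2. Promote: candidates backed by enough independent patterns are
    #    added to the knowledge base.
    for cand, pats in support.items():
        if len(pats) >= min_support:
            kb.add(cand)

    # 3. Learn: induce new two-word context patterns around known
    #    instances, so the next iteration can read the corpus more broadly.
    new_patterns = set()
    for s in sentences:
        for inst in kb:
            for m in re.finditer(r"(\w+ \w+) " + re.escape(inst), s):
                new_patterns.add(re.escape(m.group(1)) + r" (\w+)")
    return patterns | new_patterns, kb

# Toy usage with one seed pattern and a one-instance knowledge base.
sentences = [
    "cities such as Paris attract tourists",
    "cities such as Tokyo attract tourists",
    "he moved to Paris last year",
    "he moved to Tokyo last year",
]
patterns = {r"cities such as (\w+)"}
kb = {"Paris"}
patterns, kb = run_iteration(sentences, patterns, kb, min_support=1)
print(sorted(kb))         # ['Paris', 'Tokyo']
print(len(patterns) > 1)  # True: new context patterns such as "moved to (\w+)" were induced

In NELL itself, many extractors are coupled: a candidate is promoted only when several independent learners (pattern-based, list-based, morphology-based) agree, which the min_support threshold crudely stands in for here.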



Published: 01 January 2017
CLC:  TP 181  
Cite this article:

FENG Xiao-yue, LIANG Yan-chun, LIN Xi-xun, GUAN Ren-chu. Research and development of never-ending language learning. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2017, 51(1): 82-88.


Research and development of never-ending language learning

In order to build an intelligent language learning model, Professor Tom M. Mitchell proposed the concept of never-ending language learning (NELL) at the Association for the Advancement of Artificial Intelligence (AAAI) conference in 2010. The NELL model mainly applies semi-supervised learning and natural language processing techniques to continuously acquire large volumes of text from the Internet, extract knowledge, and enrich its knowledge base, so that the model becomes increasingly intelligent. The NELL model and its components are introduced; the incubation and development of NELL and the six main problems it faces are described, including developing the ability of self-reflection, the need for brief daily human supervision, learning new predicates, learning new types of knowledge, named-entity modeling, and building more precise statistical learning models; and a new never-ending language learning model intended to solve the existing problems is proposed.


[1] Xiao-wei LIU,Yun CHEN,Si ZHANG,Kang CHEN. Dynamic monitoring and identification of wire feeder in FDM-based additive manufacturing[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2021, 55(3): 548-554.
[2] Qiao-hong CHEN,YI CHEN,Wen-shu Li,Yu-bo JIA. Clothing image classification based on multi-scale SE-Xception[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2020, 54(9): 1727-1735.
[3] Wen-shu LI,Tao-tao ZOU,Hong-yan WANG,Hai HUANG. Traffic accident quantity prediction model based on dual-scale long short-term memory network[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2020, 54(8): 1613-1619.
[4] Tian-lei HU,Hao-bo WANG,Wen-dong YIN. Multi-label news classification algorithm based on deep bi-directional classifier chains[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2019, 53(11): 2110-2117.
[5] Bing XU,Xiao LIU,Zi-yang WANG,Fei-hu LIU,Jun LIANG. Fusion decision model for vehicle lane change with gradient boosting decision tree[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2019, 53(6): 1171-1181.
[6] Shuo-peng WANG,Peng YANG,Hao SUN,Mai LIU. Fingerprint-based sound source localization method using two-stage reference points matching[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2019, 53(6): 1198-1204.
[7] ZHU Dong-yang, SHEN Jing-yi, HUANG Wei-ping, LIANG Jun. Fault classification based on modified active learning and weighted SVM[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2017, 51(4): 697-705.
[8] QIU Ri hui, LIU Kang ling, TAN Hai long, LIANG Jun. Classification algorithm based on extreme learning machine and its application in fault identification of Tennessee Eastman process[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2016, 50(10): 1965-1972.
[9] JU Bin, QIAN Yun-tao, YE Min-chao. Collaborative filtering algorithm based on structured projective nonnegative matrix factorization[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2015, 49(7): 1319-1325.
[10] TAN Hailong, LIU Kangling, JIN Xin, SHI Xiang rong, LIANG Jun. Multivariate time series classification based on μσ-DWC feature and tree-structured M-SVM[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2015, 49(6): 1061-1069.
[11] LIN Yi-ning, WEI Wei, DAI Yuan-ming. Semi-supervised Hough Forest tracking method[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2013, 47(6): 977-983.
[12] LI Kan, HUANG Wen-xiong, HUANG Zhong-hua. Multi-sensor detected object classification method based on
support vector machine
[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2013, 47(1): 15-22.
[13] YAO Fu-tian, QIAN Yun-tao, LI Ji-ming. Semi-supervised learning based Gaussian processes for
hyperspectral image classification
[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2012, 46(7): 1295-1300.
[14] DAI Xing-hu, QIAN Yun-tao, TANG Feng-xian, JU Bin. Figure caption based MRI image detection from
online biological literature
[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2012, 46(7): 1307-1313.
[15] WANG Hong-bo, ZHAO Guang-zhou, QI Dong-lian, LU Da. Fast incremental learning method for one-class support vector machine[J]. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2012, 46(7): 1327-1332.