A study of automated English essay evaluating framework based on semantic similarity and XGBoost algorithm

doi:10.3785/j.issn.1008-9497.2020.03.010

Journal of Zhejiang University (Science Edition)

2020, Vol. 47, Issue 3: 329-336

Mathematics and Computer Science

LYU Xin¹, CHENG Yuxia²

1. School of Foreign Languages and Literatures, Hangzhou Dianzi University, Hangzhou 310018, China
2. School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China


Abstract: Automated essay scoring and automated comment generation can greatly reduce the workload of expert raters and save labor costs, but the accuracy and fairness of their results are still not high. In recent years, rapid progress in machine learning and natural language processing has improved performance on tasks such as text classification and machine translation to some extent, yet many of these new research results have not been applied to automated essay evaluation. This study combines word vector (word2vec), paragraph vector (paragraph2vec), part-of-speech vector (pos2vec), and LDA (latent Dirichlet allocation) features into a single semantic representation vector for each essay. A semantic similarity model based on the kNN (k nearest neighbors) algorithm then produces comment labels for an essay, and a regression model based on XGBoost (extreme gradient boosting) computes its score. The framework is validated on a sample of 900 English essays written by college students. The results show that the proposed framework is more accurate than traditional methods in both automated scoring and comment generation for English essays.

Key words: automated essay scoring; similarity; semantic representation; XGBoost; English essay
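The comment-labeling step described in the abstract can be illustrated with a minimal sketch. It assumes each essay already has its feature vectors (word, paragraph, part-of-speech, and topic features), concatenates them into one representation, and takes a majority vote over the k most similar labeled essays. The function names, the use of cosine similarity, the toy numbers, and the label strings are all assumptions made for illustration; the abstract does not state which similarity measure or value of k the authors used, and the XGBoost scoring step is not shown here.

```python
import math
from collections import Counter


def concat_features(*parts):
    """Concatenate several feature vectors into one representation vector."""
    vec = []
    for part in parts:
        vec.extend(part)
    return vec


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_comment_label(query, labeled_essays, k=3):
    """Return the most common label among the k essays most similar to query."""
    ranked = sorted(
        labeled_essays,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy labeled essays: (representation vector, comment label). All values are made up.
corpus = [
    (concat_features([0.9, 0.1], [0.8], [0.2, 0.1]), "well organized"),
    (concat_features([0.8, 0.2], [0.7], [0.3, 0.1]), "well organized"),
    (concat_features([0.1, 0.9], [0.2], [0.9, 0.8]), "weak vocabulary"),
    (concat_features([0.2, 0.8], [0.1], [0.8, 0.9]), "weak vocabulary"),
]

query = concat_features([0.85, 0.15], [0.75], [0.25, 0.1])
label = knn_comment_label(query, corpus, k=3)
print(label)
```

In this toy setup the query essay sits closest to the two "well organized" examples, so the vote returns that label. In practice the representation vectors would come from trained embedding and topic models rather than hand-written numbers.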

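Received: 2019-04-18 Published: 2020-06-25

CLC: TP391.6

Funding: National Social Science Fund of China (16BYY092); Zhejiang Provincial Philosophy and Social Sciences Planning Project (19NDJC043YB); Hangzhou Philosophy and Social Sciences Planning Project (M18JC040); Hangzhou Dianzi University 2017 Higher Education Research Funding Project (YB201763); Open Project of the Zhejiang Hangdian Smart City Research Center (GK150906299001/034).

About the author: LYU Xin (born 1981), ORCID: http://orcid.org/0000-0002-8567-0251, female, master's degree, lecturer; main research area: linguistics. E-mail: luxin98@163.com.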


Cite this article:

LYU Xin, CHENG Yuxia. A study of automated English essay evaluating framework based on semantic similarity and XGBoost algorithm[J]. Journal of Zhejiang University (Science Edition), 2020, 47(3): 329-336.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2020.03.010 or https://www.zjujournals.com/sci/CN/Y2020/V47/I3/329


