API recommendation method based on natural nearest neighbors and collaborative filtering

doi:10.3785/j.issn.1008-973X.2022.03.008

Journal of Zhejiang University (Engineering Science)

2022, Vol. 56

Issue (3): 494-502

Computer and Control Engineering

API recommendation method based on natural nearest neighbors and collaborative filtering

Huang-he ZHENG, Zhi-qiu HUANG*, Wei-wei LI, Yao-shen YU, Yong-chao WANG

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, Jiangsu, China


Abstract:

To address the degradation in recommendation performance caused by improper neighbor selection, an API recommendation method based on natural nearest neighbors and collaborative filtering, named N-APIRec, was proposed. The BM25 algorithm was used to transform projects into vectors. The natural nearest neighbor algorithm was then used to select similar projects from the dataset, which reduces the search scope, and similar method declarations were selected from those similar projects. Finally, APIs were recommended through collaborative filtering. N-APIRec was compared with state-of-the-art approaches on the MV and SH datasets. The results verified the effectiveness of N-APIRec: its recommendation success rates on the MV and SH datasets were 77.38% and 30.00%, respectively, higher than those of the existing methods.

Key words: code reuse; API recommendation; natural nearest neighbors; BM25; collaborative filtering
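The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each project is simply a bag of API names, uses a common positive-IDF variant of BM25 with the hypothetical defaults `k1=1.2` and `b=0.75`, compares vectors by cosine similarity, uses a simplified natural-neighbor search (grow the neighborhood radius until the number of projects with no reverse neighbor stops changing), and skips the method-declaration filtering step entirely. All function names and the toy data are invented for illustration.

```python
import math
from collections import Counter

def bm25_vectors(docs, k1=1.2, b=0.75):
    """Turn each project (a list of API tokens) into a sparse BM25-weighted vector."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    df = Counter(t for d in docs for t in set(d))  # document frequency per token
    vectors = []
    for d in docs:
        vec = {}
        for t, f in Counter(d).items():
            # Assumed IDF variant: always positive, as used by several search engines.
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            vec[t] = idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(d) / avgdl))
        vectors.append(vec)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def natural_neighbors(vectors):
    """Simplified natural-neighbor search: no neighbor count k is supplied."""
    n = len(vectors)
    # ranked[i]: all other projects, most similar to project i first
    ranked = [sorted((j for j in range(n) if j != i),
                     key=lambda j: -cosine(vectors[i], vectors[j]))
              for i in range(n)]
    knn = [set() for _ in range(n)]
    reverse_count = [0] * n
    prev_orphans = None
    for r in range(n - 1):
        for i in range(n):
            j = ranked[i][r]          # the (r+1)-th nearest neighbor of i
            knn[i].add(j)
            reverse_count[j] += 1
        orphans = sum(1 for c in reverse_count if c == 0)
        if orphans == 0 or orphans == prev_orphans:
            break                     # neighborhood has stabilized
        prev_orphans = orphans
    # i and j are natural neighbors when each lies in the other's neighborhood
    return [{j for j in knn[i] if i in knn[j]} for i in range(n)]

def recommend(active, projects, vectors, neighbors, top_n=3):
    """Score APIs used by natural neighbors but missing from the active project."""
    seen = set(projects[active])
    scores = Counter()
    for j in neighbors[active]:
        sim = cosine(vectors[active], vectors[j])
        for api in set(projects[j]) - seen:
            scores[api] += sim        # similarity-weighted vote
    return [api for api, _ in scores.most_common(top_n)]

# Toy data (hypothetical): four projects described by the APIs they call.
projects = [
    ["List.add", "List.size", "Map.put"],
    ["List.add", "List.size", "Map.get"],
    ["File.open", "File.read", "File.close"],
    ["File.open", "File.write", "File.close"],
]
vectors = bm25_vectors(projects)
neighbors = natural_neighbors(vectors)
print(neighbors)
print(recommend(0, projects, vectors, neighbors))
```

In this toy run the two collection-oriented projects become each other's natural neighbors, as do the two file-oriented ones, so the candidate APIs for project 0 come only from project 1. The point of the natural-neighbor step is that the neighborhood size adapts to the data rather than being fixed in advance.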

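Received: 2021-08-19 Published: 2022-03-29

CLC: TP 391

Funding: National Key Research and Development Program of China (2018YFB1003900)

Corresponding author: Zhi-qiu HUANG E-mail: sz1916053@nuaa.edu.cn; zqhuang@nuaa.edu.cn

About the author: Huang-he ZHENG (born 1996), male, master's student, engaged in research on intelligent software development. orcid.org/0000-0001-9934-9453. E-mail: sz1916053@nuaa.edu.cn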


Cite this article:

Huang-he ZHENG, Zhi-qiu HUANG, Wei-wei LI, Yao-shen YU, Yong-chao WANG. API recommendation method based on natural nearest neighbors and collaborative filtering[J]. Journal of Zhejiang University (Engineering Science), 2022, 56(3): 494-502.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2022.03.008 or https://www.zjujournals.com/eng/CN/Y2022/V56/I3/494

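Fig. 1 Workflow of N-APIRec

Fig. 2 API call structure diagram

Table 1 Similarity list of project g_batik-codec-1.8

Fig. 3 Line chart of project similarity

Table 2 Statistics of the experimental datasets

Table 3 Performance of N-APIRec on the MV and SH datasets

Table 4 Success rates of different methods on the SH dataset

Fig. 4 Precision-recall curves on different datasets

Table 5 Success rates of different methods on the MV dataset

Table 6 Success rates at different levels of project completion on the SH dataset

Table 7 Recall at different levels of method declaration completion on the SH dataset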


