API recommendation method based on natural nearest neighbors and collaborative filtering

Journal of ZheJiang University (Engineering Science), 2022, Vol. 56, Issue 3: 494-502

DOI: 10.3785/j.issn.1008-973X.2022.03.008

Huang-he ZHENG, Zhi-qiu HUANG*, Wei-wei LI, Yao-shen YU, Yong-chao WANG

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

Abstract

An API recommendation method based on natural nearest neighbors and collaborative filtering, named N-APIRec, was proposed to address the degradation in recommendation performance caused by improper neighbor selection. In this method, the BM25 algorithm was used to transform projects into vectors. The natural neighbor algorithm was then used to select similar projects from the dataset, narrowing the search scope, and similar method declarations were selected from those similar projects. Finally, APIs were recommended through collaborative filtering. N-APIRec was compared with state-of-the-art approaches on the MV and SH data sets. The results verified the effectiveness of N-APIRec: its recommendation success rates on the MV and SH data sets were 77.38% and 30.00%, respectively, which were better than those of the existing methods.
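The first two steps of the abstract (BM25 vectorization, then natural-neighbor selection of similar projects) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each project is represented as a non-empty list of API tokens, that similarity is cosine similarity between BM25-weighted vectors, and that natural neighbors are mutual r-nearest neighbors, with r grown until the number of projects without any reverse neighbor reaches zero or stops changing. The function names and the parameters `k1` and `b` are illustrative choices, not taken from the paper.

```python
import math
from collections import Counter

def bm25_vectors(projects, k1=1.2, b=0.75):
    """Map each project (a non-empty list of API tokens) to a sparse BM25-weighted vector."""
    n = len(projects)
    avg_len = sum(len(p) for p in projects) / n
    # Document frequency: number of projects in which each token occurs.
    df = Counter(t for p in projects for t in set(p))
    vectors = []
    for p in projects:
        norm = k1 * (1 - b + b * len(p) / avg_len)
        vectors.append({
            t: math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5)) * f * (k1 + 1) / (f + norm)
            for t, f in Counter(p).items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def natural_neighbors(vectors):
    """Return (neighbors, r): mutual r-nearest neighbors, with r chosen adaptively."""
    n = len(vectors)
    # ranked[i]: indices of all other projects, most similar first.
    ranked = [
        sorted((j for j in range(n) if j != i),
               key=lambda j, i=i: -cosine(vectors[i], vectors[j]))
        for i in range(n)
    ]
    r, prev_orphans = 0, None
    while r < n - 1:
        r += 1
        reverse_count = [0] * n
        for i in range(n):
            for j in ranked[i][:r]:
                reverse_count[j] += 1
        orphans = reverse_count.count(0)
        # Stop once every project is someone's neighbor, or the orphan count is stable.
        if orphans == 0 or orphans == prev_orphans:
            break
        prev_orphans = orphans
    knn = [set(ranked[i][:r]) for i in range(n)]
    return [{j for j in knn[i] if i in knn[j]} for i in range(n)], r

# Hypothetical projects described by the API tokens they invoke.
projects = [
    ["File.open", "File.read", "File.open"],
    ["File.open", "File.read"],
    ["Socket.connect", "Socket.send"],
    ["Socket.connect", "Socket.send", "Socket.close"],
]
neighbors, r = natural_neighbors(bm25_vectors(projects))
print(r, neighbors)
```

Because r is derived from the data, no neighborhood size has to be fixed in advance, which is the property the abstract relies on to avoid improper neighbor selection.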

Key words: code reuse; API recommendation; natural nearest neighbors; BM25; collaborative filtering

Received: 19 August 2021 Published: 29 March 2022

CLC:

TP 391

Fund: National Key Research and Development Program of China (2018YFB1003900)

Corresponding Authors: Zhi-qiu HUANG E-mail: sz1916053@nuaa.edu.cn;zqhuang@nuaa.edu.cn

Cite this article:

Huang-he ZHENG,Zhi-qiu HUANG,Wei-wei LI,Yao-shen YU,Yong-chao WANG. API recommendation method based on natural nearest neighbors and collaborative filtering. Journal of ZheJiang University (Engineering Science), 2022, 56(3): 494-502.

URL:

https://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2022.03.008 OR https://www.zjujournals.com/eng/Y2022/V56/I3/494

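API recommendation method based on natural nearest neighbors and collaborative filtering

To address the degradation in recommendation performance caused by improper neighbor selection, an API recommendation method based on natural nearest neighbors and collaborative filtering, N-APIRec, is proposed. The method uses the BM25 algorithm to convert projects into vectors, applies the natural neighbor algorithm to select similar projects from the data set and thereby narrow the search scope, selects similar method declarations from those similar projects, and recommends APIs through collaborative filtering. N-APIRec is compared experimentally with state-of-the-art methods on the MV and SH data sets. The results verify its effectiveness: the recommendation success rates on the MV and SH data sets are 77.38% and 30.00%, respectively, which are better than those of existing methods.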
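The final collaborative-filtering step can be sketched in the same spirit. The snippet below is a simplified illustration, not the N-APIRec algorithm itself. It assumes that each method declaration is represented by the set of APIs it invokes, that similarity between declarations is the Jaccard index, and that a candidate API is scored by the similarity-weighted share of similar declarations that call it. The function names and API names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_apis(active, similar_declarations, top_n=3):
    """Rank APIs that are missing from `active` using similarity-weighted votes.

    active: set of APIs already invoked in the declaration being written.
    similar_declarations: list of API sets taken from similar projects.
    """
    sims = [jaccard(active, d) for d in similar_declarations]
    total = sum(sims)
    if total == 0:
        return []
    scores = {}
    for sim, decl in zip(sims, similar_declarations):
        if sim == 0:
            continue  # declarations with no overlap carry no evidence
        for api in decl - active:
            scores[api] = scores.get(api, 0.0) + sim / total
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda api: (-scores[api], api))[:top_n]

# Hypothetical usage: the developer has opened and read a file so far.
active = {"File.open", "File.read"}
similar = [
    {"File.open", "File.read", "File.close"},
    {"File.open", "Scanner.next"},
    {"List.add"},
]
print(recommend_apis(active, similar))  # ['File.close', 'Scanner.next']
```

Weighting votes by similarity means that declarations closer to the one being written have more influence on the ranking, which is why the quality of the selected neighbors directly affects the recommendations.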

Key words: code reuse; API recommendation; natural nearest neighbors; BM25; collaborative filtering

Fig.1 Workflow of N-APIRec

Fig.2 Graph of API call structure

Tab.1 Similarity list of project g_batik-codec-1.8

Fig.3 Similarity curve of project

Tab.2 Statistics of experimental data sets

Tab.3 Performance of N-APIRec on MV and SH data sets

Tab.4 Success rates of different approaches on SH data set

Fig.4 Precision-recall curves on different data sets

Tab.5 Success rates of different approaches on MV data set

Tab.6 Success rates at different levels of project completeness on SH data set

Tab.7 Recall rates at different levels of method declaration completeness on SH data set

