Survey on program representation learning |
Jun-chi MA, Xiao-xin DI, Zong-tao DUAN, Lei TANG
College of Information Engineering, Chang’an University, Xi’an 710064, China |
Abstract To improve the efficiency of software development, there is a growing trend toward intelligent development driven by artificial intelligence, and understanding program semantics is a key problem that intelligent development must solve. A body of research on program representation learning has emerged to address this problem. Program representation learning automatically learns useful features from programs and encodes them as low-dimensional dense vectors, so that program semantics can be extracted efficiently and applied to downstream tasks. A comprehensive review was provided that categorizes and analyzes existing work on program representation learning. Mainstream models were introduced, including frameworks based on graph structures and on token sequences. Applications of program representation learning to defect detection, defect localization, code completion, and other tasks were then described. Common toolsets and benchmarks for program representation learning were summarized, and future challenges for the field were analyzed.
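As a concrete illustration of the two representation families covered by the survey, the following Python sketch builds a token-sequence view and an AST-based structural view of a tiny function, then mean-pools per-unit vectors into a single low-dimensional dense embedding. The tokenizer regex, the 8-dimensional vectors, and the CRC32 seeding are illustrative assumptions only, not part of any surveyed model; real systems such as code2vec or tree-based encoders learn these vectors end to end.

```python
import ast
import re
import zlib

def token_view(code):
    """Token-sequence view: lexical tokens in source order."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)

def ast_view(code):
    """Structural view: AST node types from a walk of the parse tree,
    a stand-in for the tree- and graph-based encoders in the survey."""
    return [type(node).__name__ for node in ast.walk(ast.parse(code))]

def embed(units, dim=8):
    """Mean-pool deterministic toy vectors into one dense program embedding.
    Real models learn these vectors during training; CRC32 just makes the
    sketch reproducible without any training step."""
    def vec(u):
        seed = zlib.crc32(u.encode())
        # Spread the 32-bit checksum over `dim` values in [-1.0, 1.0].
        return [(((seed >> (4 * i)) & 0xF) / 7.5) - 1.0 for i in range(dim)]
    columns = zip(*[vec(u) for u in units])
    return [sum(col) / len(units) for col in columns]

code = "def add(a, b):\n    return a + b"
print(embed(token_view(code)))   # one 8-dimensional dense vector
print(ast_view(code)[:3])        # ['Module', 'FunctionDef', 'arguments']
```

Downstream tasks such as defect detection or clone detection would then consume these fixed-size vectors, for example by measuring cosine similarity between two programs' embeddings.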
Received: 08 March 2022
Published: 17 January 2023
Fund: Young Scientists Fund of the National Natural Science Foundation of China (62002030); Key Research and Development Program of Shaanxi Province (2019ZDLGY17-08, 2019ZDLGY03-09-01, 2019GY-006, 2020GY-013)
Keywords: software engineering, representation learning, program semantics, neural networks, deep learning