Journal of Zhejiang University (Engineering Science)  2023, Vol. 57 Issue (8): 1505-1515    DOI: 10.3785/j.issn.1008-973X.2023.08.004
Computer Technology
GPU-based acceleration technology for signature verification of blockchain transactions
Can CUI1, Xiao-hu YANG1,*, Wei-wei QIU2, Fang-lei HUANG2
1. College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, China
2. Hangzhou Qulian Technology Co. Ltd, Hangzhou 310000, China
Abstract:

To improve the signature verification efficiency of blockchain nodes, a GPU-based acceleration technology for signature verification of blockchain transactions was proposed. The transaction signature verification process was optimized in stages according to the characteristics of the CPU-GPU heterogeneous platform architecture, which greatly improved the efficiency of the SM2 signature verification algorithm. The asynchronous nature of GPU kernel calls was fully exploited, which effectively reduced the overall IO overhead of the transaction signature verification process. Because GPUs offer strong computational power but weak branch prediction, an improved simultaneous multiple point multiplication algorithm was proposed, which both improved the efficiency of GPU signature verification and increased the scale of multi-threaded parallelism. The proposed method offloaded transaction signature verification to the GPU, which freed the CPU resources that verification had occupied on each node and improved the overall performance of the blockchain system without modifying the blockchain protocol. Experiments on the RTX3080 platform and Hyperchain, a domestic permissioned blockchain platform, showed that the peak signature verification throughput of the proposed method was $4.52 \times 10^{6}$ transactions per second, and that the transaction throughput of the Hyperchain platform integrated with the proposed method increased by 15.81% while latency decreased by 6.56%.

Key words: blockchain; transaction signature verification; GPU acceleration; throughput; latency
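The improved simultaneous multiple point multiplication algorithm is not spelled out on this page. As background only, the textbook form of the technique (often called Shamir's trick) evaluates a sum of two scalar multiples, such as the base-point and public-key terms that elliptic-curve signature verification needs, with one shared chain of doublings instead of two. The following is a minimal sketch under stated assumptions: it is written in Python rather than CUDA, it uses a toy curve with illustrative parameters rather than the SM2 curve, and the function names are hypothetical. It does not reproduce the authors' GPU-specific improvements.

```python
# Toy curve y^2 = x^3 + A*x + B over F_P (illustrative parameters, NOT the SM2 curve).
P, A, B = 97, 2, 3
G = (3, 6)  # lies on the curve: 6^2 = 36 = 3^3 + 2*3 + 3


def add(p1, p2):
    """Affine point addition; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p1 == -p2 (also covers doubling a point with y == 0)
    if p1 == p2:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P  # tangent slope
    else:
        s = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # chord slope
    x3 = (s * s - x1 - x2) % P
    return (x3, (s * (x1 - x3) - y1) % P)


def scalar_mul(k, pt):
    """Plain left-to-right double-and-add, used here as the baseline."""
    acc = None
    for bit in bin(k)[2:]:
        acc = add(acc, acc)
        if bit == "1":
            acc = add(acc, pt)
    return acc


def simultaneous_mul(u, pt1, v, pt2):
    """Shamir's trick: compute u*pt1 + v*pt2 with one shared doubling chain."""
    # Precompute the three possible non-zero addends for a pair of bits.
    table = {(1, 0): pt1, (0, 1): pt2, (1, 1): add(pt1, pt2)}
    acc = None
    for i in range(max(u.bit_length(), v.bit_length()) - 1, -1, -1):
        acc = add(acc, acc)
        bits = ((u >> i) & 1, (v >> i) & 1)
        if bits != (0, 0):
            acc = add(acc, table[bits])
    return acc


# Usage: the combined result matches two separate scalar multiplications.
Q = scalar_mul(5, G)  # stand-in for a public key point
assert simultaneous_mul(13, G, 7, Q) == add(scalar_mul(13, G), scalar_mul(7, Q))
```

Note that this sketch still takes a data-dependent branch on each pair of scalar bits, and it relies on `pow(x, -1, P)` for modular inversion, which requires Python 3.8 or later.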
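Received: 2022-10-21    Published: 2023-08-31
CLC:  TP 311
Funding: Supported by the Science and Technology Program of Zhejiang Province (2022C01126) and the National Key Research and Development Program of China (2021YFB2701100)
Corresponding author: Xiao-hu YANG     E-mail: cuican97@zju.edu.cn; yangxh@zju.edu.cn
About the author: Can CUI (1997-), male, master's student, working on blockchain performance optimization. orcid.org/0000-0002-4626-3020. E-mail: cuican97@zju.edu.cn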

Cite this article:

Can CUI, Xiao-hu YANG, Wei-wei QIU, Fang-lei HUANG. GPU-based acceleration technology for signature verification of blockchain transactions[J]. Journal of Zhejiang University (Engineering Science), 2023, 57(8): 1505-1515.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2023.08.004
https://www.zjujournals.com/eng/CN/Y2023/V57/I8/1505

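Fig. 1  Framework of the GPU-based acceleration technology for signature verification of blockchain transactions
Fig. 2  Column-major memory access pattern
Fig. 3  Throughput and latency of the signature verification acceleration technology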
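Optimization stage | Optimized | ${T_{\text{P}}}$/($10^{5}$ transactions·s$^{-1}$) | ${L_{\text{P}}}$/ms | $\lambda$/%
Data organization and transfer | Yes | 45.2 | 38.03 | 11.06
Data organization and transfer | No | 40.7 | 34.33 |
GPU multi-threaded parallel signature verification | Yes | 39.6 | 28.91 | 2490.00
GPU multi-threaded parallel signature verification | No | 1.59 | 1080 |
Result update and confirmation | Yes | 45.2 | 38.03 | 3.91
Result update and confirmation | No | 43.5 | 36.09 |
Table 1  Evaluation of the optimization effect at each stage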
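Fig. 4  Comparison of throughput between Euclid and Fermat signature verification schemes
Fig. 5  Experimental design for Euclid and Fermat computation schemes
Fig. 6  Comparison of peak throughput between Euclid and Fermat computation schemes
Fig. 7  Comparison of peak throughput between simultaneous multiple point multiplication and conventional scalar point multiplication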
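Experimental scheme | ${T_{\text{O}}}$/(transactions·s$^{-1}$) | ${L_{\text{B}}}$/ms
Hyperchain (with the proposed method) | 45410 | 58.42
Hyperchain (without the proposed method) | 39209 | 62.52
Table 2  Verification of the integrated optimization effect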
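The throughput and latency improvements quoted in the abstract can be cross-checked against the Table 2 values. The short sketch below assumes that both percentages are relative changes with respect to the Hyperchain baseline without the proposed method. That definition is an assumption, since it is not stated on this page, and `relative_change` is a hypothetical helper name.

```python
def relative_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return (new - old) / old * 100


# Values copied from Table 2 (with vs. without the proposed method).
throughput_gain = relative_change(45410, 39209)  # transactions per second
latency_change = relative_change(58.42, 62.52)   # milliseconds

print(f"throughput change: {throughput_gain:+.2f}%")
print(f"latency change:    {latency_change:+.2f}%")
```

Under this assumption the throughput gain comes out at roughly 15.8% and the latency reduction at roughly 6.6%, in line with the 15.81% and 6.56% reported in the abstract up to rounding.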