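Journal of Zhejiang University (Science Edition)  2023, Vol. 50 Issue (6): 722-735    DOI: 10.3785/j.issn.1008-9497.2023.06.007
Special Topic: The 26th National Conference on Computer-Aided Design and Computer Graphics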
Efficient GPU parallel strategy for multi-scale topology optimization via asymptotic homogenization
Zhaohui XIA1, Jianli LIU1, Baichuan GAO1, Tao NIE1, Chen YU2, Long CHEN3, Jingui YU4
1. School of Mechanical Science and Engineering/National Key Laboratory of Advanced Manufacturing Technology, Huazhong University of Science and Technology, Wuhan 430074, China
2. School of Mathematics and Computer Science, Wuhan Polytechnic University, Wuhan 430023, China
3. School of Mechanical Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
4. School of Mechanical and Electrical Engineering, Wuhan University of Technology, Wuhan 430070, China
Abstract:

To address the low computational efficiency of multi-scale structural topology design, a parallel multi-scale topology optimization algorithm based on level-set asymptotic homogenization is proposed. Running on a general-purpose graphics processing unit (GPU), the algorithm parallelizes three steps: level set initialization, the solution of large sparse stiffness matrix equations, and the computation of the constitutive matrix. Experimental results show that these parallel strategies greatly improve the efficiency of asymptotic homogenization. In particular, when the three-dimensional unit cell grid is refined to a resolution of 100 000, the GPU parallel algorithm runs tens of times faster than the CPU serial algorithm.

Key words: multi-scale topology optimization    asymptotic homogenization    compute unified device architecture (CUDA)    GPU parallel computing
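Received: 2023-07-12    Published: 2023-11-30
CLC:  TP 391
Funding: National Natural Science Foundation of China, Young Scientists Fund (52005192); National Key R&D Program of China, Young Scientists Project (2022YFB3302900)
Corresponding author: Chen YU     E-mail: mc_yuchen@whpu.edu.cn
About the author: Zhaohui XIA (born 1986), ORCID: https://orcid.org/0000-0002-0299-6871, male, Ph.D.; his research focuses on integrated CAD/CAE design, analysis and optimization methods and on industrial software technology.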

Cite this article:

Zhaohui XIA, Jianli LIU, Baichuan GAO, Tao NIE, Chen YU, Long CHEN, Jingui YU. Efficient GPU parallel strategy for multi-scale topology optimization via asymptotic homogenization[J]. Journal of Zhejiang University (Science Edition), 2023, 50(6): 722-735.

Link to this article:

https://www.zjujournals.com/sci/CN/10.3785/j.issn.1008-9497.2023.06.007
https://www.zjujournals.com/sci/CN/Y2023/V50/I6/722

     1  2  3  4  5  6  7  8
1    0  1  0  0  1  0  1  1
2    0  0  1  0  0  1  1  1
3    0  0  0  1  1  1  0  1
Table 1  Node matrix
     1  2  3  4  5  6  7  8  9  10  11  12
1    1  1  1  2  2  3  3  4  4  5   6   7
2    2  3  4  5  7  7  6  5  6  8   8   8
Table 2  Wireframe matrix
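Fig. 1  Visualization of the wireframe model
Fig. 2  GPU parallel architecture
Fig. 3  Hierarchical logical structure of CUDA threads and memory
Fig. 4  Level set model
Fig. 5  Minimum distance between a discrete node and the wireframe
Fig. 6  Mapping between global thread indices and discrete grid node numbers
Table 3  Parallel algorithm for level set initialization
Fig. 7  Example of computing initial level set values in parallel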
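Figs. 5 to 7 and Table 3 concern initializing the level set from the wireframe in parallel. The following is a minimal serial sketch in Python, not the authors' CUDA kernel: each loop iteration handles one grid node, which is the work a single GPU thread would do. It assumes that each column of Table 1 holds the (x, y, z) coordinates of a node on a unit cell, that each column of Table 2 lists the two end nodes of a strut, that linear indices run x-fastest, and that the level set value is a strut radius minus the minimum distance. The names `init_level_set`, `thread_to_node` and the `radius` parameter are hypothetical.

```python
import math

# Node coordinates read from Table 1 (assumed: one (x, y, z) per node, nodes 1..8).
NODES = [
    (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1),
    (1, 0, 1), (0, 1, 1), (1, 1, 0), (1, 1, 1),
]
# Strut end nodes read from Table 2 (1-based node numbers).
STRUTS = [
    (1, 2), (1, 3), (1, 4), (2, 5), (2, 7), (3, 7),
    (3, 6), (4, 5), (4, 6), (5, 8), (6, 8), (7, 8),
]


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else sum(ap[i] * ab[i] for i in range(3)) / denom
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def thread_to_node(tid, n):
    """Map a linear index to grid indices (i, j, k); x-fastest order is assumed."""
    return tid % n, (tid // n) % n, tid // (n * n)


def init_level_set(n, radius):
    """Level set value at every node of an n x n x n grid on the unit cell."""
    h = 1.0 / (n - 1)
    phi = [0.0] * (n ** 3)
    for tid in range(n ** 3):  # on a GPU, each iteration would be one thread
        i, j, k = thread_to_node(tid, n)
        p = (i * h, j * h, k * h)
        d_min = min(
            point_segment_distance(p, NODES[s - 1], NODES[e - 1])
            for s, e in STRUTS
        )
        phi[tid] = radius - d_min  # assumed convention: positive inside a strut
    return phi
```

Because every node's value depends only on its own coordinates and the fixed strut list, the iterations are independent, which is what makes a one-thread-per-node mapping natural for this step.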
Fig. 8  Mixed programming model of Matlab, C and CUDA
Segment 1-CG: conjugate gradient algorithm

a = 1
r0 = 0
d_Ax ← matA*d_x                        // cusparseSpMV()
d_r ← -a*d_Ax + d_r                    // cublasSaxpy()
r1 ← d_r·d_r                           // cublasSdot()
k = 1
while r1 > tol*tol && k <= max_iter    // convergence check
    if k > 1
        beta = r1/r0
        d_p ← beta*d_p                 // cublasSscal()
        d_p ← a*d_r + d_p              // cublasSaxpy()
    else
        d_p ← d_r                      // cublasScopy()
    end
    d_Ax ← matA*d_p                    // cusparseSpMV()
    dot ← d_p·d_Ax                     // cublasSdot()
    alpha = r1/dot
    d_x ← alpha*d_p + d_x              // cublasSaxpy()
    d_r ← -alpha*d_Ax + d_r            // cublasSaxpy()
    r0 = r1
    r1 ← d_r·d_r                       // cublasSdot()
    k++
end
Table 4  Parallel conjugate gradient algorithm
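The conjugate gradient segment can be checked against a small serial reference. The sketch below is a plain-Python version under stated assumptions, not the authors' GPU code: it uses dense lists instead of a sparse matrix on the device, starts from a zero initial guess, and the helper names `matvec`, `dot` and `conjugate_gradient` are hypothetical. Each vector operation corresponds to one of the cuSPARSE or cuBLAS calls named in the segment, and the residual is updated with the product A*p.

```python
def matvec(A, x):
    """Dense matrix-vector product (stands in for the sparse SpMV call)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A."""
    x = [0.0] * len(b)                     # assumed zero initial guess
    Ax = matvec(A, x)
    r = [bi - axi for bi, axi in zip(b, Ax)]   # r = b - A*x
    r1 = dot(r, r)
    r0 = 0.0
    p = []
    k = 1
    while r1 > tol * tol and k <= max_iter:
        if k > 1:
            beta = r1 / r0
            p = [ri + beta * pi for ri, pi in zip(r, p)]   # p = r + beta*p
        else:
            p = list(r)                                    # first direction
        Ap = matvec(A, p)
        alpha = r1 / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]      # x = x + alpha*p
        r = [ri - alpha * api for ri, api in zip(r, Ap)]   # r = r - alpha*A*p
        r0 = r1
        r1 = dot(r, r)
        k += 1
    return x, k - 1


# Small symmetric positive definite test system.
solution, iterations = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The loop body is dominated by one matrix-vector product and a handful of vector updates and reductions, which is why the method maps well onto library routines of this kind.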
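Table 5  Parallel algorithm for computing CH values at discrete integration points
Table 6  Parallel algorithm for computing the constitutive matrix
Fig. 9  Design domain and boundary conditions of the three-dimensional cantilever beam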
No.  Resolution  DOFs        Step  Run time/s               Speedup
                                   CPU_M       GPU_CUDA
1    4           192         S1    0.068       0.199        0.342
                             S2    0.007       1.018        0.007
                             S3    0.004       0.435        0.009
2    8           1 536       S1    0.076       0.179        0.425
                             S2    0.098       1.016        0.097
                             S3    0.045       0.418        0.108
3    16          12 288      S1    0.101       0.189        0.534
                             S2    0.733       0.966        0.759
                             S3    0.143       0.270        0.530
4    32          98 304      S1    0.347       0.190        1.826
                             S2    9.434       1.454        6.488
                             S3    1.464       0.517        2.832
5    64          786 432     S1    1.980       0.183        10.820
                             S2    112.796     3.759        30.007
                             S3    10.274      1.072        9.584
6    128         6 291 456   S1    14.421      0.192        75.109
                             S2    1 397.177   36.116       38.686
                             S3    77.820      6.897        11.283
Table 7  Run time and speedup
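The speedup column of Table 7 is the CPU run time divided by the GPU run time for the same step. The short Python check below recomputes it for the largest case (resolution 128) from the listed times. It also notes, as an observation rather than a statement from the paper, that the listed degrees of freedom equal 3·n³ for each resolution n. The names `speedup`, `case6` and `dofs` are hypothetical.

```python
def speedup(cpu_seconds, gpu_seconds):
    """Speedup of the GPU run relative to the CPU run."""
    return cpu_seconds / gpu_seconds


# (CPU time, GPU time) in seconds for case 6 of Table 7.
case6 = {
    "S1": (14.421, 0.192),
    "S2": (1397.177, 36.116),
    "S3": (77.820, 6.897),
}
ratios = {step: speedup(cpu, gpu) for step, (cpu, gpu) in case6.items()}


def dofs(resolution):
    """Degrees of freedom, assuming 3 per cell of an n x n x n grid."""
    return 3 * resolution ** 3


listed_dofs = {4: 192, 8: 1536, 16: 12288, 32: 98304, 64: 786432, 128: 6291456}
```

For the small cases the ratio is below 1, so the GPU version is slower there; the tabulated ratios first exceed 1 at resolution 32.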
DOFs         Young's modulus/GPa     Relative error
             CPU        GPU
192          349.207    348.649      1.5×10^-3
1 536        330.442    329.913      1.6×10^-3
12 288       333.689    333.154      1.6×10^-3
98 304       303.242    302.756      1.6×10^-3
786 432      299.738    299.253      1.6×10^-3
6 291 456    300.135    297.207      9.0×10^-3
Table 8  Young's modulus
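Fig. 10  Execution time of each computational step
Fig. 11  GPU/CPU speedup
Fig. 12  Multi-scale topology optimization results for the cantilever beam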