J4, 2013, Vol. 47, Issue 11: 2057-2064. DOI: 10.3785/j.issn.1008-973X.2013.11.026
    
A parallel computing method for irregular work
YANG Xin1, XU Duan-qing2, YANG Bing2
1. College of Computer Science, Dalian University of Technology, Dalian 116023, China;
2. College of Computer Science, Zhejiang University, Hangzhou 310027, China

Abstract  

To make effective use of the parallel computing power of heterogeneous multi-core architectures, data must be reorganized and task execution scheduled sensibly according to the characteristics of the hardware architecture. This paper presents a parallel computing method for irregular work that integrates data parallelism, task parallelism, and pipeline parallelism. The method is particularly suited to complex algorithms with dynamic execution behavior and irregular data structures. At run time, it applies priority-based dynamic scheduling and data management to task execution, guided by storage locality and the single instruction, multiple data (SIMD) execution mechanism, so as to maximize the effective use of the computing and storage resources of CPU and GPU hardware. Experimental results show that the method improves the performance of parallel rendering algorithms with respect to dynamic execution and the construction and maintenance of irregular data structures.
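The abstract does not give the scheduler's details, so the following is only a minimal sketch of what priority-based dynamic scheduling of irregular work might look like. Every name here (`run_irregular_work`, `split_kernel`, `leaf_kernel`, `KERNELS`, `PRIORITY`, `batch_size`) is a hypothetical illustration and not the authors' implementation. Tasks spawn new tasks at run time, a priority queue decides which pipeline stage runs next, and tasks of the same kind are gathered into uniform batches, which stands in for SIMD-friendly data-parallel execution.

```python
import heapq
from itertools import count


def run_irregular_work(initial_tasks, kernels, priority, batch_size=4):
    """Run dynamically spawned tasks in priority order (illustrative sketch).

    initial_tasks: iterable of (kind, payload) pairs.
    kernels: dict mapping kind -> function(batch) -> (results, spawned_tasks).
    priority: dict mapping kind -> int; lower values run first.
    """
    tie = count()  # unique tie-breaker, so payloads are never compared
    heap = []

    def push(kind, payload):
        heapq.heappush(heap, (priority[kind], next(tie), kind, payload))

    for kind, payload in initial_tasks:
        push(kind, payload)

    results = []
    while heap:
        _, _, kind, payload = heapq.heappop(heap)
        batch = [payload]
        # Gather further tasks of the same kind so that one kernel call
        # processes a uniform batch (a stand-in for SIMD-style execution).
        while heap and heap[0][2] == kind and len(batch) < batch_size:
            batch.append(heapq.heappop(heap)[3])
        out, spawned = kernels[kind](batch)
        results.extend(out)
        for new_kind, new_payload in spawned:  # work discovered at run time
            push(new_kind, new_payload)
    return results


def split_kernel(batch):
    """Split each integer range [lo, hi) in half until it holds at most 2 items."""
    spawned = []
    for lo, hi in batch:
        if hi - lo <= 2:
            spawned.append(("leaf", (lo, hi)))
        else:
            mid = (lo + hi) // 2
            spawned.append(("split", (lo, mid)))
            spawned.append(("split", (mid, hi)))
    return [], spawned


def leaf_kernel(batch):
    """Process each small range; here, simply sum its integers."""
    return [sum(range(lo, hi)) for lo, hi in batch], []


KERNELS = {"split": split_kernel, "leaf": leaf_kernel}
# Leaves get the higher priority so the downstream pipeline stage drains first.
PRIORITY = {"leaf": 0, "split": 1}

results = run_irregular_work([("split", (0, 10))], KERNELS, PRIORITY)
print(sum(results))  # prints 45, the sum of 0..9
```

The amount of work is not known in advance: each `split` task decides at run time whether to spawn further splits or a leaf, which is the kind of dynamic behavior the abstract refers to. A real CPU/GPU implementation would dispatch each batch to a device kernel rather than a Python function.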



Published: 01 November 2013
CLC: TP 312
Cite this article:

YANG Xin, XU Duan-qing, YANG Bing. A parallel computing method for irregular work. J4, 2013, 47(11): 2057-2064.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2013.11.026     OR     http://www.zjujournals.com/eng/Y2013/V47/I11/2057



