A parallel computing method for irregular work

doi:10.3785/j.issn.1008-973X.2013.11.026

2013, Vol. 47, Issue (11): 2057-2064

Computer Science and Technology

A parallel computing method for irregular work

YANG Xin1, XU Duan-qing2, YANG Bing2

1. College of Computer Science, Dalian University of Technology, Dalian 116023, China;
2. College of Computer Science, Zhejiang University, Hangzhou 310027, China

Abstract:

To use the parallel computing power of heterogeneous multi-core architectures effectively, data must be reorganized and task execution scheduled according to the characteristics of the hardware. This paper proposes a parallel computing method for irregular work. The method combines data parallelism, task parallelism, and pipeline parallelism, and is particularly suited to complex algorithms with dynamic execution behavior and irregular data structures. At run time it applies priority-based dynamic task scheduling and data management, guided by memory locality and the single instruction, multiple data (SIMD) execution model, so as to make the fullest effective use of the computing and storage resources of both the CPU and the GPU. Experimental results show that the method improves the performance of parallel graphics rendering algorithms in their dynamic execution and in the construction and maintenance of irregular data structures.
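As a rough illustration of the priority-based dynamic scheduling idea in the abstract, the following Python sketch pops the most urgent pending task and fills a batch with tasks of the same kind, preferring tasks that touch the same memory block. It is not the authors' implementation: the names `Task`, `BatchScheduler`, `kind`, `block`, and `simd_width` are hypothetical, and the sketch runs on a single CPU thread, whereas the paper targets CPU and GPU hardware.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Task:
    # Lower value = higher priority; seq breaks ties in submission order.
    priority: int
    seq: int
    kind: str = field(compare=False)   # hypothetical kernel name, e.g. "split"
    block: int = field(compare=False)  # hypothetical id of the data block touched


class BatchScheduler:
    """Toy priority scheduler that emits SIMD-friendly batches (assumption,
    for illustration only)."""

    def __init__(self, simd_width=4):
        self.simd_width = simd_width
        self._heap = []
        self._seq = count()

    def submit(self, priority, kind, block):
        # Tasks can be added at any time, also between calls to next_batch().
        heapq.heappush(self._heap, Task(priority, next(self._seq), kind, block))

    def next_batch(self):
        """Return the most urgent task plus up to simd_width - 1 pending tasks
        of the same kind, same-block tasks first (locality)."""
        if not self._heap:
            return []
        head = heapq.heappop(self._heap)
        same_kind = [t for t in self._heap if t.kind == head.kind]
        # False sorts before True, so tasks on the head's block come first;
        # within each group tasks are ordered by (priority, seq).
        same_kind.sort(key=lambda t: (t.block != head.block, t))
        chosen = same_kind[: self.simd_width - 1]
        chosen_ids = {t.seq for t in chosen}
        # Rebuild the heap without the tasks that were just batched.
        self._heap = [t for t in self._heap if t.seq not in chosen_ids]
        heapq.heapify(self._heap)
        return [head] + chosen
```

Each batch holds a single task kind, so every lane of a SIMD unit would run the same instructions, and grouping by block keeps memory accesses close together. Rebuilding the heap on every call is linear in the number of pending tasks, which is acceptable for a sketch but not for a production scheduler.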

Published: 2013-11-01

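Funding:

National Key Technology R&D Program of China (2012BAH03F02); National Natural Science Foundation of China (61300084); China Postdoctoral Science Foundation (2012M520625); Fundamental Research Funds of Dalian University of Technology (DUT12RC(3)63).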

About the author: YANG Xin (born 1984), male, Ph.D., works on computer graphics and parallel computing. E-mail: xinyang@zju.edu.cn

Cite this article:

YANG Xin, XU Duan-qing, YANG Bing. A parallel computing method for irregular work [J]. J4, 2013, 47(11): 2057-2064.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2013.11.026 or http://www.zjujournals.com/eng/CN/Y2013/V47/I11/2057
