[1] ZHE F, FENG Q, ARIE K, et al. GPU cluster for high performance computing [C]∥Proceedings of the ACM/IEEE Conference on Supercomputing. Pittsburgh, Pennsylvania, USA: IEEE Computer Society, 2004: 4-7.
[2] HIROYUKI T, HIROAKI K. Hierarchical parallel processing of large scale data clustering on a PC cluster with GPU co-processing [J]. The Journal of Supercomputing, 2006, 36(3): 219-234.
[3] DOMINIK G, ROBERT S, JAMALUDIN M, et al. Exploring weak scalability for FEM calculations on a GPU-enhanced cluster [J]. Parallel Computing, 2007, 33(10/11): 685-699.
[4] MICHAEL S, JEREMY E, AVNEESH P, et al. QP: A heterogeneous multi-accelerator cluster [C]∥Proceedings of the 10th LCI International Conference on High-Performance Clustered Computing. Boulder, Colorado, USA: LCI, 2009: 34-41.
[5] JAMES P, JOHN S, KLAUS S. Adapting a message-driven parallel application to GPU-accelerated clusters [C]∥Proceedings of the ACM/IEEE Conference on Supercomputing. Austin, Texas, USA: IEEE Computer Society, 2008: 19.
[6] ONEPPO M. HLSL shader model 4.0 [C]∥ACM SIGGRAPH 2007 Courses. San Diego, California, USA: ACM, 2007: 112-152.
[7] YANG Xiao-qi, ZHENG Qi-long, CHEN Guo-liang. The extension of OpenMP parallel programming model to support transactional memory execution [J]. Journal of University of Science and Technology of China, 2009, (11): 1224-1231. (in Chinese)
[8] SHAN Ying, WU Jian-ping, WANG Zheng-hua. Hierarchical parallel programming model and parallelization and optimization techniques based on SMP cluster [J]. Application Research of Computers, 2006, 23(10): 254-256. (in Chinese)
[9] QU Yang, HUANG Yong-zhong, WANG Lei. Disquisition and application of streaming curtailment technology on GPU [J]. Computer Engineering and Design, 2008, 29(5): 1268-1270. (in Chinese)
[10] ZHANG Shu, CHU Yan-li. High-performance GPU computing with CUDA [M]. China Water & Power Press, 2009. (in Chinese)
[11] ERIK L, JOHN N, STUART O, et al. NVIDIA Tesla: A unified graphics and computing architecture [J]. IEEE Micro, 2008, 28(2): 39-55.
[12] JOHN N, IAN B, MICHAEL G, et al. Scalable parallel programming with CUDA [J]. Queue, 2008, 6(2): 40-53.
[13] MICHAEL M, STEFANUS T, TIBERIU P, et al. Shader algebra [C]∥ACM SIGGRAPH. Los Angeles, California, USA: ACM, 2004: 787-795.
[14] DAVID T, SIDD P, JOSE O. Accelerator: Using data parallelism to program GPUs for general-purpose uses [J]. ACM SIGPLAN Notices, 2006, 41(11): 325-335.
[15] CHRISTIAN L, MICHAEL G, SHUBHABRATA S, et al. Fast BVH construction on GPUs [J]. Computer Graphics Forum, 2009, 28(2): 375-384.
[16] YANG Xin, XU Duan-qing, ZHAO Lei. Fast BVH construction on GPU [J]. Journal of Zhejiang University: Engineering Science, 2012, 46(1): 84-89. (in Chinese)
[17] ZHOU K, HOU M, WANG R, et al. Real-time kd-tree construction on graphics hardware [J]. ACM Transactions on Graphics, 2008, 27(5): 111.
[18] PATNEY A, OWENS J. Real-time Reyes-style adaptive surface subdivision [J]. ACM Transactions on Graphics, 2008, 27(5): 18.
[19] NADATHUR S, MARK H, MICHAEL G. Designing efficient sorting algorithms for manycore GPUs [C]∥Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing. Rome, Italy: IEEE Computer Society, 2009: 1-10.
[20] YANG Xin, XU Duan-qing, ZHAO Lei. Secondary ray tracing in parallel [J]. Journal of Zhejiang University: Engineering Science, 2012, 46(10): 1796-1802. (in Chinese)
[21] YANG X, XU Q A, ZHAO L. Efficient data management on GPU [J]. Applied Soft Computing, 2013, 13: 1-8.
[22] SOLOMON B, DAVE E, DYLAN L, et al. Packet-based Whitted and distribution ray tracing [C]∥Proceedings of Graphics Interface. Montreal, Canada: ACM, 2007: 177-184.