A new GPU-based shadow volume generation algorithm built on the CUDA architecture is proposed for fast generation and real-time rendering of the shadows of subdivision surfaces in computer games and virtual reality applications. The algorithm first performs surface subdivision in CUDA, using shared memory to speed up generation of the subdivided surface. A CUDA-based shadow volume step then extracts the shadow silhouette lines and extrudes them into the shadow volume, and a CUDA-based stream reduction step compacts the resulting shadow volume array. An optimized interoperation between CUDA and OpenGL simplifies the rendering stage of the algorithm from three steps to two. Experiments on a standard PC with CUDA-capable hardware show that the algorithm can generate shadow volumes for more complex subdivision surfaces than earlier GPU-based methods. The video memory needed for the shadow volume array is reduced to less than 2%, and rendering is accelerated by up to more than four times.
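The stream reduction step can be illustrated with a minimal host-side sketch. It assumes the common scan-then-scatter formulation: an exclusive prefix sum over per-element keep flags gives each surviving element its slot in the compacted array. The paper's actual CUDA kernel and data layout are not given here, so the names `exclusiveScan` and `compact` and the use of plain `int` flags are hypothetical, and the loops run sequentially where a GPU version would assign one thread per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exclusive prefix sum over keep flags:
// out[i] = number of nonzero flags strictly before index i.
std::vector<std::size_t> exclusiveScan(const std::vector<int>& flags) {
    std::vector<std::size_t> out(flags.size());
    std::size_t running = 0;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        out[i] = running;
        running += (flags[i] != 0) ? 1 : 0;
    }
    return out;
}

// Stream reduction (compaction): keep items[i] only where flags[i] != 0,
// preserving order. Hypothetical helper, not the paper's kernel.
template <typename T>
std::vector<T> compact(const std::vector<T>& items,
                       const std::vector<int>& flags) {
    assert(items.size() == flags.size());
    if (items.empty()) return {};

    const std::vector<std::size_t> offsets = exclusiveScan(flags);
    const std::size_t kept = offsets.back() + ((flags.back() != 0) ? 1 : 0);

    std::vector<T> out(kept);
    // Scatter: every iteration writes to a distinct slot, so the
    // iterations are independent of one another.
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (flags[i] != 0) {
            out[offsets[i]] = items[i];
        }
    }
    return out;
}
```

For example, if each candidate edge is flagged 1 when it lies on the silhouette and 0 otherwise, `compact(edges, flags)` returns only the silhouette edges, so the array handed to the renderer is as small as the number of surviving elements rather than the number of candidates.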