JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE)
    
Power-efficient image blending engine design based on self-adaptive pipeline
TAN Teng-fei¹, MA De¹, HUANG Kai², MA Qi¹
1. Microelectronic CAD Center, Hangzhou Dianzi University, Hangzhou 310018, China; 2. Institute of VLSI Design, Zhejiang University, Hangzhou 310027, China

Abstract  

A power-efficient self-adaptive pipeline and a corresponding on-chip buffer architecture are proposed, based on the characteristics of multi-image blending. The architecture supports four input images in both RGB and YCbCr color formats under the ITU-R BT.601 and ITU-R BT.709 standards. Depending on the color format of each input image, the proposed blending engine automatically adapts the pipeline architecture and the working status of each stage to improve power efficiency. A bi-controllable circular buffer structure is adopted to reduce pipeline stalls and keep the pipeline running smoothly. Selective pixel fetching and self-adaptive color space conversion (CSC) techniques are adopted to reduce power consumption. Experimental results show that the proposed design achieves a better tradeoff among power, area and performance than related works. At a hardware cost of 136000 gates in SMIC90 technology, three-channel blending of 1080p@30 fps images is realized in real time at a frequency of 150 MHz, and the lowest dynamic power consumption reaches 0.065 mW.
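The ideas of selective pixel fetching and format-dependent color space conversion can be illustrated with a small software model. The following is a minimal sketch under stated assumptions, not the hardware design of the paper: it assumes an RGB output, normalized analog-form components (Y in [0, 1], Cb and Cr in [-0.5, 0.5]), and a simple rule that layers hidden beneath a fully opaque layer need not be fetched. The names `Layer`, `blend_pixel`, `ycbcr_to_rgb` and `rgb_to_ycbcr` are hypothetical; only the luma coefficients come from ITU-R BT.601 and BT.709.

```python
from dataclasses import dataclass

# Luma coefficients (Kr, Kb) defined by ITU-R BT.601 and BT.709.
LUMA = {"601": (0.299, 0.114), "709": (0.2126, 0.0722)}


def rgb_to_ycbcr(r, g, b, std="601"):
    """RGB in [0, 1] to analog-form YCbCr (Cb, Cr in [-0.5, 0.5])."""
    kr, kb = LUMA[std]
    y = kr * r + (1.0 - kr - kb) * g + kb * b
    return (y, (b - y) / (2.0 * (1.0 - kb)), (r - y) / (2.0 * (1.0 - kr)))


def ycbcr_to_rgb(y, cb, cr, std="601"):
    """Inverse of rgb_to_ycbcr for the same standard."""
    kr, kb = LUMA[std]
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / (1.0 - kr - kb)
    return (r, g, b)


@dataclass
class Layer:
    pixel: tuple       # three components of one pixel
    fmt: str           # "RGB" or "YCbCr"
    alpha: float       # 0.0 (transparent) to 1.0 (opaque)
    std: str = "601"   # only used when fmt == "YCbCr"


def blend_pixel(layers):
    """Blend layers bottom-to-top; return (rgb, fetched, converted)."""
    # Assumed selective-fetch rule: start at the topmost opaque layer,
    # because everything beneath it cannot influence the result.
    top_opaque = 0
    for i, layer in enumerate(layers):
        if layer.alpha >= 1.0:
            top_opaque = i
    out = (0.0, 0.0, 0.0)  # black background
    fetched = converted = 0
    for layer in layers[top_opaque:]:
        if layer.alpha <= 0.0:
            continue  # a fully transparent layer contributes nothing
        fetched += 1
        px = layer.pixel
        if layer.fmt == "YCbCr":
            # Conversion runs only for layers that need it;
            # RGB layers bypass this step entirely.
            px = ycbcr_to_rgb(*px, std=layer.std)
            converted += 1
        a = layer.alpha
        out = tuple(a * s + (1.0 - a) * d for s, d in zip(px, out))
    return out, fetched, converted


layers = [
    Layer((1.0, 0.0, 0.0), "RGB", 1.0),                # opaque red
    Layer((0.5, 0.0, 0.0), "YCbCr", 0.5, std="709"),   # half-transparent gray
]
result, fetched, converted = blend_pixel(layers)
```

In this model the counters `fetched` and `converted` stand in for memory reads and active conversion stages, which is where a hardware pipeline of this kind would be expected to save power.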



Published: 06 June 2018
CLC:  TN 47  
Cite this article:

TAN Teng-fei, MA De, HUANG Kai, MA Qi. Power-efficient image blending engine design based on self-adaptive pipeline. JOURNAL OF ZHEJIANG UNIVERSITY (ENGINEERING SCIENCE), 2015, 49(1): 27-35.

URL:

http://www.zjujournals.com/eng/10.3785/j.issn.1008-973X.2015.01.005     OR     http://www.zjujournals.com/eng/Y2015/V49/I1/27


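Low-power self-adaptive pipeline design for multi-layer image blending

Based on the characteristics of multi-layer image blending, a low-power self-adaptive pipeline and on-chip buffer architecture is proposed that supports blended display of four image layers in RGB and YCbCr formats under the ITU-R BT.601 and ITU-R BT.709 standards. According to the format of each image layer, the architecture adaptively adjusts the pipeline and the working status of the logic at each stage to improve energy efficiency. A bidirectionally controllable circular buffer is adopted to reduce pipeline stalls caused by buffer status and keep the pipeline running smoothly. Selective pixel fetching and self-adaptive color space conversion (CSC) further reduce power consumption. Experimental results show that, compared with related designs, the proposed pipeline architecture achieves a better ratio of processing efficiency to resource consumption: with a hardware cost of 136000 gates in the SMIC90 process, the operating frequency reaches 150 MHz, which supports real-time blending of three 1080p@30 frames/s image streams, and the lowest dynamic power consumption reaches 0.065 mW.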
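The real-time figures quoted in the abstract can be put in perspective with a short calculation. The sketch below assumes a 1920 × 1080 active frame and ignores blanking intervals and memory latency, neither of which is specified in the abstract; `cycles_per_pixel` is a hypothetical helper name.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """Clock cycles available per output pixel (active pixels only)."""
    return clock_hz / (width * height * fps)


# Output pixel rate of one 1080p stream at 30 frames/s.
pixel_rate = 1920 * 1080 * 30

# Cycle budget per output pixel at the reported 150 MHz clock.
budget = cycles_per_pixel(150e6, 1920, 1080, 30)
print(f"{pixel_rate} pixels/s, about {budget:.2f} cycles per pixel")
```

Under these assumptions the engine has a little over two clock cycles for each output pixel, which suggests why avoiding pipeline stalls matters for sustaining real-time operation.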

