J4  2011, Vol. 45 Issue (7): 1187-1193    DOI: 10.3785/j.issn.1008-973X.2011.07.008
Electronics, Communication and Automatic Control Technology
High efficient pipeline design and implementation for sub-pixel interpolation process in H.264/AVC
LI Chun-shu1, HUANG Kai1, XIU Si-wen1, MA De1, GE Hai-tong2, YAN Xiao-lang1
1. Institute of VLSI Design, Zhejiang University, Hangzhou 310027, China; 2. Hangzhou csky Microsystem Corporation, Hangzhou 310027, China
Abstract:

A two-level pipeline architecture is proposed to reduce the high complexity of sub-pixel interpolation in H.264/AVC video decoding. The first level applies when the four 4×4 blocks inside an 8×8 partition share the same motion information: a two-stage pipeline of reference-pixel fetching and interpolation computation, operating on 4×4 blocks, lets the interpolation of different 4×4 blocks proceed in parallel. The second level accelerates the computation at each sub-pixel position by exploiting the independence of the half-pixel values from one another and the symmetry between horizontal and vertical interpolation. The kernel interpolation unit comprises 13 six-tap filters, 4 bilinear interpolation filters and 4 chroma interpolation filters. Pipelining and parallelism in the interpolation computation reduce computation time by at least 75%. Experimental results show that, compared with other designs, the proposed architecture has a lower hardware cost, reduces external memory access by 47% and improves sub-pixel interpolation performance by 30%.

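Published: 2011-07-01
CLC number: TN 919.8
Corresponding author: HUANG Kai, male, postdoctoral researcher. E-mail: huangk@vlsi.zju.edu.cn
About the author: LI Chun-shu (1985-), male, master's student, engaged in research on video coding and decoding and system-on-chip design. E-mail: lics@vlsi.zju.edu.cn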

Cite this article:

LI Chun-shu, HUANG Kai, XIU Si-wen, MA De, GE Hai-tong, YAN Xiao-lang. High efficient pipeline design and implementation for sub-pixel interpolation process in H.264/AVC [J]. J4, 2011, 45(7): 1187-1193.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2011.07.008
https://www.zjujournals.com/eng/CN/Y2011/V45/I7/1187

[1] ITUT Rec. H.264 and ISO/IEC 1448610 AVC. Draft ITUT recommendation and final draft international standard of joint video specification [S]. [S. l.]: JVT, 2003.
[2] LIN Chienchang, CHEN Jiawei, CHANG Hsiucheng, et al. A 160K gates/45 kB SRAM H.264 video decoder for HDTV applications [J]. IEEE Journal of SolidState Circuits, 2007, 42(1): 170-182.
[3] XU Ke, CHOY Chiusing. A powerefficient and selfadaptive prediction engine for H.264/AVC decoding [J]. IEEE Transactions on Very Large Scale Integration Systems, 2008, 16(3): 302-313.
[4] LEI Yu, LI Hui, HUANG Kai, et al. A H.264 video decoder with scheme of efficient bandwidth optimization for motion compensation [C]∥ International Symposium on Communications and Information Technologies. Sydney, Australia: IEEE, 2007: 531-534.
[5] YANG Kun, ZHANG Chun, DU Guoze, et al. A hardwaresoftware codesign for H.264/AVC decoder [C]∥ Asia SolidState Circuit Conference. Hangzhou, China: IEEE, 2006: 119-122.
[6] 戴郁,李冬晓,郑伟,等. H.264/AVC运动补偿的高效插值结构设计[J]. 浙江大学学报:工学版, 2009, 43(2): 255-260.
DAI Yu, LI Dongxiao, ZHENG Wei, et al. Efficient interpolation architecture design for motion compensation in H.264/AVC [J]. Journal of Zhejiang University: Engineering Science, 2009, 43(2): 255-260.
[7] WANG Ronggang, LI Mo, LI Jintao, et al. High throughput and low memory access subpixel interpolation architecture for H.264/AVC HDTV decoder [J]. IEEE Transactions on Consumer Electronics, 2005, 51(3): 1006-1013.
[8] CHUANG Tzuder, CHANG Lomei, CHIU Tsaiwei, et al. Bandwidthefficient cachebased motion compensation architecture with dramfriendly data access control [C]∥ Acoustics, Speech and Signal Processing. Taipei, Taiwan, China: IEEE, 2009: 2009-2012.
[9] 姚栋,虞露. MPEG4运动补偿的亚像素内插过程及其硬件实现[J]. 浙江大学学报:工学版, 2005, 39(11): 1703-1707.
YAO Dong, YU Lu. Subpixel interpolation of MPEG4 motion compensation and its hardware implementation [J]. Journal of Zhejiang University: Engineering Science, 2005, 39(11): 1703-1707.
[10] KIM J H, HYUN G H, LEE H J. Cache organization for H.264/AVC motion compensation [C]∥ Embedded and RealTime Computing Systems and Applications. Daegu, Korea: IEEE, 2007: 534-541.
[11] FINCHELSTEIN D F, SZE V, CHANDRAKASAN A P. Multicore processing and efficient onchip caching for H.264 and future video decoders [J]. IEEE Transactions on Circuits and Systems for Video Technology, 2009, 19(11): 1704-1722.
