Unified hardware architecture for 2D transform in H.266/VVC
Jun-yu CHEN1, Bin SUN1,*, Xiao-feng HUANG2, Qing-hua SHENG3, Chang-cai LAI3, Xin-yu JIN1
1. Polytechnic Institute, Zhejiang University, Hangzhou 310015, China
2. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China
3. School of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018, China
Abstract A unified hardware architecture was proposed to reduce the implementation area and power consumption of the 2D transform in H.266/VVC. The architecture supports the discrete cosine transforms (DCT-II, DCT-VIII) and the discrete sine transform (DST-VII) at all sizes. It consists of two parallel 1D transform modules and one transpose memory. Each 1D transform module is based on multiple constant multiplication (MCM), with a reusable MCM computing unit shared by all transform types and sizes. The transpose memory was designed to support pipelined input of mixed blocks. It is implemented in static random-access memory (SRAM), uses a diagonal storage scheme operated through read and write pointers, and uses a first-in first-out (FIFO) queue to cache block information. Experimental results showed that the unified computing unit reduced the area of the transform architecture by 1.3% and its power consumption by 49.5%, and that the transpose memory halved the required SRAM storage by exploiting the high-frequency zeroing feature of VVC.
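
The diagonal storage idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common form of the mapping, in which element (row, col) of an N×N block is placed in bank (row + col) mod N at address row; the mapping actually used in the paper, its read and write pointers, and its FIFO are not reproduced here. The names `diagonal_bank`, `DiagonalTransposeMemory`, and `transpose_via_memory` are hypothetical and chosen only for illustration.

```python
def diagonal_bank(row, col, n):
    """Bank index under the assumed diagonal mapping: (row + col) mod n."""
    return (row + col) % n


class DiagonalTransposeMemory:
    """Software model of n memory banks, each n words deep (illustrative only)."""

    def __init__(self, n):
        self.n = n
        self.banks = [[None] * n for _ in range(n)]

    def write_row(self, row, values):
        # One row touches every bank exactly once, so in hardware
        # all n words could be written in the same cycle.
        for col, value in enumerate(values):
            self.banks[diagonal_bank(row, col, self.n)][row] = value

    def read_column(self, col):
        # One column also touches every bank exactly once,
        # so a column read is free of bank conflicts as well.
        return [
            self.banks[diagonal_bank(row, col, self.n)][row]
            for row in range(self.n)
        ]


def transpose_via_memory(block):
    """Write a square block row by row, then read it back column by column."""
    n = len(block)
    memory = DiagonalTransposeMemory(n)
    for row_index, row_values in enumerate(block):
        memory.write_row(row_index, row_values)
    return [memory.read_column(col) for col in range(n)]
```

Because every row and every column maps onto N distinct banks, the row-wise output of the first 1D transform can be written, and the column-wise input of the second 1D transform read, with one access per bank.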
Received: 21 November 2022
Published: 16 October 2023
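Fund: National Natural Science Foundation of China (61901150); Zhejiang Provincial Science and Technology Program (LGG18F010004); Ministry of Science and Technology, Science and Technology Innovation 2030 Major Project (2021ZD0109802)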
Corresponding Author: Bin SUN
E-mail: chenjunyu@zju.edu.cn; shg@zju.edu.cn
Keywords: H.266/VVC, discrete cosine transform (DCT), discrete sine transform (DST), hardware architecture, application-specific integrated circuit (ASIC), pipeline