Abstract This study presents a new design method for a 4-stage pipelined, high-performance split multiply-accumulator (MAC) architecture that supports multiple precisions and is developed for media processors. To speed up the design further, a novel partial product compression circuit based on interleaved adders and a modified hybrid partial product reduction tree (PPRT) scheme are proposed. The MAC can perform 1-way 32-bit and 4-way 16-bit signed/unsigned multiply or multiply-accumulate operations, as well as 2-way parallel multiply-add (PMADD) operations, at a frequency of 1.25 GHz under worst-case conditions and 1.67 GHz under typical-case conditions. Compared with the MAC in a 32-bit microprocessor without interlocked pipeline stages (MIPS), the proposed design shows a clear advantage in speed, and an improvement of up to 32% in throughput is achieved. The MAC design has been fabricated in Taiwan Semiconductor Manufacturing Company (TSMC) 90-nm CMOS standard cell technology and has passed functional testing.
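The split operating modes named in the abstract can be illustrated with a small behavioural model. The sketch below is not the authors' circuit and says nothing about the PPRT or the pipeline. It assumes four 16-bit lanes packed into one 64-bit operand, unbounded accumulators, and a PMADD that sums adjacent lane products into two results. The lane layout, the accumulator width, and the helper names `pack`, `split_lanes`, `mac_4x16` and `pmadd_2way` are all hypothetical and chosen only for illustration.

```python
# Behavioural sketch of split (sub-word parallel) MAC modes.
# Assumptions, not taken from the paper: lane 0 is the least significant
# 16 bits, accumulators are unbounded Python ints, and PMADD adds
# adjacent lane products pairwise.

def to_signed(value, bits):
    """Interpret the low `bits` of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def pack(values, lane_bits=16):
    """Pack a list of lane values into one integer word (lane 0 lowest)."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * lane_bits)
    return word

def split_lanes(word, lanes=4, lane_bits=16, signed=True):
    """Unpack a word into per-lane integers, signed or unsigned."""
    mask = (1 << lane_bits) - 1
    raw = [(word >> (i * lane_bits)) & mask for i in range(lanes)]
    return [to_signed(r, lane_bits) if signed else r for r in raw]

def mac_4x16(a, b, acc, signed=True):
    """4-way 16-bit multiply-accumulate: acc[i] + a[i] * b[i] per lane."""
    xs = split_lanes(a, signed=signed)
    ys = split_lanes(b, signed=signed)
    return [c + x * y for c, x, y in zip(acc, xs, ys)]

def pmadd_2way(a, b, signed=True):
    """2-way parallel multiply-add: sum adjacent lane products pairwise."""
    xs = split_lanes(a, signed=signed)
    ys = split_lanes(b, signed=signed)
    p = [x * y for x, y in zip(xs, ys)]
    return [p[0] + p[1], p[2] + p[3]]

a = pack([1, -2, 3, -4])
b = pack([5, 6, -7, 8])
print(mac_4x16(a, b, [0, 0, 0, 0]))  # [5, -12, -21, -32]
print(pmadd_2way(a, b))              # [-7, -53]
```

The same packed bits give different results in unsigned mode, because each lane is then read as a value in the range 0 to 65535. This is why a multi-precision MAC needs the signed/unsigned selection mentioned in the abstract.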
Bing-jie XIA, Peng LIU, Qing-dong YAO. New method for high performance multiply-accumulator design. Journal of Zhejiang University-SCIENCE A (Applied Physics & Engineering), 2009, 10(7): 1067-1074.