Journal of Zhejiang University (Engineering Science)
Electrical Engineering
Optimizing implementation of Gaussian filter based on field programmable gate array
CHEN Chao, LUO Xiao-hua, CHEN Shu-qun, YU Guo-jun
1. College of Electrical Engineering, Zhejiang University, Hangzhou 310027, China;
2. Spreadtrum Technology (Hangzhou) Co., Ltd., Hangzhou 310052, China
Abstract:

Traditional hardware designs of the Gaussian filter suffer from a long critical path and a large logic delay. To address this problem, a hardware optimization based on addend compression was proposed. Multiplication and division were implemented with shift operations, which avoids the use of multipliers and dividers. Three circuit structures, namely the carry-save adder (CSA), a multiplexer (MUX)-based 4-2 compressor, and a tree structure for addend compression, were introduced to compress the nine addends in three levels. After optimization, only one full adder was needed to obtain the sum. The results show that addend compression shortens the critical path and reduces the logic delay: the logic delay decreases by 32.48%, and the design requires far fewer adder macros than the traditional implementation, which gives subsequent image-processing modules greater design freedom.
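
As a rough illustration of the data flow described in the abstract, the Python sketch below models shift-based weighting followed by carry-save compression of nine addends. It assumes a 3×3 kernel with weights 1-2-1 / 2-4-2 / 1-2-1 and normalization by 16, which the abstract does not state. The function names (`csa`, `compress_4_2`, `gaussian_3x3`) and the arrangement of compressors across the three levels are likewise illustrative assumptions, not the authors' circuit. The 4-2 compressor is modelled behaviourally as two chained carry-save stages; the MUX-based gate-level structure is not reproduced.

```python
# Assumed kernel: weights are powers of two, so each multiply is a left shift.
#   1 2 1
#   2 4 2   (sum = 16, so the final divide is a right shift by 4)
#   1 2 1
SHIFTS = [[0, 1, 0],
          [1, 2, 1],
          [0, 1, 0]]


def csa(a, b, c):
    """3:2 carry-save adder on non-negative integers.

    Returns (s, cy) such that s + cy == a + b + c, without propagating
    carries between bit positions.
    """
    s = a ^ b ^ c                               # per-bit sum
    cy = ((a & b) | (a & c) | (b & c)) << 1     # per-bit majority, weighted by 2
    return s, cy


def compress_4_2(a, b, c, d):
    """Behavioural 4:2 compressor: four words in, two words out, same total."""
    s1, c1 = csa(a, b, c)
    return csa(s1, c1, d)


def gaussian_3x3(window):
    """Filter one 3x3 window of non-negative pixel values (truncating result)."""
    addends = [window[r][c] << SHIFTS[r][c] for r in range(3) for c in range(3)]

    # Level 1: two 4:2 compressors, 9 addends -> 5 words.
    s0, c0 = compress_4_2(*addends[0:4])
    s1, c1 = compress_4_2(*addends[4:8])

    # Level 2: one 4:2 compressor, 5 words -> 3 words.
    s2, c2 = compress_4_2(s0, c0, s1, c1)

    # Level 3: one CSA, 3 words -> 2 words.
    s3, c3 = csa(s2, c2, addends[8])

    # The only carry-propagating addition in the whole sum.
    total = s3 + c3

    # Divide by 16 with a right shift (no rounding in this sketch).
    return total >> 4
```

For example, `gaussian_3x3([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` forms the weighted sum 80 and returns 5. Because every compression stage only uses bitwise operations, carries ripple through a single adder at the end, which is the property the abstract credits for the shorter critical path.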

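Published: 2017-05-01
CLC: TN 47
Funding:

Supported by the National High Technology Research and Development Program of China (863 Program) (2012AA041701).

Corresponding author: LUO Xiao-hua, male, associate professor. ORCID: 0000-0002-2807-2386. E-mail: luoxh@vlsi.zju.edu.cn
About the author: CHEN Chao (1992-), male, master's degree, engaged in research on very-large-scale integrated circuits. ORCID: 0000-0002-5975-3651. E-mail: chchenchao@zju.edu.cn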

Cite this article:

CHEN Chao, LUO Xiao-hua, CHEN Shu-qun, YU Guo-jun. Optimizing implementation of Gaussian filter based on field programmable gate array [J]. Journal of Zhejiang University (Engineering Science), DOI: 10.3785/j.issn.1008-973X.2017.05.017.
