Journal of Zhejiang University (Engineering Science)
Computer Technology, Telecommunication Technology
Embedded Flash data fetching acceleration techniques and implementation
WANG Yu-bo1, HUANG Kai1, CHEN Chen1, FENG Jiong2, GE Hai-tong2, YAN Xiao-lang1
1. Institute of VLSI Design, Zhejiang University, Hangzhou 310027, China; 2. Hangzhou C-SKY Micro-system Company, Hangzhou 310027, China
Abstract:

To address the read-speed limitations of embedded Flash in low-cost, low-power applications, several cache-based techniques for accelerating embedded Flash reads are proposed and implemented: low-frequency fast access, backfill hiding, a modified critical-word-first prefetch strategy, cache locking with adaptive prefetching, and pre-lookup. Combined, these techniques improve Flash read performance while keeping power consumption low. Simulations show that when on-chip resources (cache capacity) are limited and the system frequency is low (as in some low-power applications), an accelerator controller using these techniques performs 20% to 40% better than a conventional two-way set-associative cache, and the dynamic power of its read-acceleration unit is about 40% lower than that of the conventional cache.
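
The abstract names a modified critical-word-first strategy but does not describe the modification. As background only, the following is a minimal sketch of the conventional critical-word-first idea: on a cache miss, the line refill starts at the requested word and wraps around, so the processor can resume after the first transfer. The function names and the four-word line size are assumptions made for illustration; this is not the authors' modified variant.

```python
from typing import List


def critical_word_first_order(requested_word: int, words_per_line: int) -> List[int]:
    """Return the refill order for one cache line, starting at the
    requested word and wrapping around (conventional scheme, assumed here)."""
    if not 0 <= requested_word < words_per_line:
        raise ValueError("requested_word must lie inside the cache line")
    return [(requested_word + i) % words_per_line for i in range(words_per_line)]


def transfers_until_available(order: List[int], requested_word: int) -> int:
    """Count how many word transfers occur before the requested word arrives."""
    return order.index(requested_word) + 1


# Hypothetical example: a 4-word line, miss on word 2.
sequential = list(range(4))                  # plain in-order refill
wrapped = critical_word_first_order(2, 4)    # critical-word-first refill
print(transfers_until_available(sequential, 2))
print(transfers_until_available(wrapped, 2))
```

With in-order refill the requested word arrives only after the words that precede it in the line, whereas the wrapped order delivers it in the first transfer. This matters most when each Flash access is slow relative to the processor clock.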

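Published: 2014-09-01
CLC number: TN 47
Corresponding author: HUANG Kai, male, associate professor. E-mail: huangk@vlsi.zju.edu.cn
About the author: WANG Yu-bo (1987-), male, master's student; main research interest: SoC design. E-mail: wangyb@vlsi.zju.edu.cn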

Cite this article:

WANG Yu-bo, HUANG Kai, CHEN Chen, FENG Jiong, GE Hai-tong, YAN Xiao-lang. Embedded Flash data fetching acceleration techniques and implementation [J]. Journal of Zhejiang University (Engineering Science), DOI: 10.3785/j.issn.1008-973X.2014.09.005.

Link to this article:

http://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2014.09.005
http://www.zjujournals.com/eng/CN/Y2014/V48/I9/1570

