J4  2011, Vol. 45 Issue (9): 1587-1592    DOI: 10.3785/j.issn.1008-973X.2011.09.013
Computer Technology, Telecommunication Technology
High performance hardware stack for seamless context switching
CHEN Zhi-jian (陈志坚), MENG Jian-yi (孟建熠), GE Hai-tong (葛海通), YAN Xiao-lang (严晓浪)
Institute of VLSI Design, Zhejiang University, Hangzhou 310027, Zhejiang, China
Abstract:

To address the performance loss caused by context switching during function calls, a high-performance hardware stack for embedded processors is proposed that supports seamless program switching. The hardware stack consists of a data stack (DS) and a return stack (RS). Both use a dynamically reconfigurable two-level buffer scheme to eliminate the overhead of program switching. The DS uses two alternating general-purpose register (GPR) sets to construct a virtual GPR, which pushes or pops multiple data items in a single cycle and switches between the sets automatically, hiding the cost of stack operations during program switching. The RS stores the function return address and the corresponding instruction when a function is called, so that the instruction is prefetched ahead of time and pipeline bubbles at function return are eliminated. The DS and the RS reuse part of the data and instruction scratchpad memory (SPM), respectively, as second-level buffers, which makes the buffers user-reconfigurable and provides sufficient buffer space for specific embedded software. Experimental results show that the proposed hardware stack improves performance by more than 10% on average and reduces power consumption by 2%.
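
The two-level return stack described in the abstract can be illustrated with a small behavioural model. The sketch below is not the authors' hardware design: the class name, the capacities, the spill and refill policy, and the overflow behaviour are all assumptions made for illustration. It only shows the general idea of a small first-level buffer that spills its oldest entries into a region of scratchpad memory and refills from it on return, with each entry holding a return address and the instruction stored at that address.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ReturnEntry:
    return_address: int  # address at which execution resumes after the call
    instruction: int     # instruction word located at return_address


class TwoLevelReturnStack:
    """Toy software model of a return stack with an SPM-backed second level.

    Hypothetical: sizes and policies are illustrative assumptions,
    not taken from the paper.
    """

    def __init__(self, l1_entries: int, spm_entries: int):
        if l1_entries < 1 or spm_entries < 0:
            raise ValueError("l1_entries must be >= 1 and spm_entries >= 0")
        self.l1_entries = l1_entries    # first-level capacity
        self.spm_entries = spm_entries  # user-configured second-level capacity
        self.l1 = deque()               # oldest entry on the left, newest on the right
        self.spm = deque()              # holds entries older than anything in l1

    def call(self, return_address: int, instruction: int) -> None:
        """Record a function call, spilling the oldest L1 entry when L1 is full."""
        if len(self.l1) == self.l1_entries:
            oldest = self.l1.popleft()
            if self.spm_entries > 0:
                if len(self.spm) == self.spm_entries:
                    self.spm.popleft()  # assumed overflow policy: drop the oldest entry
                self.spm.append(oldest)
        self.l1.append(ReturnEntry(return_address, instruction))

    def ret(self):
        """Pop the newest entry, or return None when nothing is buffered.

        None stands for "fall back to an ordinary instruction fetch".
        """
        if not self.l1:
            return None
        entry = self.l1.pop()
        if self.spm:
            # Refill L1 with the most recently spilled entry.
            self.l1.appendleft(self.spm.pop())
        return entry


if __name__ == "__main__":
    rs = TwoLevelReturnStack(l1_entries=2, spm_entries=2)
    for depth in range(4):
        rs.call(0x100 + 4 * depth, instruction=depth)
    while (entry := rs.ret()) is not None:
        print(hex(entry.return_address), entry.instruction)
```

In this model, enlarging `spm_entries` plays the role of the user reconfiguration mentioned in the abstract: deeper call chains stay covered by the buffer at the cost of scratchpad space.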

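Published: 2011-09-01
CLC number: TN 332
Corresponding author: MENG Jian-yi, male, lecturer, Ph.D. E-mail: mengjy@vlsi.zju.edu.cn
About the author: CHEN Zhi-jian (1984-), male, Ph.D. candidate; research interests include computer architecture and high-performance embedded processor design. E-mail: chenzj@vlsi.zju.edu.cn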

Cite this article:

CHEN Zhi-jian, MENG Jian-yi, GE Hai-tong, YAN Xiao-lang. High performance hardware stack for seamless context switching [J]. J4, 2011, 45(9): 1587-1592.

Link to this article:

https://www.zjujournals.com/eng/CN/10.3785/j.issn.1008-973X.2011.09.013
https://www.zjujournals.com/eng/CN/Y2011/V45/I9/1587

