Translation lookaside buffer design based on dynamic memory page merging
CHEN Zhi-jian, MENG Jian-yi, GE Hai-tong, YAN Xiao-lang
Institute of VLSI Design, Zhejiang University, Hangzhou 310027, China
Abstract Traditional memory management algorithms often allocate virtual memory pages and physical memory pages sequentially. A translation lookaside buffer (TLB) design method is proposed that merges two sequential small memory pages into one large page while the processor is executing. Recursive page merging automatically enlarges the address range mapped by each TLB entry, which improves TLB utilization and reduces the TLB miss rate. A new uTLB architecture composed of an fuTLB and an suTLB is also proposed. The fuTLB and suTLB serve both as the first-level address translation buffer of the two-level TLB architecture and as the temporary buffer for hardware-based dynamic page merging. Page merging is performed entirely in hardware and is transparent to software. Experimental results on MiBench show that the new TLB design method reduces the TLB miss rate by 27% on average compared with the filter-TLB design method.
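The recursive merging described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design of the paper: the names `Entry`, `try_merge`, `insert`, and `translate` are hypothetical, page sizes are assumed to be powers of two, and a merged page is assumed to be naturally aligned in both the virtual and the physical address space. The fuTLB/suTLB switching and any replacement policy are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    vpn: int    # virtual page number, in units of the base page size
    ppn: int    # physical page number, in units of the base page size
    order: int  # the entry maps 2**order contiguous base pages


def try_merge(a: Entry, b: Entry) -> Optional[Entry]:
    """Return the merged entry if a and b can form one larger page, else None."""
    if a.order != b.order:
        return None
    lo, hi = (a, b) if a.vpn < b.vpn else (b, a)
    span = 1 << lo.order
    # Assumption: the merged page must be aligned to its own (doubled) size
    # in both address spaces.
    aligned = lo.vpn % (2 * span) == 0 and lo.ppn % (2 * span) == 0
    # Both the virtual and the physical pages must be sequential.
    adjacent = hi.vpn == lo.vpn + span and hi.ppn == lo.ppn + span
    if aligned and adjacent:
        return Entry(lo.vpn, lo.ppn, lo.order + 1)
    return None


def insert(entries: List[Entry], new: Entry) -> List[Entry]:
    """Insert a mapping, merging recursively while a mergeable neighbour exists."""
    entries = list(entries)
    merged = True
    while merged:
        merged = False
        for e in entries:
            m = try_merge(e, new)
            if m is not None:
                entries.remove(e)  # safe: we leave the for loop immediately
                new = m
                merged = True
                break
    entries.append(new)
    return entries


def translate(entries: List[Entry], vpn: int) -> Optional[int]:
    """Look up a virtual page number; None models a TLB miss."""
    for e in entries:
        if e.vpn <= vpn < e.vpn + (1 << e.order):
            return e.ppn + (vpn - e.vpn)
    return None


tlb: List[Entry] = []
for i in range(4):  # four sequential base pages: virtual 0..3 -> physical 8..11
    tlb = insert(tlb, Entry(vpn=i, ppn=8 + i, order=0))
```

In this model, the four sequentially allocated base pages collapse into a single entry covering four pages, so one entry serves lookups that would otherwise occupy four.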
Published: 22 February 2012
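Translation lookaside buffer design based on dynamic merging of memory pages
Exploiting the fact that memory management allocates virtual pages and physical pages contiguously, a translation lookaside buffer (TLB) design method that dynamically merges adjacent pages is proposed. The core idea is to merge adjacent pages recursively while the processor is running, dynamically extending the address mapping range of a single TLB entry, which improves TLB entry utilization and reduces the TLB miss rate. Within a two-level TLB architecture, a new uTLB structure based on dynamic switching between a fast uTLB (fuTLB) and a shadow uTLB (suTLB) is proposed. It serves as the first-level buffer of the two-level TLB architecture and provides the working context and storage for dynamic page merging. The merging process is transparent to software. Experimental results on the MiBench benchmark show that, compared with the filter-TLB architecture, the dynamic page merging method reduces the TLB miss rate by 27% on average.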