Front. Inform. Technol. Electron. Eng.  2015, Vol. 16 Issue (12): 1018-1033    DOI: 10.1631/FITEE.1500035
    
Schedule refinement for homogeneous multi-core processors in the presence of manufacturing-caused heterogeneity
Zhi-xiang Chen, Zhao-lin Li, Shan Cao, Fang Wang, Jie Zhou
Department of Automation, Tsinghua University, Beijing 100084, China; Institute of Microelectronics, Tsinghua University, Beijing 100084, China; Research Institute of Information Technology, Tsinghua University, Beijing 100084, China; Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China; The School of Information and Electronics, Beijing Institute of Technology, Beijing 100084, China
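Summary: Objective: For processor platforms with multiple homogeneous cores, achieve the best-performing schedule optimization while accounting for the manufacturing-induced variation that appears in nanometer-scale processes.
Innovation: A scheme is proposed that combines offline generation of multiple candidate schedules with online schedule binding. It fully exploits the variation in the maximum operating frequencies of cores under manufacturing variation and achieves high overall performance.
Method: First, considering the performance variation caused by manufacturing differences, a schedule optimization scheme that combines offline and online phases is proposed. In the offline phase, the distribution of manufacturing variation is taken into account and expected performance is used as the metric: representative chip operating points are selected and their corresponding optimal schedules are obtained, which are used to generate candidate schedules that are stored on the chip. Sampling of chip operating points addresses the exponential growth in the number of chip operating points, and the optimization of expected performance is converted, under certain constraints, into relations among chip operating points, which lowers the complexity of the overall scheme. In the online phase, when the chip boots, the best-performing schedule is determined from the relation between the operating point of the current chip and the chip operating points that correspond to the candidate schedules.
Conclusion: For multi-core processor platforms that exhibit manufacturing variation under nanometer-scale processes, an adaptive schedule optimization strategy is proposed, and it achieves a performance improvement.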
Abstract: Homogeneous multi-core processors have been widely used to deal with computation-intensive embedded applications. However, with the continuous downscaling of CMOS technology, within-die variations in the manufacturing process lead to a significant spread in the operating speeds of cores within homogeneous multi-core processors. Task scheduling approaches that do not consider the heterogeneity caused by within-die variations can lead to overly pessimistic results in terms of performance. To realize optimal performance according to the actual maximum clock frequencies at which cores can run, we present a heterogeneity-aware schedule refining (HASR) scheme that fully exploits the heterogeneity of homogeneous multi-core processors in embedded domains. We analyze and show how the actual maximum frequencies of cores are used to guide the scheduling. In the scheme, representative chip operating points are selected and the corresponding optimal schedules are generated as candidate schedules. During the booting of each chip, one of the candidate schedules is bound to the chip according to the actual maximum clock frequencies of its cores, so as to maximize the performance. A set of applications is designed to evaluate the proposed scheme. Experimental results show that the proposed scheme improves the performance by an average of 22.2% compared with the baseline schedule based on worst-case timing analysis. Compared with the conventional task scheduling approach based on the actual maximum clock frequencies, the proposed scheme also improves the performance by up to 12%.
Key words: Schedule refining    Multi-core processor    Heterogeneity    Representative chip operating point
Received: 2015-02-01 Published: 2015-12-07
CLC:  TP302  
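
The boot-time binding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' HASR implementation: it assumes independent tasks with known cycle counts (no precedence or communication costs), represents each candidate schedule as a plain task-to-core assignment, and uses makespan as the performance measure. The names `makespan`, `bind_schedule`, and `example_candidates` are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: a candidate schedule maps a core index to the
# cycle counts (in units of 1e9 cycles) of the tasks assigned to that core.
Schedule = Dict[int, List[float]]


def makespan(schedule: Schedule, core_freqs_ghz: Sequence[float]) -> float:
    """Finish time in seconds of the slowest core.

    Assumption: tasks are independent, so a core's finish time is simply
    its total cycle count divided by its actual maximum frequency.
    """
    return max(
        (sum(cycles) / core_freqs_ghz[core] for core, cycles in schedule.items()),
        default=0.0,
    )


def bind_schedule(candidates: Sequence[Schedule], core_freqs_ghz: Sequence[float]) -> int:
    """Return the index of the stored candidate with the smallest makespan
    for the frequencies measured on this particular chip."""
    if not candidates:
        raise ValueError("at least one candidate schedule is required")
    return min(
        range(len(candidates)),
        key=lambda i: makespan(candidates[i], core_freqs_ghz),
    )


# Two stored candidates for a two-core chip and three tasks (4, 2, 2).
example_candidates: List[Schedule] = [
    {0: [4.0], 1: [2.0, 2.0]},  # balanced for equally fast cores
    {0: [4.0, 2.0], 1: [2.0]},  # favours a faster core 0
]

if __name__ == "__main__":
    # A chip whose core 0 is twice as fast as core 1 binds candidate 1;
    # a chip with equally fast cores binds candidate 0.
    print(bind_schedule(example_candidates, [2.0, 1.0]))
    print(bind_schedule(example_candidates, [1.0, 1.0]))
```

Evaluating every stored candidate at boot is affordable only because the offline phase keeps the candidate set small. How the representative operating points are chosen is not covered by this sketch.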

Cite this article:

Zhi-xiang Chen, Zhao-lin Li, Shan Cao, Fang Wang, Jie Zhou. Schedule refinement for homogeneous multi-core processors in the presence of manufacturing-caused heterogeneity. Front. Inform. Technol. Electron. Eng., 2015, 16(12): 1018-1033.

Link to this article:

http://www.zjujournals.com/xueshu/fitee/CN/10.1631/FITEE.1500035        http://www.zjujournals.com/xueshu/fitee/CN/Y2015/V16/I12/1018
