Optimizing checkpoint for scientific simulations
Xi-sheng Xiao, Ying-ping Huang, Xi-hui Zhang
Economics & Management College, Southwest Jiaotong University, Chengdu 610031, China; Industrial and Commercial College, Guizhou University of Finance and Economics, Guiyang 550003, China; College of Business, University of North Alabama, Florence, AL 35632, USA
Abstract: Restarting a long-running simulation from the beginning after a failure is extremely time-consuming. Checkpointing is a viable solution: the simulation state is saved periodically, so a failed run can be resumed from the most recent checkpoint instead of from scratch. We study three models for determining the optimal interval between consecutive checkpoints, that is, the interval that minimizes total execution time, and we demonstrate that optimal checkpointing can facilitate self-optimization. The results advance both the understanding and the practice of optimizing long-running scientific simulations.
Received: 12 May 2012
Published: 09 December 2012
Keywords: Checkpoint, Long-running, Optimizing, Simulation
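
The trade-off described in the abstract (checkpointing too often wastes time writing state, checkpointing too rarely wastes time recomputing lost work) can be illustrated with a small sketch. The abstract does not specify the three models studied, so the code below does not reproduce them; it uses Young's classic first-order approximation instead, purely as an illustration. The function names, the parameter names, and the assumption that all times share one unit (for example, seconds) are hypothetical choices made for this example.

```python
import math


def young_interval(checkpoint_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimal compute time
    between checkpoints: sqrt(2 * checkpoint_cost * mtbf).

    checkpoint_cost: time to write one checkpoint (assumed constant).
    mtbf: mean time between failures (assumed much larger than the cost).
    """
    if checkpoint_cost <= 0 or mtbf <= 0:
        raise ValueError("checkpoint_cost and mtbf must be positive")
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def expected_total_time(work: float, interval: float,
                        checkpoint_cost: float, mtbf: float) -> float:
    """First-order estimate of total execution time for `work` units of
    failure-free computation, checkpointing every `interval` units."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    # Fraction of extra time spent writing checkpoints.
    checkpoint_overhead = checkpoint_cost / interval
    # Fraction of extra time spent redoing lost work: on average half an
    # interval is lost per failure, and failures arrive at rate 1 / mtbf.
    rework_overhead = interval / (2.0 * mtbf)
    return work * (1.0 + checkpoint_overhead + rework_overhead)


if __name__ == "__main__":
    cost, mtbf, work = 50.0, 10_000.0, 100_000.0  # illustrative values only
    best = young_interval(cost, mtbf)
    for interval in (best / 2, best, best * 2):
        total = expected_total_time(work, interval, cost, mtbf)
        print(f"interval={interval:8.1f}  estimated total={total:10.1f}")
```

In this simplified model, halving or doubling the interval around the computed value increases the estimated total time, which is the behavior an optimal checkpoint interval is meant to capture. The approximation assumes the checkpoint cost is small relative to the mean time between failures and ignores restart cost.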