Energy awareness through conversational interfaces and prediction on dynamic resid...
![]() | |
Author(s): |
Ulisses Furquim Freire da Silva
Total Authors: 1
|
Document type: | Master's Dissertation |
Press: | Campinas, SP. |
Institution: | Universidade Estadual de Campinas (UNICAMP). Instituto de Computação |
Defense date: | 2005-05-05 |
Examining board members: |
Islene Calciolari Garcia;
Edson Caceres;
Edmundo Roberto Mauro Madeira
|
Advisor: | Islene Calciolari Garcia |
Abstract | |
A fault-tolerant distributed system based on rollback-recovery has to checkpoints of its processes are stored. Besides this selection, that is controlled checkpointing protocol, the system has to do garbage collection, in order to eliminate that become obsolete while the application executes. The garbage collection because checkpoints require the use of storage resources and the storage has limited capacity. So, when some fault occurs, the whole distributed be restored to a consistent global state previously stored. This dissertation practical and theoretical aspects of a fault-tolerant distributed system quasisynchronous checkpointing protocols and also garbage collection and algorithms. There are several checkpointing protocols proposed in the literature, quasisynchronous ones were studied in this dissertation. These protocols information in the application's messages and can induce forced checkpoints, need any synchronization or exchanging of control messages among on that study, a framework for quasi-synchronous checkpointing implemented in a message passing library called LAM/MPI. Moreover, a based on rollback-recovery from faults named Curupira was also implemented in that library. Curupira is the _rst software architecture exchanging of control messages or any synchronization among the execution of the checkpointing and garbage collection protocols (AU) |