(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Improving reliability of cooperative concurrent systems with exception flow analysis

Author(s):
Castor Filho, Fernando [1] ; Romanovsky, Alexander [2] ; Rubira, Cecilia Mary F. [3]
Total Authors: 3
Affiliation:
[1] Univ Fed Pernambuco, Informat Ctr, BR-50740540 Recife, PE - Brazil
[2] Newcastle Univ, Sch Comp Sci, Newcastle Upon Tyne NE1 7RU, Tyne & Wear - England
[3] Univ Estadual Campinas, Inst Comp, BR-13084971 Campinas, SP - Brazil
Total Affiliations: 3
Document type: Journal article
Source: JOURNAL OF SYSTEMS AND SOFTWARE; v. 82, n. 5, p. 874-890, MAY 2009.
Web of Science Citations: 3
Abstract

Developers of fault-tolerant distributed systems need to guarantee that fault tolerance mechanisms they build are in themselves reliable. Otherwise, these mechanisms might in the end negatively affect overall system dependability, thus defeating the purpose of introducing fault tolerance into the system. To achieve the desired levels of reliability, mechanisms for detecting and handling errors should be developed rigorously or formally. We present an approach to modeling and verifying fault-tolerant distributed systems that use exception handling as the main fault tolerance mechanism. In the proposed approach, a formal model is employed to specify the structure of a system in terms of cooperating participants that handle exceptions in a coordinated manner, and coordinated atomic actions serve as representatives of mechanisms for exception handling in concurrent systems. We validate the approach through two case studies: (i) a system responsible for managing a production cell, and (ii) a medical control system. In both systems, the proposed approach has helped us to uncover design faults in the form of implicit assumptions and omissions in the original specifications. (C) 2008 Elsevier Inc. All rights reserved. (AU)

FAPESP's process: 06/04976-9 - Fault tolerance in large-scale computational grids
Grantee: Fernando José Castor de Lima Filho
Support Opportunities: Scholarships in Brazil - Post-Doctoral