(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

An Online Data Access Prediction and Optimization Approach for Distributed Systems

Author(s):
Ishii, Renato Porfirio [1] ; de Mello, Rodrigo Fernandes [2]
Total Authors: 2
Affiliation:
[1] Fed Univ Mato Grosso do Sul UFMS, Fac Comp Sci, BR-79070900 Campo Grande, MS - Brazil
[2] Univ Sao Paulo, Dept Comp Sci, Inst Math & Comp Sci, BR-13560970 Sao Carlos, SP - Brazil
Total Affiliations: 2
Document type: Journal article
Source: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS; v. 23, n. 6, p. 1017-1029, JUN 2012.
Web of Science Citations: 13
Abstract

Current scientific applications produce large amounts of data. Processing, handling, and analyzing such data require large-scale computing infrastructures such as clusters and grids. In this area, studies aim to improve the performance of data-intensive applications by optimizing data accesses. To achieve this goal, distributed storage systems have adopted techniques such as data replication, migration, distribution, and access parallelism. However, the main drawback of those studies is that they do not take application behavior into account when optimizing data access. This limitation motivated this paper, which applies strategies to support the online prediction of application behavior in order to optimize data access operations on distributed systems, without requiring any information on past executions. To accomplish this, the approach organizes application behaviors as time series and then analyzes and classifies those series according to their properties. Based on those properties, the approach selects modeling techniques to represent the series and perform predictions, which are later used to optimize data access operations. This new approach was implemented and evaluated using the OptorSim simulator, sponsored by the LHC-CERN project and widely employed by the scientific community. Experiments confirm that this new approach reduces application execution time by about 50 percent, especially when handling large amounts of data. (AU)
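
The pipeline the abstract describes (record behavior as a time series, classify the series by its properties, pick a model, predict) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the abstract does not say which series properties or modeling techniques the paper uses, so this sketch stands in lag-1 autocorrelation as the property, and exponential smoothing and a plain mean as the two candidate models. The names `classify`, `predict_next`, `threshold`, and `alpha` are hypothetical.

```python
from statistics import mean


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation; returns 0.0 for a constant series."""
    m = mean(series)
    denom = sum((x - m) ** 2 for x in series)
    if denom == 0:
        return 0.0
    num = sum(
        (series[i] - m) * (series[i + 1] - m) for i in range(len(series) - 1)
    )
    return num / denom


def classify(series, threshold=0.3):
    """Label a series by one property (assumed here: lag-1 autocorrelation)."""
    if lag1_autocorrelation(series) >= threshold:
        return "persistent"
    return "irregular"


def predict_next(series, alpha=0.8):
    """Select a model from the class label and predict the next observation.

    `series` is a non-empty list of observations, e.g. bytes read per interval.
    """
    kind = classify(series)
    if kind == "persistent":
        # Exponential smoothing: recent observations carry the most weight.
        level = series[0]
        for x in series[1:]:
            level = alpha * x + (1 - alpha) * level
        return kind, level
    # No usable short-term structure: fall back to the long-run mean.
    return kind, mean(series)


# Hypothetical access volumes observed online, one value per interval.
kind, predicted = predict_next([10, 20, 30, 40, 50, 60])
print(kind, predicted)
```

In a setting like the one the abstract describes, a prediction of this kind would feed a data access decision such as what to replicate or fetch ahead of time; how the paper makes that decision is not stated in this record.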

FAPESP's process: 11/02655-9 - Analysis of influences of centralized and distributed process scheduling decisions
Grantee: Rodrigo Fernandes de Mello
Support Opportunities: Regular Research Grants