
Survivability in Lambda Grids by means of Ant Colony Optimization

Author(s):
Pavani, Gustavo Sousa ; Frederic, Andre Ricardo ; Ahmed, T ; Festor, O ; Ghamri-Doudane, Y ; Kang, JM ; Schaeffer-Filho, AE ; Lahmadi, A ; Madeira, E
Total number of authors: 9
Document type: Scientific article
Source: 2021 IFIP/IEEE INTERNATIONAL SYMPOSIUM ON INTEGRATED NETWORK MANAGEMENT (IM 2021); v. N/A, p. 7-pg., 2021-01-01.
Abstract

Meta-scheduling in lambda grids is often a complex task because it typically comprises the discovery, monitoring, co-allocation, and orchestration of networking and computing resources. Support for advance reservations typically improves the performance of the lambda grid, but it also makes the meta-scheduling process much more complicated. All of these mechanisms must cope with failures that may occur in the optical network or the computing infrastructure. Therefore, in this work, we propose a survivable, distributed grid meta-scheduler based on an Ant Colony Optimization (ACO) algorithm. By using restoration as the recovery mechanism, resilience against link, network node, and server node failures can be achieved. We evaluated the restorability for different combinations of meta- and local scheduling policies and resource co-allocation algorithms under single-link or single-node failures. In addition, we assessed some of the parameters that may influence the restorability against server node failures, where the affected jobs are rescheduled to the remaining nodes of the grid. The results demonstrated that the ACO algorithm is capable of recovering nearly 100% of the jobs affected by link or server node failures for many of the combinations of meta- and local scheduling policies presented for the Server First-Relaxed (SF-R) and Network First (NF) co-allocation algorithms. (AU)

FAPESP grant: 15/24341-7 - New strategies to face the threat of capacity exhaustion
Grantee: Helio Waldman
Support type: Research Grant - Thematic