(Reference retrieved automatically from Web of Science through the FAPESP funding information and the corresponding grant number, as included in the publication by the authors.)

Scaling Up Modulo Scheduling for High-Level Synthesis

Author(s):
Rosa, Leandro de Souza [1] ; Bouganis, Christos-Savvas [2] ; Bonato, Vanderlei [1]
Total number of authors: 3
Author affiliation(s):
[1] Univ Sao Paulo, Inst Math & Comp Sci, BR-05508900 Sao Carlos, SP - Brazil
[2] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2AZ - England
Total number of affiliations: 2
Document type: Journal article
Source: IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS; v. 38, n. 5, p. 912-925, MAY 2019.
Web of Science citations: 0
Abstract

High-level synthesis (HLS) tools are increasingly used within the hardware design community to bridge the gap between productivity and the need to design large and complex systems. When targeting heterogeneous systems, where both the CPU and the field-programmable gate array (FPGA) fabric are available to perform computations, a design space exploration (DSE) is usually carried out to decide which parts of the initial code should be mapped to the FPGA fabric, such that the overall system's performance is enhanced by accelerating its computation via dedicated processors. As the targeted systems become larger and more complex, leading to a large DSE, fast estimation of the acceleration that can be obtained by mapping certain functionality onto the FPGA fabric is of paramount importance. Loop pipelining, which is responsible for the majority of HLS compilation time, is a key optimization toward achieving high-performance acceleration kernels. A new modulo scheduling algorithm is proposed, which reformulates the classical modulo scheduling problem and reduces the number of integer linear problems solved, resulting in large computational savings. Moreover, the proposed approach offers a controlled tradeoff between solution quality and computation time. Results show that scalability improves from quadratic, for the state-of-the-art method, to linear, for the proposed approach, while the optimized loops suffer a 1% (geometric mean) increase in the total number of cycles. (AU)
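For readers unfamiliar with the classical problem the abstract refers to: a modulo scheduler looks for the smallest initiation interval (II), the number of cycles between the starts of consecutive loop iterations, and typically tries candidate II values in increasing order from a lower bound. The sketch below computes that lower bound only. It is not the algorithm proposed in the paper; the function names, the `max_ii` cap, and the example loop are hypothetical and chosen purely for illustration.

```python
from math import ceil


def res_mii(op_resource, units):
    """Resource-constrained lower bound on II.

    op_resource maps each operation to the resource type it occupies for
    one cycle; units maps each resource type to the number of instances.
    """
    uses = {}
    for res in op_resource.values():
        uses[res] = uses.get(res, 0) + 1
    return max(ceil(count / units[res]) for res, count in uses.items())


def deps_feasible(ops, edges, ii):
    """Check whether the dependence constraints admit a schedule at this II.

    Each edge (u, v, latency, distance) requires
        start[v] - start[u] >= latency - ii * distance,
    where distance is the number of iterations the dependence spans.
    Longest-path relaxation (Bellman-Ford style): if values still change
    after len(ops) passes, a positive-weight cycle exists and II is too small.
    """
    start = {op: 0 for op in ops}
    for _ in range(len(ops)):
        changed = False
        for u, v, latency, distance in edges:
            bound = start[u] + latency - ii * distance
            if bound > start[v]:
                start[v] = bound
                changed = True
        if not changed:
            return True
    return False


def min_ii(ops, op_resource, units, edges, max_ii=64):
    """Smallest II satisfying both the resource and the recurrence bounds."""
    for ii in range(res_mii(op_resource, units), max_ii + 1):
        if deps_feasible(ops, edges, ii):
            return ii
    return None  # no feasible II within the (assumed) cap


# Hypothetical loop body: load -> mul -> add, with the add feeding the
# next iteration's load (loop-carried dependence, distance 1).
ops = ["load", "mul", "add"]
op_resource = {"load": "mem", "mul": "multiplier", "add": "adder"}
units = {"mem": 1, "multiplier": 1, "adder": 1}
edges = [
    ("load", "mul", 2, 0),
    ("mul", "add", 3, 0),
    ("add", "load", 1, 1),
]

lower_bound = min_ii(ops, op_resource, units, edges)
```

In this example the recurrence (total latency 6 over a distance of 1) dominates the resource bound of 1. A full scheduler would still have to place operations for each candidate II starting from this bound, which is the step where the abstract reports integer linear problems being solved.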

FAPESP grant: 16/13327-6 - Design space exploration in heterogeneous systems for high-performance applications
Grantee: Leandro de Souza Rosa
Support type: Scholarships abroad - Research Internship - Doctorate (Direct)