

Parallelizing Git Checkout: a Case Study of I/O Parallelism

Author(s):
Bernardino, Matheus Tavares; Goldman, Alfredo; IEEE Comp Soc
Total number of authors: 3
Document type: Scientific article
Source: 2022 IEEE 34th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2022); v. N/A, p. 12-pg., 2022-01-01.
Abstract

Version control systems (VCS) are tools used to track and manage the changes made to a set of files over time. Among the VCS tools available today, Git has become the most popular for software development. Because Git is used both in small personal projects of a few megabytes and in massive corporate repositories with more than 300 GB and 3.5 million files, speed and scalability are among the tool's top priorities. However, its performance sometimes falls short on networked file systems (e.g. NFS), where input and output (I/O) operations tend to be more costly. This is particularly true of the checkout command, which is responsible for restoring files from specific versions of a project. Despite the optimizations implemented over the years, the sequential processing of files still carried a large time penalty on NFS and was suboptimal for local file systems on SSDs. In this project, we parallelized the Git checkout machinery, obtaining speedups of up to 4.5x on NFS and 3.6x on SSDs. We also studied how parallelism affects the I/O requests performed by checkout on different storage systems. The optimization was submitted upstream and has been available to all Git users since version 2.32.0, released in June 2021.
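
The general idea of I/O parallelism described in the abstract can be illustrated with a minimal Python sketch. This is not Git's implementation, and it reproduces none of the paper's measurements: it only shows file writes being distributed over a pool of workers instead of being issued one after another. The names `write_file` and `parallel_write`, the use of threads, and the default of 4 workers are assumptions made for illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_file(root, rel_path, data):
    """Write one file under root, creating parent directories as needed."""
    full_path = os.path.join(root, rel_path)
    os.makedirs(os.path.dirname(full_path), exist_ok=True)
    with open(full_path, "wb") as f:
        f.write(data)
    return rel_path


def parallel_write(root, entries, workers=4):
    """Write every (rel_path, data) pair using a pool of worker threads.

    Each write is an independent I/O request, so several can be in flight
    at once; this matters most when each request has high latency.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(write_file, root, path, data)
                   for path, data in entries]
        # result() re-raises any exception that occurred in a worker.
        return sorted(f.result() for f in futures)


if __name__ == "__main__":
    # Hypothetical file set standing in for the files a checkout would restore.
    entries = [(f"dir{i % 3}/file{i}.txt", f"content {i}\n".encode())
               for i in range(20)]
    with tempfile.TemporaryDirectory() as root:
        written = parallel_write(root, entries, workers=4)
        print(len(written), "files written")
```

In Git itself, the number of parallel checkout workers is set through the `checkout.workers` configuration option, which defaults to sequential execution.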

FAPESP grant: 19/26702-8 - Trends in high-performance computing, from resource management to new computer architectures
Grantee: Alfredo Goldman vel Lejbman
Type of support: Research Grant - Thematic Project