

Generalizations of the genomic rank distance to indels

Author(s):
Pereira Zanetti, Joao Paulo ; Oliveira, Lucas Peres ; Chindelevitch, Leonid ; Meidanis, Joao
Total number of authors: 4
Document type: Scientific article
Source: Bioinformatics; v. 39, n. 3, 10 pp., 2023-03-01.
Abstract

Motivation: The rank distance model represents genome rearrangements in multi-chromosomal genomes as matrix operations, which allows the reconstruction of parsimonious histories of evolution by rearrangements. We seek to generalize this model by allowing for genomes with different gene content, to accommodate a broader range of biological contexts. We approach this generalization by using a matrix representation of genomes. This leads to simple distance formulas and sorting algorithms for genomes with different gene contents, but without duplications.

Results: We generalize the rank distance to genomes with different gene content in two different ways. The first approach adds insertions, deletions and the substitution of a single extremity to the basic operations. We show how to efficiently compute this distance. To avoid genomes with incomplete markers, our alternative distance, the rank-indel distance, only uses insertions and deletions of entire chromosomes. We construct phylogenetic trees with our distances and the DCJ-Indel distance for simulated data and real prokaryotic genomes, and compare them against reference trees. For simulated data, our distances outperform the DCJ-Indel distance using the Quartet metric as baseline. This suggests that rank distances are more robust for comparing distantly related species. For real prokaryotic genomes, all rearrangement-based distances yield phylogenetic trees that are topologically distant from the reference (65% similarity with Quartet metric), but are able to cluster related species within their respective clades and distinguish the Shigella strains as the farthest relative of the Escherichia coli strains, a feature not seen in the reference tree.
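
To make the matrix representation mentioned in the abstract concrete, here is a minimal sketch. It assumes the usual definition from the earlier rank distance literature, which this record does not spell out: a genome over n gene extremities is a symmetric 0/1 matrix with a 1 at (i, j) and (j, i) for each adjacency and a 1 at (i, i) for each telomere, and the distance between two genomes is the rank of the difference of their matrices. The helper names `genome_matrix`, `matrix_rank` and `rank_distance` are hypothetical. The sketch covers only genomes with the same gene content; it does not implement the indel generalizations proposed in the paper.

```python
from fractions import Fraction


def genome_matrix(n, adjacencies):
    """Build the n x n genome matrix (assumed convention, see lead-in).

    adjacencies: iterable of (i, j) pairs of distinct extremity indices.
    Extremities that appear in no adjacency are treated as telomeres.
    """
    m = [[0] * n for _ in range(n)]
    paired = set()
    for i, j in adjacencies:
        m[i][j] = m[j][i] = 1
        paired.update((i, j))
    for i in range(n):
        if i not in paired:
            m[i][i] = 1  # telomere: a 1 on the diagonal
    return m


def matrix_rank(matrix):
    """Rank over the rationals via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):
            if rows[k][c] != 0:
                f = rows[k][c] / rows[r][c]
                rows[k] = [a - f * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r


def rank_distance(a, b):
    """Rank of the entrywise difference a - b (assumed definition)."""
    diff = [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
    return matrix_rank(diff)


# One gene with extremities 0 and 1: linear chromosome vs. circular chromosome.
linear = genome_matrix(2, [])
circular = genome_matrix(2, [(0, 1)])
print(rank_distance(linear, circular))  # 1

# Two genes (extremities 0..3): adjacency (1, 2) vs. adjacency (1, 3).
g1 = genome_matrix(4, [(1, 2)])
g2 = genome_matrix(4, [(1, 3)])
print(rank_distance(g1, g2))  # 2
```

Exact rational arithmetic is used here so that the rank is not affected by floating-point tolerance; for larger genomes a numerical routine would be the more practical choice.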

FAPESP grant: 17/02748-3 - Advances in Matrix-Based Genome Rearrangement Theory
Grantee: João Paulo Pereira Zanetti
Support type: Scholarships in Brazil - Post-Doctoral
FAPESP grant: 20/00740-8 - Extensions of the rank distance and counting of sorting scenarios
Grantee: Lucas Peres Oliveira
Support type: Scholarships in Brazil - Master's
FAPESP grant: 18/00031-7 - Studies on genome comparison
Grantee: João Meidanis
Support type: Research Grants - Regular