Advanced search
Start date
Betweenand


Signed rearrangement distances considering repeated genes, intergenic regions, and indels

Full text
Author(s):
Siqueira, Gabriel ; Alexandrino, Alexsandro Oliveira ; Dias, Zanoni
Total Authors: 3
Document type: Journal article
Source: JOURNAL OF COMBINATORIAL OPTIMIZATION; v. 46, n. 2, p. 36-pg., 2023-09-01.
Abstract

Genome rearrangement distance problems allow to estimate the evolutionary distance between genomes. These problems aim to compute the minimum number of mutations called rearrangement events necessary to transform one genome into another. Two commonly studied rearrangements are the reversal, which inverts a sequence of genes, and the transposition, which exchanges two consecutive sequences of genes. Seminal works on this topic focused on the sequence of genes and assumed that each gene occurs exactly once on each genome. More realistic models have been assuming that a gene may have multiple copies or may appear in only one of the genomes. Other models also take into account the nucleotides between consecutive pairs of genes, which are called intergenic regions. This work combines all these generalizations defining the signed intergenic reversal distance (SIRD), the signed intergenic reversal and transposition distance (SIRTD), the signed intergenic reversal and indels distance (SIRID), and the signed intergenic reversal, transposition, and indels distance (SIRTID) problems. We show a relation between these problems and the signed minimum common intergenic string partition (SMCISP) problem. From such relation, we derive Theta(k)-approximation algorithms for the SIRD and the SIRTD problems, where k is maximum number of copies of a gene in the genomes. These algorithms also work as heuristics for the SIRID and SIRTID problems. Additionally, we present some parametrized algorithms for SMCISP that ensure constant approximation factors for the distance problems. Our experimental tests on simulated genomes show an improvement on the rearrangement distances with the use of the partition algorithms. (AU)

FAPESP's process: 21/13824-8 - Generalization of string partition problems
Grantee:Gabriel Henriques Siqueira
Support Opportunities: Scholarships in Brazil - Doctorate
FAPESP's process: 13/08293-7 - CCES - Center for Computational Engineering and Sciences
Grantee:Munir Salomao Skaf
Support Opportunities: Research Grants - Research, Innovation and Dissemination Centers - RIDC
FAPESP's process: 17/12646-3 - Déjà vu: feature-space-time coherence from heterogeneous data for media integrity analytics and interpretation of events
Grantee:Anderson de Rezende Rocha
Support Opportunities: Research Projects - Thematic Grants
FAPESP's process: 15/11937-9 - Investigation of hard problems from the algorithmic and structural stand points
Grantee:Flávio Keidi Miyazawa
Support Opportunities: Research Projects - Thematic Grants