The sorting permutations problem using prefix and suffix operations
Rearrangement distances in unbalanced genomes considering intergenic regions
Full text | |
Author(s): |
Total Authors: 4
|
Affiliation: | [1] Univ Estadual Campinas, Inst Comp, BR-13083970 Campinas, SP - Brazil
[2] Univ Estadual Campinas, Sch Technol, BR-13484350 Limeira, SP - Brazil
Total Affiliations: 2
|
Document type: | Journal article |
Source: | IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS; v. 18, n. 6, p. 2094-2108, NOV 1 2021. |
Web of Science Citations: | 1 |
Abstract | |
In comparative genomics, one goal is to find similarities between genomes of different organisms. Comparisons using genome features like genes, gene order, and regulatory sequences are carried out with this purpose in mind. Genome rearrangements are mutational events that affect large extensions of the genome. They are responsible for creating extant species with conserved genes in different positions across genomes. Close species - from an evolutionary point of view - tend to have the same set of genes or share most of them. When we consider gene order to compare two genomes, it is possible to use a parsimony criterion to estimate how close the species are. We are interested in the shortest sequence of genome rearrangements capable of transforming one genome into the other, which is named rearrangement distance. Reversal is one of the most studied genome rearrangements events. This event acts in a segment of the genome, inverting the position and the orientation of genes in it. Transposition is another widely studied event. This event swaps the position of two consecutive segments of the genome. When the genome has no gene repetition, a common approach is to map it as a permutation such that each element represents a conserved block. When genomes have replicated genes, this mapping is usually performed using strings. The number of replicas depends on the organisms being compared, but in many scenarios, it tends to be small. In this work, we study the rearrangement distance between genomes with replicated genes considering that the orientation of genes is unknown. We present four heuristics for the problem of genome rearrangement distance with replicated genes. We carry out experiments considering the exclusive use of the reversals or transpositions events, as well as the version in which both events are allowed. We developed a database of simulated genomes and compared our results with other algorithms from the literature. The experiments showed that our heuristics with more sophisticated rules presented a better performance than the known algorithms to estimate the evolutionary distance between genomes with replicated genes. In order to validate the application of our algorithms in real data, we construct a phylogenetic tree based on the distance provided by our algorithm and compare it with a know tree from the literature. (AU) | |
FAPESP's process: | 17/16246-0 - Sensitive media analysis through deep learning architectures |
Grantee: | Sandra Eliza Fontes de Avila |
Support Opportunities: | Regular Research Grants |
FAPESP's process: | 15/11937-9 - Investigation of hard problems from the algorithmic and structural stand points |
Grantee: | Flávio Keidi Miyazawa |
Support Opportunities: | Research Projects - Thematic Grants |
FAPESP's process: | 17/12646-3 - Déjà vu: feature-space-time coherence from heterogeneous data for media integrity analytics and interpretation of events |
Grantee: | Anderson de Rezende Rocha |
Support Opportunities: | Research Projects - Thematic Grants |