Advanced search
Start date
Betweenand


Métodos de reconstrução filogenética

Full text
Author(s):
João Paulo Pereira Zanetti
Total Authors: 1
Document type: Doctoral Thesis
Press: Campinas, SP.
Institution: Universidade Estadual de Campinas (UNICAMP). Instituto de Computação
Defense date:
Examining board members:
João Meidanis; Maria Emilia Machado Telles Walter; Cleber Valgas Gomes Mira; Zanoni Dias; Ulisses Martins Dias
Advisor: João Meidanis
Abstract

Phylogenetic reconstruction is the field of computational biology dedicated to infer evolutionary history from given extant biologic data. In this PhD thesis, we present three works on different issues related to the topic. The first aims to reconstruct the common ancestor to three input genomes. Despite the apparently limited input, this problem can be applied to determine ancestors in trees whose topology is already known. Our work introduces a way to represent genomes using matrices. From this, we define the matrix median problem, which consists of, given three matrices A, B and C, find a matrix M that minimizes the sum d(A,M) + d(B,M) + d(C,M), where d(X,Y) is the rank of the matrix Y-X. To solve this problem, we present an approximate algorithm that ensures results at least as good as one of the input genomes, but that, in the experiments simulating the evolution of genomes, got much closer to optimal results. Furthermore, we also show a heuristic that translates any matrix, like the ones returned by our algorithm, back to a genome. In the second, the goal is to reconstruct ancestral genomes from the evolutionary histories of their respective gene families. For this, we extend the dynamic programming scheme of DeCo, that builds forests depicting the evolution of adjacencies. Our implementation, instead of choosing just the adjacencies of one most parsimonious solution, explores the entire solution space, choosing adjacencies in function of their larger frequency. Using this approach, we managed significantly reduce the number of inconsistencies while evaluating multiple instances on the same genes. Finally, we discuss a tool to assist in the reconstruction of ancient genomes, considering the genomes of descendant species, without the need for the histories of all the gene families. This tool is the data structure called PQR-tree, which not only detects whether the entry has the consecutive ones property, but also the obstacles that possibly prevent the instance from having said property. This is important because, while reconstructing ancient genomes, entry errors are very common and can very easily prevent the instance from having the consecutive ones property. Here, we consolidate the knowledge on the online and offline algorithms for PQR trees in a single paper. With these, we moved knowledge in phylogenetic reconstruction and computational biology forward in three different directions (AU)

FAPESP's process: 12/13865-7 - Algebraic Model for Genome Rearrangements
Grantee:João Paulo Pereira Zanetti
Support Opportunities: Scholarships in Brazil - Doctorate