Genome comparison problems

Author(s):
Ulisses Martins Dias
Total Authors: 1
Document type: Doctoral Thesis
Press: Campinas, SP.
Institution: Universidade Estadual de Campinas (UNICAMP). Instituto de Computação
Defense date:
Examining board members:
Zanoni Dias; João Carlos Setubal; Nalvo Franco de Almeida Júnior; João Meidanis; Guilherme Pimentel Telles
Advisor: Zanoni Dias
Abstract

In this PhD thesis, we work on three aspects of genome comparison: first, transposition events; second, inversion and almost-symmetric inversion events; third, whole-genome distance measures that are not tied to any specific kind of rearrangement event. The study of transposition events (first aspect) allowed us to create a new 1.375-approximation algorithm and some exact models using constraint logic programming. We compared these approaches with other published methods, and ours performed best in all cases. The second aspect of this thesis concerns inversion and almost-symmetric inversion events. In this regard, we developed a simulation tool for the study of symmetric inversions in bacterial genomes. Through this work we were able to contribute to the understanding of the evolutionary differentiation process in species of the following groups: the Pseudomonadaceae family, the Xanthomonas genus, the Shewanella genus, and the Mycobacterium genus. We used the knowledge acquired in building our simulation tool to establish a method that uses inversion signatures to generate draft genome sequence scaffolds using a complete genome as a reference. Apart from the practical applications of this research, we contribute to the computer science field by providing a theoretical framework for the almost-symmetric distance problem that can be improved in the future and can serve as a basis for approximation and heuristic algorithms. This framework comprises a greedy algorithm for any permutation, exact algorithms for specific families of permutations, and several lemmas and conjectures related to these problems. The third and last aspect of this thesis addresses the need for methods that can quickly and effectively compare large sets of genome sequences. We propose two new methods for efficiently determining whole-genome sequence distance measures. One of them is aimed at comparing closely related genomes, and the other is meant to compare more distant genomes.
Both measures were evaluated to determine their limitations and their efficacy. It is our hope that this work represents a contribution to knowledge of the genome comparison field in general, and of the genome rearrangement field in particular. (AU)