Advanced search
Start date
Betweenand


Origin of recent genes, an approach with deteriorated PSSMs and protein domain architectures

Full text
Author(s):
Diego Trindade de Souza
Total Authors: 1
Document type: Doctoral Thesis
Press: São Paulo.
Institution: Universidade de São Paulo (USP). Instituto de Matemática e Estatística (IME/SBI)
Defense date:
Examining board members:
Sergio Russo Matioli; João Marcelo Pereira Alves; Marcelo Ribeiro da Silva Briones; Francisco Prosdocimi; João Carlos Setubal
Advisor: Sergio Russo Matioli
Abstract

The origin of new genes is an important process for the evolution of organisms because they provide singular sources for evolutionary innovation. The approaches that show how these new genes arise and acquire new functions in the course of evolution are of fundamental importance: they can help to correlate mutations with metabolic, physiological, and morphological changes, indicating which mutations are likely to be functionally important. Recently, a new approach, named phylostratigraphy, was developed to establish the evolutionary origin of the genes. In this method the emergence of novel sequences at a particular phylogenetic node in a descendent-ancestor lineage is infer usually by using the similarity search algorithm. In the present work, we did a phylostratigraphical search of two protein domain databases, CATH and Pfam, for all human entries and depicted the origin of human domains and architectures. We also conducted a new approach to refine results by Male-PSI-BLAST in a case study of prions and ADH\'s domains. The analysis of two databases showed that there are three important periods of appearance of human gene domains: the origin of cellular organism, Eukaryote, and Euteleostomi appear to be important for production of new genes at the ancestor-descendent lineages that lead to the human species. However, when we analyze the appearance of architectures, they are by far more recent than the appearance of domains, although they might contain very ancient domains mixed with recent ones. We did not notice a bias of addition of new domains to architectures comprising either ancient or recent domains. To measure the degree of domain versatility, we used the weighted bigram frequency, where bigram is defined as a specific combination of two adjacent domains. The Spearman correlation test showed that there is a low negative correlation between the age of domains and versatility indexes. In the study of case, we have demonstrated that it is possible to characterize evolutionary ruptures in results of Male-PSI- BLAST between domains that emerged from others. For the first time the origin of all protein domains and architectures was depicted, providing an evolutionary scenario that can be further explored. (AU)

FAPESP's process: 13/06592-7 - Origin of recent genes, an approach with deteriorated PSSM matrices and protein domain architectures
Grantee:Diego Trindade de Souza
Support Opportunities: Scholarships in Brazil - Doctorate