Probabilistic annotation of metabolite profiles obtained by liquid chromatography coupled to mass spectrometry

Ricardo Roberto da Silva
Total Authors: 1
Document type: Doctoral Thesis
Press: Ribeirão Preto.
Institution: Universidade de São Paulo (USP). Faculdade de Medicina de Ribeirão Preto (PCARP/BC)
Defense date:
Examining board members:
Ricardo Zorzetto Nicoliello Vencio; Carlos Alberto Labate; Houtan Noushmehr; Carlos Alberto de Braganca Pereira; Antonio Rossi Filho
Advisor: Ricardo Zorzetto Nicoliello Vencio

Metabolomics is an emerging field of the post-genomic era that aims at a comprehensive analysis of small organic molecules in biological systems. Techniques based on liquid chromatography coupled to mass spectrometry (LC-MS) are the most widespread approaches in metabolomics studies. Metabolite detection by LC-MS produces complex data sets that require a series of preprocessing steps before information can be extracted efficiently and accurately. To relate the detected metabolites to alterations in the metabolism of interest, the metabolites sampled by untargeted metabolic profiling must be annotated reliably, and their relationships must be interpreted under the assumption that the sample comes from a connected metabolism. To address this challenge, this thesis developed a software framework whose central component is a probabilistic method for metabolite annotation that allows the incorporation of independent sources of spectral information and prior knowledge about metabolism. After the probabilistic classification, a new method was proposed to represent the posterior probability distribution in the form of a graph. A library of methods for the R environment, called ProbMetab (Probabilistic Metabolomics), was created and made available as open source software. Using ProbMetab to analyze a benchmark data set with compound identities known beforehand, we demonstrated that up to 90% of the correct metabolite identities were among the three most probable candidates. This underlines the value of displaying the posterior probability distribution instead of the simplistic classification by the single most probable candidate that is usually adopted in metabolomics.
In an application to real data, changes in a known metabolic pathway related to abiotic stress in plants (flavone and flavonol biosynthesis) were automatically detected in sugarcane data, demonstrating the importance of a view centered on the posterior distribution of the metabolite annotation network. (AU)
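
The idea of ranking candidates by posterior probability, rather than keeping only the single best match, can be illustrated with a minimal sketch. It is written in plain Python and is not the ProbMetab API. The Gaussian model for the mass error in ppm, the function names, the candidate names, and all numeric values are assumptions made up for illustration; the thesis may use a different likelihood and additional sources of evidence.

```python
import math


def gaussian_likelihood(observed_mz, theoretical_mz, ppm_sd=5.0):
    """Unnormalized likelihood of an observed m/z for one candidate.

    Assumption: the mass error, expressed in ppm, is Gaussian with
    standard deviation ppm_sd. This is an illustrative choice only.
    """
    ppm_error = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
    return math.exp(-0.5 * (ppm_error / ppm_sd) ** 2)


def posterior(observed_mz, candidates, ppm_sd=5.0):
    """Bayes rule over a finite candidate list.

    candidates maps a hypothetical compound name to a tuple
    (theoretical_mz, prior_probability).
    """
    unnormalized = {
        name: prior * gaussian_likelihood(observed_mz, mz, ppm_sd)
        for name, (mz, prior) in candidates.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("no candidate is compatible with the observed mass")
    return {name: weight / total for name, weight in unnormalized.items()}


def top_k(post, k=3):
    """Return the k most probable candidates as (name, probability) pairs."""
    return sorted(post.items(), key=lambda item: item[1], reverse=True)[:k]


# Made-up candidates with a flat prior; prior knowledge about metabolism
# would enter here as non-uniform prior probabilities.
candidates = {
    "cand_A": (300.0000, 0.25),
    "cand_B": (300.0015, 0.25),
    "cand_C": (300.0030, 0.25),
    "cand_D": (300.0300, 0.25),
}

post = posterior(300.0012, candidates)
ranked = top_k(post, k=3)

for name, probability in ranked:
    print(f"{name}: {probability:.3f}")
```

In this toy case three candidates lie within a few ppm of the observed mass, so the posterior is spread across them and reporting only the top one would discard plausible identities. Keeping the top three retains that uncertainty for later interpretation.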