

Application of chemoinformatic tools in the study of plant metabolic profiles and dereplication

Author(s):
Tiago Branquinho Oliveira
Total Authors: 1
Document type: Doctoral Thesis
Press: Ribeirão Preto.
Institution: Universidade de São Paulo (USP). Faculdade de Ciências Farmacêuticas de Ribeirão Preto (PCARP/BC)
Defense date:
Examining board members:
Fernando Batista da Costa; Leonardo Gobbo Neto; Edenir Rodrigues Pereira Filho; Marcus Tullius Scotti
Advisor: Fernando Batista da Costa
Abstract

With the advent of computing and its application to chemistry, information on substances from natural sources can be stored in databases. This creates an opportunity to use natural product databases together with chemoinformatic tools, such as quantitative structure-retention relationship (QSRR) studies, to speed up the identification of substances in metabolomic studies. This work proposes three QSRR studies and the construction of a database (AsterDB) of chemical structures from the Asteraceae family with related information (e.g., botanical and taxonomic occurrence, biological activity, and analytical data), aiming to assist the dereplication of substances in plant extracts. The first study was carried out with 39 sesquiterpene lactones (STLs) analysed using two solvent systems (MeOH-H2O 55:45 and MeCN-H2O 35:65), three groups of structural descriptors (2D-descr, 3D-1conf, and 3D-weigh), two training/test splits (26:13 and 29:10), four descriptor-selection algorithms (best first, LFS, greedy stepwise, and GA), three model sizes (four, five, and six descriptors), and two modelling methods (PLS and ANN). The second study used 50 compounds from different chemical classes to assess the differences between compounds analysed individually and in mixture, on three different instruments and with two chromatographic methods. The third study used 2,635 chemical structures, with an external test set common to all models (25%, n = 656), three methods for partitioning the training and test sets (based on the response and on 2D and 3D predictors), three model sizes selected by GA, and two modelling methods (MLR and BrNN). AsterDB was designed to be populated gradually and currently holds about 2,000 chemical structures.
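To give a sense of the scale of the first study's design, the factors listed above can be enumerated as a grid. This is only a sketch: it assumes every level was combined with every other (a full factorial design), which the abstract does not state, and the dictionary keys are illustrative names rather than terms from the thesis.

```python
from itertools import product

# Factor levels as listed in the abstract for the first QSRR study.
# The key names are hypothetical labels chosen for this sketch.
factors = {
    "solvent_system": ["MeOH-H2O 55:45", "MeCN-H2O 35:65"],
    "descriptor_group": ["2D-descr", "3D-1conf", "3D-weigh"],
    "train_test_split": ["26:13", "29:10"],
    "selection_algorithm": ["best first", "LFS", "greedy stepwise", "GA"],
    "model_size": [4, 5, 6],
    "modelling_method": ["PLS", "ANN"],
}

# Assumption: a full factorial design, i.e. every combination of levels.
configurations = [
    dict(zip(factors, combo)) for combo in product(*factors.values())
]
n_configurations = len(configurations)

# Both split ratios account for all 39 sesquiterpene lactones.
split_totals = [
    sum(int(part) for part in split.split(":"))
    for split in factors["train_test_split"]
]

print(n_configurations)  # 2 * 3 * 2 * 4 * 3 * 2 = 288
print(split_totals)      # [39, 39]
```

Under that assumption the grid holds 288 candidate model configurations; the number actually built in the thesis may differ.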
The first QSRR study generated good models, able to estimate the logarithm of the retention factor (logk) of STLs with P2 > 0.81 for the MeCN-H2O system. The second study showed no statistical difference between substances analysed individually and in mixture (p-value > 0.95), and the correlation between the two chromatographic methods and the instruments used was reproducible (R > 0.95). These analyses showed that it is possible to develop QSRR models for one chromatographic method and instrument and to transfer them to other instruments by using substances in common. The third study produced models with good predictive capacity (P2 > 0.81) across a wide range of chemical space and with statistical accuracy. In conclusion, this information can be used as a pilot platform for data analysis to assist plant dereplication in metabolomics studies. (AU)
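As a rough illustration of what a QSRR model of logk involves, the sketch below computes the log retention factor from retention times, fits a multiple linear regression (MLR) on descriptors, and scores it on an external test set. Everything here is an assumption made for illustration: the data are synthetic, the descriptors are random numbers, the function names are hypothetical, and the external R^2 formula is one common definition that may not match the P2 statistic reported in the thesis.

```python
import numpy as np


def log_retention_factor(t_r, t_0):
    """Return log10(k), where k = (t_R - t_0) / t_0 is the retention factor."""
    t_r = np.asarray(t_r, dtype=float)
    return np.log10((t_r - t_0) / t_0)


def fit_mlr(X, y):
    """Ordinary least-squares fit; returns [intercept, coefficients...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict the response from descriptors X using fitted coefficients."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]


def external_r2(y_test, y_pred, y_train_mean):
    """1 - (squared prediction error) / (squared deviation from training mean).

    Assumed definition of an external predictive R^2; not taken from the thesis.
    """
    error = np.sum((y_test - y_pred) ** 2)
    spread = np.sum((y_test - y_train_mean) ** 2)
    return 1.0 - error / spread


# Synthetic stand-in data: 39 "compounds" with 4 made-up descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(39, 4))
true_coef = np.array([0.4, -0.3, 0.2, 0.1])
log_k = 0.5 + X @ true_coef + rng.normal(scale=0.05, size=39)

# 29:10 training/test split, one of the two ratios mentioned in the abstract.
X_train, X_test = X[:29], X[29:]
y_train, y_test = log_k[:29], log_k[29:]

coef = fit_mlr(X_train, y_train)
r2_ext = external_r2(y_test, predict_mlr(coef, X_test), y_train.mean())
print(f"external R^2 on synthetic data: {r2_ext:.3f}")
```

In practice, `log_retention_factor` would be applied to measured retention times (with the column dead time as `t_0`) to build the response vector, and the descriptor matrix would come from a descriptor-calculation tool rather than a random generator.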

FAPESP's process: 11/17860-7 - Application of chemoinformatic tools in the study of plant metabolic profiles and dereplication
Grantee: Tiago Branquinho Oliveira
Support Opportunities: Scholarships in Brazil - Doctorate