Neural Network and Genetic Algorithms in the Chemosystematic study of Asteraceae Family

Author(s):
Mauro Vicentini Correia
Total Authors: 1
Document type: Master's Dissertation
Press: São Paulo.
Institution: Universidade de São Paulo (USP). Conjunto das Químicas (IQ e FCF) (CQ/DBDCQ)
Defense date:
Examining board members:
Vicente de Paulo Emerenciano; Marcelo José Pena Ferreira; Maria Auxiliadora Coelho Kaplan
Advisor: Vicente de Paulo Emerenciano
Abstract

In this study, two artificial intelligence methods (neural networks and genetic algorithms) were used to carry out a chemosystematic study of the Asteraceae family. Asteraceae is one of the largest angiosperm families, with about 24,000 species. Its species produce a great diversity of secondary metabolites, notably terpenoids, polyacetylenes, flavonoids and coumarins. To better understand the chemical diversity of the family, a database was built with the occurrences of twelve classes of metabolites (monoterpenes, sesquiterpenes, sesquiterpene lactones, diterpenes, triterpenes, coumarins, flavonoids, polyacetylenes, benzofurans, benzopyrans, acetophenones and phenylpropanoids) produced by species of the family. Three different studies were conducted from this database. In the first, using Kohonen self-organizing maps and the chemical data classified according to two of the most recent phylogenies of the family, it was possible to successfully separate the tribes and genera of the Asteraceae family. The results also indicated that the chemical information agrees better with the phylogeny of Funk (Funk et al. 2009) than with that of Bremer (Bremer 1994, 1996). The second study aimed to create models that predict the number of occurrences of the twelve classes of metabolites using a multilayer perceptron trained with the error backpropagation algorithm; its results were unsatisfactory. Although training gave satisfactory results for some classes of metabolites, the test phase showed that the models cannot make predictions for data they were not exposed to during training, and are therefore not suitable for prediction. Finally, the third study created linear regression models using a genetic algorithm for variable selection. 
This study indicated that monoterpenes and sesquiterpenes are closely related biosynthetically, and that there are also biosynthetic relationships between monoterpenes and diterpenes and between sesquiterpenes and triterpenes. (AU)
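
For readers unfamiliar with the third approach, the following is a minimal, illustrative sketch of genetic-algorithm variable selection for a linear regression model. It is not the dissertation's implementation: the function names (`fit_rss`, `ga_select`), the penalized residual-sum-of-squares cost, the truncation selection, the one-point crossover and all parameter values are assumptions chosen only to keep the example short, and the data are synthetic.

```python
import random


def fit_rss(X, y, cols):
    """Residual sum of squares of a least-squares fit (with intercept) on the chosen columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(rows[0])
    # Augmented normal equations [R^T R | R^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        if abs(A[p][i]) < 1e-12:
            return float("inf")  # singular design: treat as an unusable model
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    coef = [A[i][k] / A[i][i] for i in range(k)]
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, y))


def ga_select(X, y, n_generations=40, pop_size=30, penalty=1.0,
              mutation_rate=0.1, seed=0):
    """Return the indices of the variables kept by the best individual found."""
    rng = random.Random(seed)
    n_vars = len(X[0])

    def cost(mask):
        # Assumed cost: fit error plus a fixed penalty per selected variable.
        cols = [i for i, bit in enumerate(mask) if bit]
        return fit_rss(X, y, cols) + penalty * len(cols)

    # Each individual is a boolean mask over the candidate variables.
    population = [[rng.random() < 0.5 for _ in range(n_vars)]
                  for _ in range(pop_size)]
    for _ in range(n_generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability (mutation).
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    best = min(population, key=cost)
    return [i for i, bit in enumerate(best) if bit]


if __name__ == "__main__":
    # Synthetic data: only columns 0 and 3 actually influence y.
    rng = random.Random(1)
    X = [[rng.uniform(-1, 1) for _ in range(6)] for _ in range(60)]
    y = [3 * r[0] - 2 * r[3] + rng.gauss(0, 0.1) for r in X]
    print("selected variables:", ga_select(X, y))
```

In a chemosystematic setting, the columns of `X` could hold occurrence counts of some metabolite classes and `y` the counts of another class, so that the variables retained by the search hint at which classes co-vary. That mapping is an interpretation of the abstract, not a description of the author's exact procedure.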