Advanced search
Start date
Betweenand

Genetic programming for evolving decision tree induction algorithms

Grant number: 10/20255-5
Support type:Research Grants - Young Investigators Grants
Duration: September 01, 2011 - August 31, 2015
Field of knowledge:Physical Sciences and Mathematics - Computer Science
Principal Investigator:Márcio Porto Basgalupp
Grantee:Márcio Porto Basgalupp
Home Institution: Instituto de Ciência e Tecnologia (ICT). Universidade Federal de São Paulo (UNIFESP). Campus São José dos Campos. São José dos Campos , SP, Brazil
Assoc. researchers: Alex Alves Freitas ; André Carlos Ponce de Leon Ferreira de Carvalho ; Marcos Gonçalves Quiles ; Rodrigo Coelho Barros ; Vili Podgorelec
Associated grant(s):15/05218-0 - Evolving decision-tree induction algorithms with a multi-objective hyper-heuristic using the Pareto dominance approach, AV.EXT
Associated scholarship(s):15/13245-7 - Incorporating multi-test decision trees into the HEAD-DT algorithm, BE.PQ
14/21737-4 - Identification of different alternatives for building blocks of genetic programming algorithm for evolving decision tree induction algorithms, BP.IC

Abstract

Decision trees are a well-known and widely-used data mining technique for classification. Decision tree induction algorithms usually rely on a greedy top-down divide-and-conquer strategy for tree growth. There are at least two major problems related to this strategy: (i) greedy strategies usually provide local rather than global optimal solutions, (ii) recursive partitioning iteratively degrades the dataset quality and thus the quality of the obtained results. Different approaches were suggested to handle with these difficulties, such as Option Trees, ensembles (e.g. bagging, boosting) and evolutionary algorithms. Nevertheless, the previously mentioned evolutionary algorithms are used to evolve decision trees which are specific to each classification problem, i.e., they do not evolve decision tree induction algorithms capable of inducing trees for any classification problem. Genetic programming (GP), a branch of evolutionary algorithms, is a suitable tool for evolving computer programs. A GP evolved program can provide similar solutions to those developed by humans, but also something completely different and perhaps better. Hence, this project aims to study GP in order to make use of its features for evolving decision tree induction algorithms. In a nut shell, the solution provided by the GP is a decision tree induction algorithm capable of solving any classification problem rather than a single decision tree tailored according to a specific dataset. With that in mind, this project is actually proposing the development of a meta-learning algorithm, since it is an algorithm that learns a learning algorithm. (AU)