| Grant number: | 15/01382-0 |
| Support Opportunities: | Scholarships in Brazil - Doctorate |
| Start date: | October 01, 2016 |
| End date: | November 30, 2020 |
| Field of knowledge: | Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques |
| Principal Investigator: | André Carlos Ponce de Leon Ferreira de Carvalho |
| Grantee: | Victor Hugo Barella |
| Host Institution: | Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil |
| Associated research grant: | 13/07375-0 - CeMEAI - Center for Mathematical Sciences Applied to Industry., AP.CEPID |
| Associated scholarship(s): | 19/13015-2 - Meta-Learning Applied to Imbalanced Datasets Using Data Complexity Measures, BE.EP.DR |
Abstract Data pre-processing is one of the most important steps in the data mining process, and one of the most neglected. Data collection may suffer from manual errors and equipment problems, producing inconsistent, noisy or missing data. Other aspects, such as class imbalance and overlapping classes, may also hinder the analysis. Ignoring these aspects in the learning process can impair the induction of a suitable model, as traditional machine learning algorithms have difficulty inducing a good model in these contexts. Furthermore, most of these problems are commonly handled independently, even though they are interrelated. The aim of this PhD project is to analyze and address noise, class imbalance, overlapping classes and high dimensionality in an integrated manner, observing the relations between them. Data with these characteristics are often found in Molecular Biology; thus, the project considers using molecular biology data in the analysis. | |