Novelty detection in multi-label data streams classification
Grant number: | 23/08406-8 |
Support Opportunities: | Scholarships in Brazil - Doctorate (Direct) |
Start date: | August 01, 2023 |
End date: | January 31, 2028 |
Field of knowledge: | Physical Sciences and Mathematics - Computer Science - Computing Methodologies and Techniques |
Principal Investigator: | Ricardo Cerri |
Grantee: | Hiago Freire Oliveira |
Host Institution: | Instituto de Ciências Matemáticas e de Computação (ICMC). Universidade de São Paulo (USP). São Carlos, SP, Brazil |
Associated research grant: | 22/02981-8 - Novelty detection in multi-label data streams classification, AP.PNGP.PI |
Abstract: | In continuous data streams, classification faces challenges that batch learning does not, such as concept drift and concept evolution, which makes novelty detection a necessary task. Multi-label classification in data streams, still little investigated, brings further challenges, especially under infinite label latency, in which true labels never become available. Ensembles of classifiers, which combine individual methods, aim to improve the quality of the final prediction. Combined with unsupervised novelty detection techniques, they may therefore bring innovations and advantages to multi-label classification in data streams with infinite label latency, particularly for more complex problems. This project aims to develop a multi-label novelty detection method capable of identifying, distinguishing, and adapting to concept drifts and concept evolutions in scenarios with infinite label latency. For this purpose, synthetic data with different multi-label characteristics will be generated, and real-world multi-label datasets will be adapted for data stream processing. Validation will consider classification quality, concept change detection quality, concept evolution detection quality, noise detection quality, and computational cost in time and memory. Several existing methods that only partially meet this objective will be adapted and compared in order to fully achieve it. The evaluation methodology needs to i) penalize the classifier when too many novelty patterns are identified to represent a single class, ii) penalize novelty patterns that are created before concept evolutions occur, and iii) be able to associate novelty patterns with multiple classes of the problem. (AU) |
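The three evaluation requirements listed at the end of the abstract can be illustrated with a small sketch. This is not the project's methodology, only one possible reading of it under stated assumptions: the names `NoveltyPattern`, `associate`, `evaluate`, `min_share`, and `emerged_at` are hypothetical, true labels are assumed to be available offline for evaluation only, and a pattern is assumed to represent every class that appears in at least a fixed share of the examples it absorbed.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class NoveltyPattern:
    # Hypothetical container: stream position at which the pattern was created,
    # plus the true label sets of the examples it absorbed (offline evaluation only).
    created_at: int
    label_sets: list


def associate(pattern, min_share=0.5):
    """Requirement iii: link a pattern to every class found in at least
    `min_share` of its examples, so one pattern may map to several classes."""
    n = len(pattern.label_sets)
    if n == 0:
        return set()
    counts = Counter(label for labels in pattern.label_sets for label in set(labels))
    return {label for label, c in counts.items() if c / n >= min_share}


def evaluate(patterns, emerged_at, min_share=0.5):
    """`emerged_at` maps each novel class to the stream position where it first appears."""
    per_class = Counter()
    premature = 0
    for p in patterns:
        classes = associate(p, min_share)
        per_class.update(classes)
        onsets = [emerged_at[c] for c in classes if c in emerged_at]
        # Requirement ii: count patterns created before any of their classes emerged.
        if onsets and p.created_at < min(onsets):
            premature += 1
    # Requirement i: every pattern beyond the first for a class counts as excess.
    excess = sum(c - 1 for c in per_class.values() if c > 1)
    return {"excess_patterns": excess, "premature_patterns": premature}


patterns = [
    NoveltyPattern(120, [{"a", "b"}, {"a"}, {"a", "b"}]),
    NoveltyPattern(150, [{"a"}, {"a"}]),
    NoveltyPattern(40, [{"c"}, {"c"}, {"b", "c"}]),
]
result = evaluate(patterns, emerged_at={"a": 100, "b": 110, "c": 90})
print(result)
```

In this toy input, class "a" is represented by two patterns (one excess pattern) and the third pattern is created at position 40, before class "c" emerges at position 90 (one premature pattern). How such counts would be turned into penalties on a final score is left open here, since the abstract does not specify it.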