| Author(s): | Passos, Leandro Aparecido S.; Jodas, Danilo S.; Ribeiro, Luiz C. F.; Akio, Marco; De Souza, Andre Nunes; Papa, Joao Paulo. Total Authors: 6 |
| Document type: | Journal article |
| Source: | KNOWLEDGE-BASED SYSTEMS; v. 242, p. 13-pg., 2022-04-22. |
| Abstract | |
In the last decade, machine learning-based approaches became capable of performing a wide range of complex tasks sometimes better than humans, demanding a fraction of the time. Such an advance is partially due to the exponential growth in the amount of data available, which makes it possible to extract trustworthy real-world information from them. However, such data is generally imbalanced since some phenomena are more likely than others. Such a behavior yields considerable influence on the machine learning model's performance since it becomes biased on the more frequent data it receives. Despite the considerable amount of machine learning methods, a graph-based approach has attracted considerable notoriety due to the outstanding performance over many applications, i.e., the Optimum-Path Forest (OPF). In this paper, we propose three OPF-based strategies to deal with the imbalance problem: the O²PF and the OPF-US, which are novel approaches for oversampling and undersampling, respectively, as well as a hybrid strategy combining both approaches. The paper also introduces a set of variants concerning the strategies mentioned above. Results compared against several state-of-the-art techniques over public and private datasets confirm the robustness of the proposed approaches. © 2022 Elsevier B.V. All rights reserved. (AU) | |
| FAPESP's process: | 18/21934-5 - Network statistics: theory, methods, and applications |
| Grantee: | André Fujita |
| Support Opportunities: | Research Projects - Thematic Grants |
| FAPESP's process: | 14/12236-1 - AnImaLS: Annotation of Images in Large Scale: what can machines and specialists learn from interaction? |
| Grantee: | Alexandre Xavier Falcão |
| Support Opportunities: | Research Projects - Thematic Grants |
| FAPESP's process: | 20/12101-0 - Support for computational environments and experiments execution: data acquisition, categorization and maintenance |
| Grantee: | Leandro Aparecido Passos Junior |
| Support Opportunities: | Scholarships in Brazil - Technical Training Program - Technical Training |
| FAPESP's process: | 19/18287-0 - Real-time Urban Forest Management Using Machine Learning |
| Grantee: | Danilo Samuel Jodas |
| Support Opportunities: | Scholarships in Brazil - Post-Doctoral |
| FAPESP's process: | 19/07665-4 - Center for Artificial Intelligence |
| Grantee: | Fabio Gagliardi Cozman |
| Support Opportunities: | Research Grants - Research Program in eScience and Data Science - Research Centers in Engineering Program |
| FAPESP's process: | 17/02286-0 - Probabilistic models for commercial losses detection |
| Grantee: | André Nunes de Souza |
| Support Opportunities: | Regular Research Grants |
| FAPESP's process: | 13/07375-0 - CeMEAI - Center for Mathematical Sciences Applied to Industry |
| Grantee: | Francisco Louzada Neto |
| Support Opportunities: | Research Grants - Research, Innovation and Dissemination Centers - RIDC |