| Author(s): | Valeriano, Maria Gabriela; Marzagao, David Kohan; Montelongo, Alfredo; Kiffer, Carlos Roberto Veiga; Katz, Natan; Lorena, Ana Carolina (total number of authors: 6) |
| Document type: | Scientific article |
| Source: | MACHINE LEARNING; v. 115, n. 1, p. 40-pg., 2026-01-06. |
| Abstract | |
Machine Learning (ML) models are widely used in high-stakes domains such as healthcare, where the reliability of predictions is critical. However, these models often fail to account for uncertainty, providing predictions even with low confidence. This work proposes a novel two-step data-centric approach to enhance the performance of ML models by improving data quality and filtering low-confidence predictions. The first step involves leveraging Instance Hardness (IH) to filter problematic instances during training, thereby refining the dataset. The second step introduces a confidence-based rejection mechanism during inference, ensuring that only reliable predictions are retained. We evaluate our approach using three real-world healthcare datasets, demonstrating its effectiveness at improving model reliability while balancing predictive performance and rejection rate. Additionally, we use alternative criteria (influence values for filtering and uncertainty for rejection) as baselines to evaluate the efficiency of the proposed method. The results demonstrate that integrating IH filtering with confidence-based rejection effectively enhances model performance while preserving a large proportion of instances. This approach provides a practical method for deploying ML systems in safety-critical applications. (AU) | |
| FAPESP process: | 21/06870-3 - Beyond algorithm selection: meta-learning for the analysis and understanding of data and algorithms |
| Grantee: | Ana Carolina Lorena |
| Support type: | Research Grant - Young Investigators - Phase 2 |
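
The two steps described in the abstract (Instance Hardness filtering at training time, confidence-based rejection at inference time) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that hardness is estimated as one minus the mean probability that a pool of models assigns to an instance's true class, and that confidence is the largest predicted class probability. The function names, thresholds, and probability values below are all hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence


def instance_hardness(true_class_probs: Sequence[Sequence[float]]) -> List[float]:
    """Assumed hardness estimate: 1 minus the mean probability that a pool of
    models assigns to each instance's true class (one inner sequence per instance)."""
    return [1.0 - sum(probs) / len(probs) for probs in true_class_probs]


def filter_by_hardness(hardness: Sequence[float], threshold: float) -> List[int]:
    """Step 1 (training): keep the indices of instances whose hardness
    does not exceed the (hypothetical) threshold."""
    return [i for i, h in enumerate(hardness) if h <= threshold]


def predict_with_rejection(
    class_probs: Sequence[Sequence[float]], confidence_threshold: float
) -> List[Optional[int]]:
    """Step 2 (inference): return the most probable class index, or None
    (a rejection) when its probability is below the confidence threshold."""
    decisions: List[Optional[int]] = []
    for probs in class_probs:
        best = max(range(len(probs)), key=probs.__getitem__)
        decisions.append(best if probs[best] >= confidence_threshold else None)
    return decisions


def rejection_rate(decisions: Sequence[Optional[int]]) -> float:
    """Fraction of instances for which no prediction was issued."""
    return sum(d is None for d in decisions) / len(decisions)


# Illustrative values only: probabilities of the true class from two models,
# for three training instances. The second instance is the hardest.
train_true_class_probs = [[0.9, 0.8], [0.2, 0.3], [0.6, 0.7]]
hardness = instance_hardness(train_true_class_probs)
kept = filter_by_hardness(hardness, threshold=0.5)

# Illustrative class probabilities for three test instances.
test_class_probs = [[0.95, 0.05], [0.55, 0.45], [0.10, 0.90]]
decisions = predict_with_rejection(test_class_probs, confidence_threshold=0.8)

print("kept training indices:", kept)
print("decisions:", decisions)
print("rejection rate:", rejection_rate(decisions))
```

In this sketch a model would be retrained on the kept indices before the rejection step is applied. Raising either threshold trades coverage against reliability, which is the balance between predictive performance and rejection rate that the abstract refers to.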