Even small correlation and diversity shifts pose dataset-bias issues

Author(s):
Bissoto, Alceu ; Barata, Catarina ; Valle, Eduardo ; Avila, Sandra
Total Authors: 4
Document type: Journal article
Source: PATTERN RECOGNITION LETTERS; v. 179, p. 7-pg., 2024-02-07.
Abstract

Distribution shifts hinder the deployment of deep learning in real-world problems. Distribution shifts appear when train and test data come from different sources, which commonly happens in practice. Despite shifts occurring concurrently in many forms (e.g., correlation and diversity shifts) and intensities, the literature focuses only on severe and isolated shifts. In this work, we propose a comprehensive examination of distribution shifts across different intensity levels, investigating the nuanced impacts of both mild and severe shifts on the learning process and assessing the interplay between correlation and diversity shifts. We train models in three different scenarios considering synthetic and real correlation and diversity shifts, spanning eight different levels of correlation shift, and evaluate them in both in-distribution and diversity-shifted test sets. Our experiments reveal three major findings: (1) Even small correlation shifts pose dataset-bias issues, presenting a risk of accumulating and combining unaccountable weak biases; (2) Models learn robust features in high- and low-shift scenarios but prefer spurious ones during test regardless; (3) Diversity shift can attenuate the reliance on spurious correlations. Our work has implications for distribution shift research and practice, providing new insights into how models learn and rely on spurious correlations under different types and intensities of shifts. (AU)

FAPESP's process: 13/08293-7 - CCES - Center for Computational Engineering and Sciences
Grantee: Munir Salomao Skaf
Support Opportunities: Research Grants - Research, Innovation and Dissemination Centers - RIDC
FAPESP's process: 19/19619-7 - Generating unlimited skin lesion images with generative adversarial networks
Grantee: Alceu Emanuel Bissoto
Support Opportunities: Scholarships in Brazil - Doctorate
FAPESP's process: 22/09606-8 - Understanding the role of shortcuts and distribution shifts in deep learning generalization
Grantee: Alceu Emanuel Bissoto
Support Opportunities: Scholarships abroad - Research Internship - Doctorate
FAPESP's process: 20/09838-0 - BI0S - Brazilian Institute of Data Science
Grantee: João Marcos Travassos Romano
Support Opportunities: Research Grants - Research Centers in Engineering Program