Predictive and interpretable machine learning for COVID-19 resurgences: the role of SARS-CoV-2 variants in the post-pandemic era

Author(s):
Ferreira, Rafaella S. ; Colnago, Marilaine ; Casaca, Wallace
Total Authors: 3
Document type: Journal article
Source: BMC INFECTIOUS DISEASES; v. 25, n. 1, p. 21-pg., 2025-12-29.
Abstract

Background: Traditional COVID-19 forecasting often misses the rapid dynamics of viral competition, limiting timely public health responses. This study demonstrates the value of incorporating SARS-CoV-2 variant data into recurrent neural networks, using an interpretable, data-driven approach to improve accuracy in the current pandemic phase.

Methods: We validated our approach on post-pandemic data (2022-2025) from New York City and the United Kingdom, integrating epidemiological time series with genomic surveillance of variants. We implemented and compared several neural network structures, with LSTM achieving the best performance. To assess the contribution of variant-specific data, we compared models with and without variant inputs. To interpret model decisions, we applied XAI techniques to quantify variant influence on predictions.

Results: Incorporating variant data markedly improved forecasting accuracy across all horizons. In New York City, MAPE dropped from 32.15% to 7.35% during periods of rapid variant change, while in the UK it fell from 35.62% to 7.73%. XAI analyses revealed the dominant role of specific variants and captured their competitive displacement dynamics, with model explanations closely matching observed epidemiological trends.

Conclusion: This study introduces a variant-aware methodology that improves COVID-19 prediction in the current endemic phase. The main contributions are: (i) ablation studies demonstrating the value of incorporating variant data to model case resurgences and declines; (ii) interpretable insights into variant-driven dynamics via XAI; and (iii) validation across multiple geographical scales. Our approach establishes a scalable paradigm for genomic-informed epidemic forecasting, adaptable to evolving respiratory viruses. (AU)
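The accuracy figures in the abstract are reported as MAPE (mean absolute percentage error). As a minimal sketch of how that metric and the reported improvement can be read, the snippet below computes MAPE for a toy series and the relative reduction between the with-variant and without-variant scores quoted above. The function names `mape` and `relative_reduction` and the toy case counts are illustrative assumptions; this is not the authors' code or data.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Hypothetical helper for illustration; assumes no actual value is zero.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def relative_reduction(before: float, after: float) -> float:
    """Percentage by which an error score fell, relative to its starting value."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Toy case counts (invented for illustration only).
    observed = [100.0, 200.0]
    forecast = [110.0, 180.0]
    print(f"toy MAPE: {mape(observed, forecast):.1f}%")

    # MAPE values quoted in the abstract (without vs. with variant inputs).
    print(f"NYC relative reduction: {relative_reduction(32.15, 7.35):.1f}%")
    print(f"UK relative reduction:  {relative_reduction(35.62, 7.73):.1f}%")
```

Read this way, the quoted scores correspond to a relative error reduction of roughly 77% for New York City and 78% for the UK when variant inputs are included.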

FAPESP's process: 24/04718-8 - Enhancing Efficiency in Sugarcane Industry Forecasting through Integrated Data Analysis and Artificial Intelligence Techniques
Grantee: Marilaine Colnago
Support Opportunities: Regular Research Grants
FAPESP's process: 24/04492-0 - Integrated Deep Learning Solutions for Image Segmentation and Deforestation Detection
Grantee: Wallace Correa de Oliveira Casaca
Support Opportunities: Regular Research Grants
FAPESP's process: 23/14427-8 - Data Science for Smart Industry (CDII)
Grantee: José Alberto Cuminato
Support Opportunities: Research Grants - Applied Research Centers Program