(Reference retrieved automatically from Web of Science through information on FAPESP grant and its corresponding number as mentioned in the publication by the authors.)

Dynamic ensemble mechanisms to improve particulate matter forecasting

Author(s):
Bueno, Andres [1] ; Coelho, Guilherme Palermo [1] ; Bertini Junior, Joao Roberto [1]
Total Authors: 3
Affiliation:
[1] Univ Campinas UNICAMP, Sch Technol, Rua Paschoal Marmo 1888, BR-13484332 Limeira, SP - Brazil
Total Affiliations: 1
Document type: Journal article
Source: APPLIED SOFT COMPUTING; v. 91, JUN 2020.
Web of Science Citations: 0
Abstract

Respirable solid particles and liquid droplets suspended in the air, known as particulate matter (PM), may have a significant impact on human health, urban infrastructure, and natural and agricultural systems. The adverse effects of PM have raised public concern, especially in heavily polluted areas of the world, making it imperative to develop strategies that keep the concentration levels of these pollutants below harmful thresholds. Traditional machine learning approaches have been used to forecast PM concentrations. However, the composition of PM in the atmosphere may involve complex chemical processes and be influenced by many meteorological parameters. Thus, the underlying distributions of continuously collected PM data may evolve over time. This phenomenon, known as concept drift, poses an important challenge for traditional machine learning techniques, since they lack mechanisms to handle changes in the data distribution at run time, which limits their forecasting capabilities. The overall goal of this work is to evaluate whether incorporating mechanisms to deal with concept drift, together with online sequential learning approaches, can improve the accuracy of PM forecasting. To do so, new mechanisms that enable online dynamic ensembles to handle and retain knowledge from different concepts for longer were proposed and adapted to the EOS and DOER algorithms, resulting in three approaches: EOS-rank, EOS-D and DOER-rank. These ensemble strategies, which were based on Online Sequential Extreme Learning Machines (OS-ELM), were compared with five algorithms from the literature. To evaluate their performance, real-world and artificial datasets with known dynamic behaviors, as well as PM concentration datasets from different cities of the State of Sao Paulo, Brazil, were used in the experiments.
The results showed that the proposed approaches can handle dynamic environments with different rates of drift and that EOS-rank was capable of outperforming most approaches from the literature in scenarios with higher rates of drift. The results also indicate that PM data distributions evolve slowly over time and, consequently, the proposed mechanisms that retain information about past concepts and slowly adapt the ensemble tend to produce better results when applied to forecasting PM concentrations. (C) 2020 Published by Elsevier B.V. (AU)
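
The abstract names OS-ELM as the base learner of the ensembles. As a rough illustration only, the sketch below shows one common way to write a regularized OS-ELM that learns one sample at a time, together with a test-then-train (prequential) error loop of the kind often used on data streams. It is not the authors' implementation and does not cover EOS, DOER or the ranking mechanisms. The names `OSELM`, `prequential_mae`, `reg` and `seed` are hypothetical, and two simplifying assumptions are made: updates use a single sample rather than a chunk, and the initial batch phase is replaced by a ridge-style start (P = I / reg, output weights at zero).

```python
import math
import random


class OSELM:
    """Hypothetical minimal OS-ELM: fixed random hidden layer, output
    weights updated by recursive least squares, one sample at a time."""

    def __init__(self, n_inputs, n_hidden, reg=1e-2, seed=0):
        rng = random.Random(seed)
        # Hidden weights and biases are drawn once and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # Assumption: P starts as I / reg, so it tracks (H^T H + reg*I)^-1.
        self.p = [[(1.0 / reg if i == j else 0.0) for j in range(n_hidden)]
                  for i in range(n_hidden)]
        self.beta = [0.0] * n_hidden

    def hidden(self, x):
        # Sigmoid activation of each hidden node.
        return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(row, x)) + bi)))
                for row, bi in zip(self.w, self.b)]

    def predict(self, x):
        return sum(hj * bj for hj, bj in zip(self.hidden(x), self.beta))

    def update(self, x, y):
        h = self.hidden(x)
        # ph = P h; P is symmetric, so h^T P equals ph transposed.
        ph = [sum(pij * hj for pij, hj in zip(row, h)) for row in self.p]
        denom = 1.0 + sum(hi * phi for hi, phi in zip(h, ph))
        n = len(h)
        # Rank-one (Sherman-Morrison) update of P.
        for i in range(n):
            for j in range(n):
                self.p[i][j] -= ph[i] * ph[j] / denom
        # Prediction error with the old weights, then the gain step.
        err = y - sum(hj * bj for hj, bj in zip(h, self.beta))
        for i in range(n):
            self.beta[i] += ph[i] / denom * err


def prequential_mae(model, stream):
    """Test-then-train: predict each sample before learning from it."""
    total, count = 0.0, 0
    for x, y in stream:
        total += abs(y - model.predict(x))
        model.update(x, y)
        count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stream whose target function changes abruptly at t = 300.
    stream = []
    for t in range(600):
        x = [rng.uniform(-1, 1)]
        y = math.sin(3 * x[0]) if t < 300 else -math.sin(3 * x[0])
        stream.append((x, y))
    print("prequential MAE:", prequential_mae(OSELM(1, 10), stream))
```

Because this single model never forgets old samples, it can only adapt gradually after the synthetic change at t = 300. That limitation is the kind of behavior that motivates drift-aware ensembles such as those described in the abstract.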

FAPESP's process: 17/00219-3 - Classification in data streams: dealing with anomalies, novelties and scarcity of labeled data
Grantee: João Roberto Bertini Junior
Support Opportunities: Regular Research Grants