From spreadsheets to sugar content modeling: A data mining approach

Gravina de Oliveira, Monique Pires; Bocca, Felipe Ferreira; Antunes Rodrigues, Luiz Henrique

Author(s):	Gravina de Oliveira, Monique Pires; Bocca, Felipe Ferreira; Antunes Rodrigues, Luiz Henrique
Total Authors:	3
Document type:	Journal article
Source:	COMPUTERS AND ELECTRONICS IN AGRICULTURE; v. 132, p. 14-20, JAN 2017.
Web of Science Citations:	6
Abstract
Sugarcane mills need sugar content estimates in advance to establish their commercial strategy. To obtain these estimates, mills rely on historical averages or maturation curves. Crop models have also been developed to provide those estimates. Leveraging mill data about fields and sugar content at harvest, we developed empirical models using different data mining techniques along with the RReliefF algorithm for feature selection. The best model was attained with Random Forest with features selected by RReliefF, having a mean absolute error of 2.02 kg Mg⁻¹. This model outperformed Support Vector Regression and Regression Trees with and without feature selection. Models were also evaluated using Regression Error Characteristic curves, which showed that the best model was able to predict 90% of the observations within an error tolerance of 5.40 kg Mg⁻¹. © 2016 Elsevier B.V. All rights reserved.
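
The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names and the five observed/predicted values are hypothetical, made up only to show how a mean absolute error and a Regression Error Characteristic (REC) curve, taken here over absolute errors, are computed.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute deviation between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rec_curve(observed, predicted, tolerances):
    """For each tolerance, the fraction of observations whose absolute
    error is less than or equal to that tolerance."""
    errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return [sum(e <= t for e in errors) / len(errors) for t in tolerances]


def tolerance_for_coverage(observed, predicted, coverage=0.90):
    """Smallest absolute-error tolerance that covers at least `coverage`
    of the observations (the kind of figure read off a REC curve)."""
    errors = sorted(abs(o - p) for o, p in zip(observed, predicted))
    needed = max(math.ceil(coverage * len(errors)), 1)  # observations to cover
    return errors[needed - 1]


# Hypothetical sugar content values in kg per Mg of cane (not from the paper).
observed = [140.0, 135.0, 150.0, 128.0, 145.0]
predicted = [142.0, 134.0, 147.0, 128.0, 150.0]

mae = mean_absolute_error(observed, predicted)
curve = rec_curve(observed, predicted, tolerances=[1.0, 3.0, 5.0])
tol90 = tolerance_for_coverage(observed, predicted, coverage=0.90)
```

With these made-up values the absolute errors are 2, 1, 3, 0 and 5, so the mean absolute error is 2.2, and the tolerance needed to cover 90% of the observations is 5.0. The abstract's figures of 2.02 and 5.40 kg Mg⁻¹ are the corresponding quantities for the authors' best model.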

FAPESP's process:	12/50049-3 - Data mining techniques applied to the analysis and prediction of sugarcane yield (FAPESP-ETH Bioenergia)
Grantee:	Luiz Henrique Antunes Rodrigues
Support Opportunities:	Program for Research on Bioenergy (BIOEN) - Research Partnership for Technological Innovation (PITE)
