Abstract
Data collected directly from storage systems often contain high levels of noise arising from internal and external factors. When machine learning techniques induce classifiers from such noisy data, predictive accuracy may drop, and both the complexity of the induced hypothesis and its induction time may increase. This paper aims to investigate two research directions regarding thi…