Measuring the Shattering coefficient of Decision Tree models

de Mello, Rodrigo E.; Manapragada, Chaitanya; Bifet, Albert

Author(s):	de Mello, Rodrigo E. ^[1] ; Manapragada, Chaitanya ^[2] ; Bifet, Albert ^[3] Total Authors: 3
Affiliation:	^[1] Av Trabalhador Saocarlense 400, BR-13560970 Sao Carlos, SP - Brazil ^[2] Monash Univ, Wellington Rd, Clayton, Vic 3800 - Australia ^[3] Telecom ParisTech, LTCI, Off C201-2, 46 Rue Barrault, F-75634 Paris 13 - France Total Affiliations: 3
Document type:	Journal article
Source:	EXPERT SYSTEMS WITH APPLICATIONS; v. 137, p. 443-452, DEC 15 2019.
Web of Science Citations:	0
Abstract
In spite of the relevance of Decision Trees (DTs), there is still a disconnect between their theoretical and practical results when selecting models to address specific learning tasks. A particular criterion is provided by the Shattering coefficient, a growth function formulated in the context of Statistical Learning Theory (SLT), which measures the complexity of the algorithm bias as sample sizes increase. In an attempt to establish the basis for a relative theoretical complexity analysis, this paper introduces a method to compute the Shattering coefficient of DT models using recurrence equations. Next, we assess the bias of models provided by DT algorithms while solving practical problems, as well as their overall learning bounds in light of the SLT. As the main contribution, our results help other researchers decide on the most adequate DT models to tackle specific supervised learning tasks. (C) 2019 Elsevier Ltd. All rights reserved. (AU)
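
To illustrate how a Shattering coefficient enters a learning bound, the following is a minimal sketch and not the recurrence-based method of the paper, which this record does not reproduce. It assumes the standard SLT bound P(sup |R_emp(f) - R(f)| > eps) <= 2 N(F, 2n) exp(-n eps^2 / 4) and, as a toy stand-in for a DT model, a one-dimensional threshold classifier h_t(x) = [x > t], which produces n + 1 distinct labelings of n distinct points. The function names `stump_shattering` and `vapnik_bound` are hypothetical.

```python
import math


def stump_shattering(n: int) -> int:
    """Shattering coefficient of the toy class h_t(x) = [x > t].

    For n distinct points on a line, moving the threshold t across the
    points yields exactly n + 1 distinct labelings (an assumption that
    holds for this toy class only, not for general decision trees).
    """
    return n + 1


def vapnik_bound(shatter, n: int, eps: float) -> float:
    """Upper bound 2 * N(F, 2n) * exp(-n * eps^2 / 4), capped at 1.

    `shatter` is any function mapping a sample size to a Shattering
    coefficient; `n` is the sample size and `eps` the tolerated gap
    between empirical and expected risk.
    """
    raw = 2.0 * shatter(2 * n) * math.exp(-n * eps ** 2 / 4.0)
    return min(1.0, raw)


if __name__ == "__main__":
    # The bound is vacuous (equal to 1) for small samples and shrinks
    # as n grows, because the exponential term outpaces the polynomial
    # growth of the Shattering coefficient.
    for n in (10, 1_000, 10_000, 100_000):
        print(n, vapnik_bound(stump_shattering, n, eps=0.1))
```

Swapping `stump_shattering` for a faster-growing function shows why the Shattering coefficient serves as a model selection criterion: the larger it is for a given sample size, the more samples are needed before the bound becomes informative.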

FAPESP's process:	17/16548-6 - Providing theoretical guarantees to the detection of concept drift in data streams
Grantee:	Rodrigo Fernandes de Mello
Support Opportunities:	Scholarships abroad - Research
