Abstract
Clustering and classification methods are routinely used in machine learning. Applying techniques devised for low-dimensional data on a massive scale may yield mediocre results. These scalability issues arise for i.i.d. data, time series and images alike. Classical methods include K-nearest neighbors, K-means, support vector machines, trees and forests, Schur measures,…