Toward a Quantitative Survey of Dimension Reduction Techniques

Espadoto, Mateus; Martins, Rafael M.; Kerren, Andreas; Hirata, Nina S. T.; Telea, Alexandru C.

Author(s):	Espadoto, Mateus ^[1] ; Martins, Rafael M. ^[2] ; Kerren, Andreas ^[2] ; Hirata, Nina S. T. ^[1] ; Telea, Alexandru C. ^[3] Total Authors: 5
Affiliation:	^[1] Univ Sao Paulo, Inst Math & Stat, BR-05508090 Sao Paulo - Brazil ^[2] Linnaeus Univ, Dept Comp Sci & Media Technol, S-35195 Vaxjo - Sweden ^[3] Univ Utrecht, Dept Informat & Comp Sci, NL-3584 CS Utrecht - Netherlands Total Affiliations: 3
Document type:	Journal article
Source:	IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS; v. 27, n. 3, p. 2153-2173, MAR 1 2021.
Web of Science Citations:	3
Abstract
Dimensionality reduction methods, also known as projections, are frequently used in multidimensional data exploration in machine learning, data science, and information visualization. Dozens of such techniques have been proposed, aiming to address a wide set of requirements, such as the ability to show high-dimensional data structure, distance or neighborhood preservation, computational scalability, stability to data noise and/or outliers, and practical ease of use. However, it is far from clear to practitioners how to choose the best technique for a given use context. We present a survey of a wide body of projection techniques that helps answer this question. To this end, we characterize the input data space, the projection techniques, and the quality of projections using several quantitative metrics. We sample these three spaces according to these metrics, aiming at good coverage with bounded effort. We describe our measurements and outline the observed dependencies among the measured variables. Based on these results, we draw several conclusions that help compare projection techniques, explain their results for different types of data, and ultimately help practitioners choose a projection for a given context. Our methodology, datasets, projection implementations, metrics, visualizations, and results are publicly available, so interested stakeholders can examine and/or extend this benchmark. (AU)

FAPESP's process:	17/25835-9 - Understanding images and deep learning models
Grantee:	Nina Sumiko Tomita Hirata
Support Opportunities:	Research Grants - Research Partnership for Technological Innovation - PITE
