Dimensionality reduction: theoretical perspective on practical measures

Part of Advances in Neural Information Processing Systems 32 (NeurIPS 2019)


Authors

Yair Bartal, Nova Fandina, Ofer Neiman

Abstract

Dimensionality reduction plays a central role in real-world applications of Machine Learning, among many other fields. In particular, metric dimensionality reduction, where data from a general metric is mapped into a low-dimensional space, is often used as a first step before applying machine learning algorithms. In almost all of these applications, the quality of the embedding is measured by various average-case criteria. Metric dimensionality reduction has also been studied in Mathematics and Theoretical Computer Science (TCS), within the extremely fruitful and influential field of metric embedding. Yet the vast majority of theoretical research has been devoted to analyzing the worst-case behavior of embeddings, and therefore has little relevance to practical settings. The goal of this paper is to bridge the gap between the theoretical and practical viewpoints of metric dimensionality reduction, laying the foundation for a theoretical study of more practically oriented analysis. This paper can be viewed as providing a comprehensive theoretical framework addressing a line of research initiated by VL [NeurIPS'18], who set the goal of analyzing different distortion measurement criteria through the lens of Machine Learning applicability, from both theoretical and practical perspectives. We complement their work by considering some important and widely used average-case criteria, some of which originated within the well-known Multi-Dimensional Scaling framework. Although these criteria are often studied in practice, no theoretical study has thus far attempted to provide a rigorous analysis of them. In this paper we provide the first analysis of these criteria, as well as of the new distortion measure developed by [VL18], which was designed to possess properties desirable for Machine Learning. Moreover, we show that all measures considered can be adapted to possess similar qualities. The main consequences of our work are nearly tight bounds on the absolute values of all distortion criteria, as well as the first approximation algorithms with provable guarantees.
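To make the contrast between worst-case and average-case criteria concrete, the following is a minimal sketch, not taken from the paper. It compares the worst-case distortion of an embedding with one common variant of the stress criterion from Multi-Dimensional Scaling (here normalized by the original distances, which is an assumption; other normalizations exist). The function names, the Gaussian random projection, and the data sizes are all illustrative choices.

```python
import numpy as np


def pairwise_distances(points):
    """Return the Euclidean distances over all unordered pairs of rows."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    upper = np.triu_indices(len(points), k=1)  # each pair counted once
    return dists[upper]


def worst_case_distortion(d_orig, d_emb):
    """Largest expansion times largest contraction over all pairs.

    Assumes all distances are strictly positive (no duplicate points).
    """
    ratios = d_emb / d_orig
    return ratios.max() / ratios.min()


def normalized_stress(d_orig, d_emb):
    """One stress variant: root of squared error over squared original distances."""
    return np.sqrt(((d_emb - d_orig) ** 2).sum() / (d_orig ** 2).sum())


# Illustrative data: 100 Gaussian points in 50 dimensions, projected to 10.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
k = 10
G = rng.normal(size=(50, k)) / np.sqrt(k)  # scaled Gaussian random projection
Y = X @ G

d_orig = pairwise_distances(X)
d_emb = pairwise_distances(Y)

print("worst-case distortion:", worst_case_distortion(d_orig, d_emb))
print("normalized stress:    ", normalized_stress(d_orig, d_emb))
```

The worst-case value is determined by the two most extreme pairs, whereas the stress value aggregates the error over all pairs, which is why the two quantities can behave very differently on the same embedding.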