H-tSNE: Hierarchical Nonlinear Dimensionality Reduction

Published: Oct. 7, 2020, 1:01 a.m.

Link to bioRxiv paper: http://biorxiv.org/cgi/content/short/2020.10.05.324798v1?rss=1

Authors: VanHorn, K. C., Cobanoglu, M. C. C.

Abstract: Dimensionality reduction (DR) is often integral when analyzing high-dimensional data across scientific, economic, and social networking applications. For data with a high order of complexity, nonlinear approaches are often needed to identify and represent the most important components. We propose a novel DR approach that can incorporate a known underlying hierarchy. Specifically, we extend the widely used t-Distributed Stochastic Neighbor Embedding technique (t-SNE) to include hierarchical information and demonstrate its use with known or unknown class labels. We term this approach "H-tSNE." Such a strategy can aid in discovering and understanding underlying patterns of a dataset that is heavily influenced by parent-child relationships. Without integrating information that is known a priori, we suggest that DR cannot function as effectively. In this regard, we argue for a DR approach that enables the user to incorporate known, relevant relationships even if their representation is weakly expressed in the dataset.

Copyright belongs to the original authors. Visit the link for more information.
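
The abstract does not spell out how H-tSNE folds the hierarchy into t-SNE, so the following is only a rough sketch of one generic way a known parent-child structure could influence an embedding. It is not the authors' formulation. It blends a normalized Euclidean distance with a normalized tree distance between class labels. The function names, the `parent` mapping, the `alpha` weight, and the example labels are all hypothetical and chosen for illustration.

```python
import math


def path_to_root(node, parent):
    """Return [node, its parent, its grandparent, ..., root]."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def tree_distance(a, b, parent):
    """Number of edges on the path between labels a and b in the hierarchy."""
    path_a = path_to_root(a, parent)
    path_b = path_to_root(b, parent)
    steps_from_a = {n: i for i, n in enumerate(path_a)}
    for steps_from_b, n in enumerate(path_b):
        if n in steps_from_a:  # first shared ancestor
            return steps_from_a[n] + steps_from_b
    raise ValueError("labels do not share a common ancestor")


def blended_distances(points, labels, parent, alpha=0.5):
    """Mix feature-space and hierarchy distances (assumed scheme, not H-tSNE).

    alpha = 0 uses only Euclidean distance; alpha = 1 uses only tree distance.
    Both matrices are scaled to [0, 1] before mixing.
    """
    n = len(points)
    feat = [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]
    tree = [[tree_distance(labels[i], labels[j], parent) for j in range(n)]
            for i in range(n)]
    feat_max = max(max(row) for row in feat) or 1.0
    tree_max = max(max(row) for row in tree) or 1.0
    return [[(1 - alpha) * feat[i][j] / feat_max + alpha * tree[i][j] / tree_max
             for j in range(n)] for i in range(n)]


# Hypothetical two-level hierarchy: child label -> parent label.
parent = {"T cell": "immune", "B cell": "immune", "neuron": "neural",
          "immune": "cell", "neural": "cell"}
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
labels = ["T cell", "B cell", "neuron"]
D = blended_distances(points, labels, parent, alpha=0.5)
```

Here the second and third points are equally far from the first in feature space, but the blended matrix places the "B cell" point closer to the "T cell" point than the "neuron" point is, because the two immune labels share a nearer ancestor. A matrix like `D` could then be handed to an off-the-shelf t-SNE that accepts precomputed distances, for example scikit-learn's `TSNE(metric="precomputed", init="random")`.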