Uniform Manifold Approximation and Projection (UMAP)

Overview

Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction.

Researchers at the Tutte Institute developed both the mathematical theory and an efficient software implementation. UMAP is already being used in a variety of fields, including:

Single-cell biology: evaluation of UMAP as an alternative to t-SNE for single-cell data

Materials science

Condensed matter physics: Interpretable machine learning for inferring the phase boundaries in a nonequilibrium system

Machine learning: Gaussian mixture models with Wasserstein distance

The algorithm itself is founded on three assumptions about the data:

The data is uniformly distributed on a Riemannian manifold.
The Riemannian metric is locally constant, or can be approximated as such.
The manifold is locally connected.
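
As an illustration of how these assumptions can become a computation, the sketch below normalises one point's nearest-neighbour distances into fuzzy membership weights. It assumes the per-point scaling described in the UMAP paper: the distance to the closest neighbour (rho) is subtracted so that every point is joined to at least one neighbour (local connectivity), and a scale (sigma) is chosen so that the weights sum to log2(k), which makes the data look uniformly distributed under a locally constant metric. The function name local_memberships and the parameter n_iter are hypothetical, and this is not the library's own implementation.

```python
import math


def local_memberships(knn_distances, n_iter=64):
    """Convert one point's k-nearest-neighbour distances into fuzzy weights.

    Illustrative sketch only: names and defaults are assumptions.
    """
    k = len(knn_distances)
    target = math.log2(k)

    # rho: smallest non-zero neighbour distance (local connectivity).
    positive = [d for d in knn_distances if d > 0.0]
    rho = min(positive) if positive else 0.0

    def total(sigma):
        # Sum of membership weights for a candidate scale sigma.
        return sum(math.exp(-max(0.0, d - rho) / sigma) for d in knn_distances)

    # Bracket sigma, then bisect; total(sigma) grows with sigma.
    lo, hi = 1e-12, 1.0
    while total(hi) < target:
        hi *= 2.0
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        if total(mid) < target:
            lo = mid
        else:
            hi = mid
    sigma = (lo + hi) / 2.0

    weights = [math.exp(-max(0.0, d - rho) / sigma) for d in knn_distances]
    return rho, sigma, weights


rho, sigma, weights = local_memberships([0.5, 1.0, 1.5, 2.0, 3.0])
print(rho, sigma, weights)
```

The nearest neighbour always receives weight 1, and more distant neighbours receive smaller weights on a scale that adapts to the local density around each point.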

From these assumptions, it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.
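
The idea of comparing two fuzzy topological structures can be sketched as follows. The example assumes, following the UMAP paper, that directed membership weights are symmetrised with the fuzzy union a + b - a*b and that "closeness" is measured by the fuzzy set cross entropy over edges. The toy weights and the function names fuzzy_union and fuzzy_cross_entropy are hypothetical and are not part of the library's API.

```python
import math


def fuzzy_union(directed):
    """Symmetrise directed weights {(i, j): w} using a + b - a*b."""
    sym = {}
    for (i, j), a in directed.items():
        b = directed.get((j, i), 0.0)
        sym[(min(i, j), max(i, j))] = a + b - a * b
    return sym


def fuzzy_cross_entropy(high, low, eps=1e-12):
    """Cross entropy between two fuzzy edge sets keyed by (i, j) with i < j."""
    total = 0.0
    for edge in set(high) | set(low):
        # Clamp memberships away from 0 and 1 to keep the logarithms finite.
        v = min(max(high.get(edge, 0.0), eps), 1.0 - eps)
        w = min(max(low.get(edge, 0.0), eps), 1.0 - eps)
        total += v * math.log(v / w) + (1.0 - v) * math.log((1.0 - v) / (1.0 - w))
    return total


# Toy directed memberships for three points (assumed values).
directed = {(0, 1): 1.0, (1, 0): 0.5, (1, 2): 0.4}
high = fuzzy_union(directed)

# A candidate low dimensional structure with different edge memberships.
low = {(0, 1): 0.2, (1, 2): 0.9}
print(fuzzy_cross_entropy(high, high), fuzzy_cross_entropy(high, low))
```

The cross entropy is zero when the two structures agree and grows as they diverge, so an optimiser can move the low dimensional points to reduce it.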

These mathematical foundations are intended to make the algorithm robust and interpretable, and they are being generalized to broader problems in unsupervised learning.