Uniform Manifold Approximation and Projection (UMAP)


Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction.

Researchers at the Tutte Institute developed both the mathematical theory and an efficient software implementation. UMAP is already being used in a variety of fields, including single-cell biology, materials science, condensed matter physics, and machine learning.

The algorithm itself is founded on three assumptions about the data:

  1. The data is uniformly distributed on a Riemannian manifold.
  2. The Riemannian metric is locally constant, or can be approximated as such.
  3. The manifold is locally connected.

From these assumptions, it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low-dimensional projection of the data whose fuzzy topological structure is as close as possible to that of the original data.
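To make the idea of a fuzzy topological structure more concrete, here is a minimal sketch in plain Python. It follows the construction described in the UMAP paper as we understand it: each point gets membership weights to its k nearest neighbours, scaled by a per-point local distance, and the directed weights are merged with a fuzzy union. The function names (`fuzzy_graph`, `fuzzy_cross_entropy`), the brute-force neighbour search, and the default parameters are illustrative assumptions, not the API or the optimised code of the released software.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def fuzzy_graph(points, k=3, iterations=64):
    """Return {(i, j): weight} with i < j: a symmetric fuzzy neighbour graph.

    Hypothetical helper for illustration; assumes 2 <= k < len(points).
    """
    n = len(points)
    target = math.log2(k)  # total membership each point's neighbourhood should carry
    directed = {}
    for i in range(n):
        # Brute-force k nearest neighbours of point i, as (distance, index) pairs.
        neighbours = sorted(
            (euclidean(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        rho = neighbours[0][0]  # distance to the nearest neighbour

        def total(sigma):
            # Sum of memberships for a given local scale sigma.
            return sum(math.exp(-max(0.0, d - rho) / sigma) for d, _ in neighbours)

        # Binary search for the sigma at which the memberships sum to the target.
        lo, hi = 0.0, 1.0
        while total(hi) < target:
            hi *= 2.0
        for _ in range(iterations):
            mid = (lo + hi) / 2.0
            if total(mid) < target:
                lo = mid
            else:
                hi = mid
        sigma = hi
        for d, j in neighbours:
            directed[(i, j)] = math.exp(-max(0.0, d - rho) / sigma)

    # Fuzzy union of the two directed memberships: a + b - a * b.
    graph = {}
    for i in range(n):
        for j in range(i + 1, n):
            a = directed.get((i, j), 0.0)
            b = directed.get((j, i), 0.0)
            w = a + b - a * b
            if w > 0.0:
                graph[(i, j)] = w
    return graph


def fuzzy_cross_entropy(high, low, n, eps=1e-12):
    """Cross-entropy-style mismatch between two fuzzy graphs on n points."""
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            # Clamp memberships away from 0 and 1 so the logarithms stay finite.
            p = min(max(high.get((i, j), 0.0), eps), 1.0 - eps)
            q = min(max(low.get((i, j), 0.0), eps), 1.0 - eps)
            total += p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))
    return total


# Toy data: two small, well-separated clusters in the plane.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11), (11, 11)]
graph = fuzzy_graph(points, k=3)
print(sorted(graph.items()))
```

In this sketch, `fuzzy_cross_entropy` only measures how far two such graphs are from each other. An actual embedding procedure would build the low-dimensional graph from candidate coordinates (the published algorithm uses a different smooth kernel for that side) and adjust the coordinates to reduce the mismatch.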

These mathematical foundations help make the algorithm robust and interpretable, and they are being generalized to broader problems in unsupervised learning.

Get it!

The latest release of the software can be found on GitHub under the account of Dr. Leland McInnes. Documentation is also available online, as is a preprint of the paper describing the underlying mathematical foundations.

If you would like to know more, you can contact the Tutte Institute.