Methods and Theory: Unsupervised Joint Alignment and Clustering


Joint alignment of a collection of functions is the process of independently transforming the functions so that they appear more similar to each other. Here, functions can be binary, grayscale or color images, 1D or multidimensional curves, or even 3D scans. The alignment process is helpful in many scenarios, ranging from a pre-processing step that cleans up a data set to learning the set of transformations that generate a data set. Typically, such unsupervised alignment algorithms either fail when presented with complex data sets arising from multiple modalities, or make restrictive assumptions about the form of the functions or transformations, which limits their generality.
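To make the idea of joint alignment concrete, here is a minimal sketch for 1D curves. It is not the model described on this page: it assumes the only allowed transformation is a circular shift, and it uses a simple alternating scheme (estimate a mean template, then re-shift each curve toward it). The function names `shift`, `sq_dist`, `mean_curve` and `joint_align` are hypothetical and chosen for illustration only.

```python
def shift(curve, k):
    """Circularly shift a list k positions to the right."""
    k %= len(curve)
    return curve[-k:] + curve[:-k] if k else list(curve)


def sq_dist(a, b):
    """Sum of squared differences between two equal-length curves."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_curve(curves):
    """Pointwise mean of a list of equal-length curves."""
    n = len(curves)
    return [sum(vals) / n for vals in zip(*curves)]


def joint_align(curves, n_iters=10):
    """Alternate between estimating a template and re-shifting each curve.

    Assumption: every curve is transformed independently by a circular
    shift, and similarity is measured by squared distance to the mean.
    """
    shifts = [0] * len(curves)
    for _ in range(n_iters):
        aligned = [shift(c, s) for c, s in zip(curves, shifts)]
        template = mean_curve(aligned)
        # For each curve, pick the shift that brings it closest to the template.
        new_shifts = [
            min(range(len(c)), key=lambda k: sq_dist(shift(c, k), template))
            for c in curves
        ]
        if new_shifts == shifts:  # no curve moved, so the scheme has converged
            break
        shifts = new_shifts
    return [shift(c, s) for c, s in zip(curves, shifts)], shifts


# Three copies of the same bump, each displaced by a different amount.
base = [0, 0, 1, 3, 1, 0, 0, 0]
curves = [shift(base, k) for k in (0, 1, 2)]
aligned, shifts = joint_align(curves)
```

On this toy input the three bumps end up superimposed. A single shared template like this one is exactly what breaks down on multi-modal data, which is the motivation for combining alignment with clustering.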

We developed a nonparametric Bayesian model that can automatically align and cluster a data set. The clustering component can explicitly handle the multi-modality of complex data sets. Our model learns the number of clusters in a data-driven fashion and is applicable to a wide range of function types (e.g. images and curves) and any transformation function.
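As an illustration of how a nonparametric Bayesian prior can leave the number of clusters open, the sketch below samples cluster assignments from a Chinese restaurant process. This is a standard prior of that family, used here as an assumption for illustration; this page does not state the exact prior in the model, and a full model would combine such a prior with a data likelihood and with the alignment transformations. The function name `crp_assignments` and the concentration parameter `alpha` are hypothetical.

```python
import random


def crp_assignments(n_items, alpha, seed=0):
    """Sample cluster labels from a Chinese restaurant process prior.

    Each item joins an existing cluster with probability proportional to
    that cluster's current size, or opens a new cluster with probability
    proportional to alpha (alpha must be positive).
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of items currently in cluster k
    labels = []
    for _ in range(n_items):
        weights = counts + [alpha]  # the last slot stands for "new cluster"
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1  # join an existing cluster
        labels.append(k)
    return labels, counts


labels, counts = crp_assignments(n_items=100, alpha=1.0)
num_clusters = len(counts)  # not fixed in advance; it varies with alpha and seed
```

The number of clusters is an outcome of sampling rather than an input, which is the property that lets a model of this kind choose it in a data-driven way.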

Sample Results

Given 100 unlabeled images (top) and no other information, our model chooses to represent the data with two clusters, aligns the images and clusters them as shown (bottom). On this data set, our model's clustering accuracy is 94%, compared to 54% for K-means with two clusters.

Input: 100 unaligned, unlabeled images

Output: 2 clusters, with each image aligned to its cluster

In our paper, we also present state-of-the-art results on a full 10-digit data set, where 12 clusters were discovered, resulting in a clustering accuracy of 87%.
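Reporting a clustering accuracy requires some rule for relating discovered clusters to ground-truth classes, especially when the number of clusters (12) differs from the number of classes (10). This page does not say which rule was used. The sketch below shows one common convention, stated here as an assumption: each cluster is assigned the most frequent true label among its members, and accuracy is the fraction of items that match their cluster's label. The function name `clustering_accuracy` is hypothetical.

```python
from collections import Counter, defaultdict


def clustering_accuracy(true_labels, cluster_ids):
    """Majority-vote clustering accuracy.

    Assumption: each cluster is labeled with its most common true class,
    so several clusters may map to the same class.
    """
    members = defaultdict(list)
    for y, c in zip(true_labels, cluster_ids):
        members[c].append(y)
    # Count, per cluster, how many members carry the majority label.
    correct = sum(Counter(ys).most_common(1)[0][1] for ys in members.values())
    return correct / len(true_labels)


# Toy example: cluster "a" is pure, cluster "b" contains one stray item.
truth = [0, 0, 0, 1, 1, 1]
found = ["a", "a", "b", "b", "b", "b"]
acc = clustering_accuracy(truth, found)  # 5 of 6 items match their cluster's label
```

Under this convention, splitting one class across two clusters is not penalized, which is why it is usually reported alongside the number of clusters found.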

Our model is not limited to affine transformations, or even to images. Consider, for example, the following data set of EKG heart data (top). Each curve represents a normal or abnormal heart beat, although visually the data set does not separate into just two groups. In this case, our model discovers 5 appropriate clusters and aligns each curve to its cluster (bottom).

Input: EKG curves representing normal and abnormal heart beats

Output: 5 clusters, with each curve aligned to its cluster


Graduate Students


  • Marwan Mattar, Allen Hanson, and Erik Learned-Miller
    Unsupervised Joint Alignment and Clustering using Bayesian Nonparametrics.
    Conference on Uncertainty in Artificial Intelligence (UAI), 2012.