Unsupervised learning is a method of machine learning in which a model is fit to observations. It is distinguished from supervised learning by the absence of a priori outputs: only a data set of input objects is gathered. Unsupervised learning then typically treats the input objects as a set of random variables and builds a joint density model for the data set.
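As a minimal sketch of fitting a joint density model, the example below uses hypothetical two-dimensional data and a multivariate Gaussian, whose maximum-likelihood fit is simply the sample mean and sample covariance (the data, variable names, and choice of Gaussian are illustrative assumptions, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data set: 500 observations of two correlated variables.
X = rng.multivariate_normal(mean=[0.0, 1.0],
                            cov=[[1.0, 0.6], [0.6, 1.0]],
                            size=500)

# Fit a joint density model by maximum likelihood: for a Gaussian,
# the MLE is just the sample mean and sample covariance.
mu = X.mean(axis=0)
sigma = np.cov(X, rowvar=False)

def joint_density(x, mu, sigma):
    """Evaluate the fitted multivariate Gaussian density at point x."""
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(sigma)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma))
    return np.exp(-0.5 * diff @ inv @ diff) / norm
```

Once fitted, `joint_density` assigns a probability density to any input configuration, which is the sense in which the model describes the data set as a whole.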

Unsupervised learning can be used in conjunction with Bayesian inference to produce conditional probabilities (i.e., supervised learning) for any of the random variables given the others.
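A toy illustration of this conditioning step, assuming a hypothetical joint distribution over two binary variables (the table values are made up for the example): once the joint is known, Bayes' rule turns it into a conditional predictor, which is exactly what a supervised model provides.

```python
import numpy as np

# Hypothetical joint distribution over binary X (rows) and Y (columns),
# as might be estimated by an unsupervised density model.
joint = np.array([[0.30, 0.10],
                  [0.15, 0.45]])

def p_y_given_x(x):
    # P(Y | X = x) = P(X = x, Y) / P(X = x)
    return joint[x] / joint[x].sum()

print(p_y_given_x(1))  # → [0.25 0.75]
```

The same joint model yields P(X | Y) just as easily, by normalizing over the other axis; no variable is privileged as "the output" until conditioning time.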

Unsupervised learning is also useful for data compression: fundamentally, all data compression algorithms either explicitly or implicitly rely on a probability distribution over a set of inputs.
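One concrete instance of this link is Huffman coding, which builds its codebook directly from an estimated symbol distribution. The sketch below (input string and helper name are illustrative assumptions) estimates the distribution empirically and assigns shorter codewords to more probable symbols:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code from the empirical symbol distribution."""
    freq = Counter(text)
    # Heap items: (weight, tiebreaker, {symbol: code_so_far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two least probable subtrees, extending their codes.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

code = huffman_code("aaaaaaabbbccd")
```

Here the "model" is just a table of symbol frequencies; a better probability model of the inputs directly yields a shorter expected code length.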

Another form of unsupervised learning is clustering, which is not always probabilistic. See also formal concept analysis.
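The k-means algorithm is a standard example of such non-probabilistic clustering: it assigns each point to its nearest centroid and never produces a density. A minimal sketch on hypothetical data (the blobs, seed, and parameters are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: two well-separated blobs in the plane.
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])

def kmeans(X, k, iters=20):
    """Plain k-means: alternate nearest-centroid assignment and update."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

labels, centers = kmeans(X, k=2)
```

The output is a hard partition of the data, in contrast with the density models above, which assign every configuration a probability.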
