In statistics, principal component analysis (PCA) is a technique for simplifying a dataset; more formally, it is a linear transform used to reduce the dimensionality of a dataset while retaining those characteristics of the dataset that contribute most to its variance. Depending on the application, these characteristics may or may not be the 'most important' ones.
PCA is also called the Karhunen-Loève transform or the Hotelling transform. Among linear transforms, PCA is optimal for retaining the subspace with the largest variance. This optimality comes at the price of greater computational cost than, for example, the discrete cosine transform. Unlike other linear transforms, PCA has no fixed set of basis vectors: its basis vectors depend on the dataset.
The first principal component w1 of a dataset x (assumed to have zero mean, i.e. E(x) = 0) can be defined as the unit vector that maximizes the variance of the projection onto it:

w1 = arg max_{||w|| = 1} E[(w^T x)^2]
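This definition can be sketched numerically. A standard result is that the maximizing unit vector is the eigenvector of the covariance matrix of x with the largest eigenvalue; the sketch below (assuming NumPy and a small synthetic, zero-mean dataset) computes w1 that way and checks that it beats a random direction:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic dataset: 500 samples of a 2-D variable with correlated components.
x = rng.normal(size=(500, 2)) @ np.array([[3.0, 1.0], [0.0, 1.0]])
x -= x.mean(axis=0)  # enforce the zero-mean assumption E(x) = 0

# w1 maximizes the variance of the projection w^T x over unit vectors w;
# it is the top eigenvector of the sample covariance matrix.
cov = x.T @ x / len(x)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
w1 = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue

# Sanity check: projecting onto w1 captures at least as much variance
# as projecting onto an arbitrary unit direction.
w_rand = rng.normal(size=2)
w_rand /= np.linalg.norm(w_rand)
assert np.var(x @ w1) >= np.var(x @ w_rand)
```

The variance of the projection onto w1 equals the largest eigenvalue of the covariance matrix, which is why the eigendecomposition solves the maximization directly.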
Closely related (in many fields essentially the same technique) is the analysis of empirical orthogonal functions (EOF).
Another method of dimension reduction is a self-organizing map.
When PCA is used in pattern recognition, linear discriminant analysis is often a useful alternative, since it takes class separability into account, which PCA does not.
See also: