aims to find the projection that best separates the classes in the data.
used for supervised data (requires class labels)
works by first calculating the mean and covariance matrix for each class. From these it computes the between-class scatter matrix and the within-class scatter matrix; the goal is to find the projection that maximizes the ratio of between-class scatter to within-class scatter
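The scatter-matrix computation above can be sketched with numpy (the helper name and the eigen-solve via `Sw^{-1} Sb` are a common formulation, assumed here rather than taken from these notes):

```python
import numpy as np

def lda_directions(X, y):
    """Sketch: return projection directions maximizing between-class
    over within-class scatter (columns sorted by eigenvalue)."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    n = X.shape[1]
    Sw = np.zeros((n, n))            # within-class scatter
    Sb = np.zeros((n, n))            # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        d = (mc - mu).reshape(-1, 1)
        Sb += Xc.shape[0] * (d @ d.T)
    # directions = eigenvectors of Sw^{-1} Sb with the largest eigenvalues
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1]
    return eigvecs[:, order].real
```

Projecting the data onto the leading column separates the class means far more than the within-class spread.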
a projection is the point on the k-dimensional surface reached by dropping an orthogonal (perpendicular) line from the original point
e.g. a 2-D point when projected onto a 1-D line is the point on the line at the shortest distance from the original point
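A minimal numpy sketch of that 2-D-onto-1-D example (the point and line direction are made-up numbers):

```python
import numpy as np

x = np.array([3.0, 4.0])    # original 2-D point
u = np.array([1.0, 0.0])    # unit vector along the 1-D line
z = u @ x                   # 1-D coordinate of the projection
proj = z * u                # point on the line closest to x
# the residual x - proj is orthogonal to the line,
# which is exactly why proj is the nearest point on it
```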
Algorithm
Pre-processing
- mean normalization: ensure every feature has zero mean
- optionally feature-scaling i.e. make all features comparable magnitude
- e.g. size of house in sq. feet and number of bedrooms are on different scales
- usually done as x(j) = (x(j) - Mu(j))/s(j) where x(j) is feature j of x, Mu(j) is its average, and s(j) is some measure of spread (max - min, or usually the standard deviation)
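The pre-processing steps above as a numpy sketch, on made-up housing data (size in sq. feet and number of bedrooms, i.e. features on very different scales):

```python
import numpy as np

# rows = examples, columns = features (sq. feet, bedrooms); values are made up
X = np.array([[2104.0, 3.0], [1600.0, 3.0], [2400.0, 4.0], [1416.0, 2.0]])
mu = X.mean(axis=0)        # Mu(j): per-feature mean
s = X.std(axis=0)          # s(j): per-feature standard deviation
X_norm = (X - mu) / s      # zero mean, unit variance per feature
```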
Compute the covariance matrix Sigma = (1/m) * X' * X (X holding the m mean-normalized examples as rows)
Compute U the eigenvectors of Sigma as: [U, S, V] = svd(Sigma)
Compute Ureduced as the first k columns of U: U(:, 1:k)
Compute z as Ureduced' * x
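The four compute steps above in numpy, on a small made-up dataset that is already roughly mean-normalized:

```python
import numpy as np

# m=4 examples, n=2 features, already ~zero mean
X = np.array([[2.0, 2.1], [-1.0, -0.9], [3.0, 2.9], [-4.0, -4.1]])
m, k = X.shape[0], 1
Sigma = (X.T @ X) / m              # covariance matrix (n x n)
U, S, Vt = np.linalg.svd(Sigma)    # columns of U are eigenvectors of Sigma
Ureduced = U[:, :k]                # first k columns of U
Z = X @ Ureduced                   # row i holds z(i) = Ureduced' * x(i)
```

Note the vectorized form: computing `X @ Ureduced` applies `Ureduced' * x` to every example at once.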
Reconstruction goes from the compressed k dimensions back to the original n dimensions: x_approx = Ureduced * z
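A minimal reconstruction sketch (Ureduced here is an assumed unit column, not computed from data; since x happens to lie on the line it spans, the reconstruction is exact in this example):

```python
import numpy as np

Ureduced = np.array([[0.6], [0.8]])   # assumed n=2, k=1 projection matrix
x = np.array([3.0, 4.0])
z = Ureduced.T @ x                    # compress: n -> k dimensions
x_approx = Ureduced @ z               # reconstruct: k -> n dimensions
```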
Choosing right value of k for n dimensions: find smallest k that retains 99% of variance
i.e. after reducing to k dimensions, less than 1% of the variance is lost
the svd function returns a diagonal matrix S as one of its return values, which can be used to efficiently calculate the variance retained for each candidate k: pick the smallest k such that (sum of the first k diagonal entries of S) / (sum of all n diagonal entries) >= 0.99
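That selection rule as a numpy sketch (the singular values below are made up; in practice S comes from `svd(Sigma)`):

```python
import numpy as np

S = np.array([9.0, 0.95, 0.04, 0.01])      # assumed diagonal of S, n = 4
retained = np.cumsum(S) / S.sum()          # variance retained for k = 1..n
k = int(np.argmax(retained >= 0.99)) + 1   # smallest k retaining >= 99%
```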
Mapping from x(i) -> z(i), where x is n-dimensional and z is the reduced k-dimensional representation, is z(i) = Ureduced' * x(i) and must be derived by running PCA on the training set only