PCA is a good starting point for complex data. It models a linear subspace of the data that captures the greatest variability. It does this by examining the data's covariance structure: the eigenvectors of the covariance matrix give the directions that best describe the samples.
The first step is to find the mean of the data, then search for the direction with the greatest variance. This direction is the first principal component, so it is added to a list. The next principal component is the orthogonal direction with the next-highest variance, and so on.
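The steps above can be sketched with NumPy: center the data, build the covariance matrix, and take its eigenvectors sorted by eigenvalue. This is a minimal sketch, not a production implementation; the function name `pca` and the example data are illustrative only.

```python
import numpy as np

def pca(X, n_components):
    # Center the data by subtracting the per-feature mean
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features
    cov = np.cov(X_centered, rowvar=False)
    # Eigendecomposition; eigh suits symmetric matrices like cov
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort eigenvectors by descending eigenvalue (variance captured)
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:n_components]]
    # Project the centered data onto the principal components
    return X_centered @ components

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = pca(X, 2)
print(Z.shape)  # (100, 2)
```

In practice a library routine such as scikit-learn's `PCA` (which uses the SVD rather than an explicit eigendecomposition) is the usual choice, but the covariance-eigenvector view matches the description above directly.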
This has many practical uses, including reducing the number of features you work with in processor-intensive applications, and noise reduction.
PCA is sensitive to the scale of features but, luckily for us on this occasion, our features are all of a similar scale.
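When features are not on a similar scale, the usual remedy is to standardize each feature to zero mean and unit variance before running PCA, so that no feature dominates the covariance purely because of its units. A minimal sketch, with an assumed two-feature example where the scales differ by a factor of 100:

```python
import numpy as np

def standardize(X):
    # Rescale each feature to zero mean and unit variance so that
    # PCA's covariance is not dominated by large-unit features
    return (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(1)
# Hypothetical data: second feature has 100x the spread of the first
X = rng.normal(loc=10.0, scale=[1.0, 100.0], size=(200, 2))
Xs = standardize(X)
```

After standardizing, every column has mean 0 and standard deviation 1, so the principal components reflect genuine correlation structure rather than measurement units.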