Principal Component Analysis (page 5 of 5)

In Python, PCA can be done in a few steps, many of which are handled by the PCA class in sklearn (standardization is not one of them and has to be done beforehand).

1. Standardize the features (mandatory for continuous features, optional for categorical ones), because PCA is sensitive to scale: it works with variances and Euclidean distances, so a feature measured in large units would otherwise dominate the components.
2. Create the covariance matrix to see how the features vary together (for standardized features this is the same as the correlation matrix).
3. Calculate the eigenvectors and eigenvalues of the covariance matrix to identify the principal components.
4. Create a feature vector to decide which principal components to keep based on variance explained.
5. Transform the data along the principal component axes for the selected principal components.
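
The five steps above can be sketched by hand with NumPy, which may help show what the library does internally. This is only an illustrative sketch, not the course's own code: the toy data in X, the 95% variance threshold, and all variable names are assumptions made up for the example.

```python
import numpy as np

# Hypothetical toy data: 6 observations, 3 continuous features (values made up).
X = np.array([
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 3.1],
    [2.2, 2.9, 0.8],
    [1.9, 2.2, 1.5],
    [3.1, 3.0, 0.4],
    [2.3, 2.7, 1.1],
])

# Step 1: standardize each feature to mean 0 and (sample) standard deviation 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Step 2: covariance matrix of the standardized features (columns are features).
cov = np.cov(X_std, rowvar=False)

# Step 3: eigenvalues and eigenvectors of the symmetric covariance matrix.
# eigh returns eigenvalues in ascending order, so re-sort to descending.
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Step 4: keep the smallest number of components whose cumulative share of
# variance reaches the threshold (0.95 is an assumed, arbitrary choice).
explained_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_ratio)
threshold = 0.95
k = int(np.searchsorted(cumulative, threshold) + 1)
feature_vector = eigenvectors[:, :k]  # one column per kept component

# Step 5: project the standardized data onto the kept component axes.
X_pca = X_std @ feature_vector

print("variance explained per component:", explained_ratio)
print("components kept:", k)
print("transformed shape:", X_pca.shape)
```

With sklearn, steps 2 through 5 would typically be replaced by fitting sklearn.decomposition.PCA on the standardized data and reading its explained_variance_ratio_ attribute. The results should agree with the manual version up to the sign of each component, since an eigenvector and its negative describe the same axis.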


Reading Check: Bad decisions made with good intentions are still bad decisions.

InClass Example: Credit Example