
🟠 ML: PCA in Simple Terms

What it does: Finds new axes (principal components) that capture the most variance. The first component captures the most, the second captures the most remaining variance perpendicular to the first, and so on.

Keep only the top k components → reduce dimensions while retaining most information.

It's a rotation + projection, not feature selection. Each component is a LINEAR COMBINATION of original features, not a single feature.
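The rotation + projection idea can be sketched in a few lines of NumPy (a minimal illustration, not a production PCA; the data and seed are made up for the demo). The rows of `Vt` from the SVD of the centered data are the principal axes, and each row is a unit-length weight vector over ALL original features:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-feature toy data: feature 2 is roughly 2x feature 1.
x = rng.normal(size=200)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.3, size=200)])

# Center, then SVD: rows of Vt are the principal components.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# PC1 is a linear combination of BOTH features, not one feature picked out.
print(Vt[0])

# "Rotation + projection": rotate into the new basis, keep the top coordinate.
Z = Xc @ Vt[:1].T   # shape (200, 1) -- the 1-D reduced data
```

Because the data lies near the line y = 2x, PC1 comes out close to the direction (1, 2)/√5 (up to sign), which is exactly the "new axis capturing the most variance."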

When to use: High-dimensional data where features are correlated (100 survey questions → 5 underlying factors). Also for visualization (reduce to 2D/3D). Standardize features first, since PCA is sensitive to feature scale: a feature measured in large units will dominate the variance.
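The "100 correlated features → a few underlying factors" case can be sketched with scikit-learn (assumed installed; the synthetic data, seed, and 95% threshold are illustrative). We generate features driven by 5 hidden factors, then use the cumulative explained-variance ratio to choose k:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# 100 features driven by 5 hidden factors plus a little noise
# (the "survey questions -> underlying factors" situation).
factors = rng.normal(size=(500, 5))
loadings = rng.normal(size=(5, 100))
X = factors @ loadings + 0.1 * rng.normal(size=(500, 100))

# Standardize first: PCA is scale-sensitive.
Xs = StandardScaler().fit_transform(X)

pca = PCA().fit(Xs)
cum = np.cumsum(pca.explained_variance_ratio_)
k = int(np.searchsorted(cum, 0.95) + 1)   # smallest k reaching 95% variance

X_reduced = PCA(n_components=k).fit_transform(Xs)
```

scikit-learn also accepts a fraction directly, e.g. `PCA(n_components=0.95)`, which picks the smallest number of components explaining at least that share of variance.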

Practice Questions

Q: You have 200 features. After PCA, the first 15 components explain 95% of variance. What do you do?
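One way the typical answer looks in code (a hedged sketch, assuming scikit-learn; the random stand-in data is illustrative): keep the 15 components, fit the projection on training data, and feed the reduced matrix to the downstream model.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 200))   # stand-in for the 200-feature dataset

# Scale, then project onto the 15 components that explained 95% of variance.
# A pipeline keeps fit/transform consistent between train and test data.
pipe = make_pipeline(StandardScaler(), PCA(n_components=15))
X15 = pipe.fit_transform(X)
print(X15.shape)  # (300, 15)
```

Fitting scaler and PCA inside one pipeline matters: the same centering, scaling, and rotation learned on training data must be applied, unchanged, to any new data.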