ml

🟠 ML: Feature Engineering

Transforming raw data into features that help the model learn. Often the highest-ROI activity in ML.

Common techniques:
- Log transform: Reduce skewness (income, prices → log(income))
- Binning: Age → age_group (18-25, 26-35, ...)
- Interactions: height × weight, price × quantity
- Time-based: day_of_week, is_weekend, hours_since_last_event
- Aggregation: customer's avg order value, total orders in last 30 days
- One-hot encoding: categorical → binary columns
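
The techniques above can be sketched in pandas roughly as follows. This is a minimal illustration, not a reference implementation: the `orders` table, its column names (`customer_id`, `income`, `age`, `price`, `quantity`, `order_ts`, `channel`), the bin edges, and the `engineer_features` helper are all hypothetical. It uses `log1p` rather than a plain log so that zero incomes do not produce `-inf`.

```python
import numpy as np
import pandas as pd


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    """Apply one example of each technique to a hypothetical orders table."""
    out = df.sort_values("order_ts").copy()

    # Log transform: log1p(x) = log(1 + x), which is safe for zero values.
    out["log_income"] = np.log1p(out["income"])

    # Binning: bucket ages into labelled groups (bin edges are an assumption).
    out["age_group"] = pd.cut(
        out["age"],
        bins=[17, 25, 35, 50, 120],
        labels=["18-25", "26-35", "36-50", "51+"],
    )

    # Interaction: price x quantity.
    out["order_total"] = out["price"] * out["quantity"]

    # Time-based features derived from the order timestamp.
    out["day_of_week"] = out["order_ts"].dt.dayofweek  # Monday = 0
    out["is_weekend"] = out["day_of_week"] >= 5
    out["hours_since_last_order"] = (
        out.groupby("customer_id")["order_ts"].diff().dt.total_seconds() / 3600
    )

    # Aggregation: each customer's average order value, broadcast to their rows.
    out["cust_avg_order_value"] = out.groupby("customer_id")[
        "order_total"
    ].transform("mean")

    # One-hot encoding: one binary column per channel value.
    return pd.get_dummies(out, columns=["channel"], prefix="channel")


orders = pd.DataFrame(
    {
        "customer_id": [1, 1, 2],
        "income": [0.0, 0.0, 50000.0],
        "age": [22, 22, 40],
        "price": [10.0, 30.0, 5.0],
        "quantity": [2, 1, 4],
        "order_ts": pd.to_datetime(
            ["2024-01-06 10:00", "2024-01-06 13:00", "2024-01-08 09:00"]
        ),
        "channel": ["web", "app", "web"],
    }
)
features = engineer_features(orders)
```

In a real pipeline, aggregations such as `cust_avg_order_value` would normally be computed only from data available before the prediction time, to avoid leaking the target period into the features.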

Feature selection interview answer: "I'd start by removing zero-variance features, then check pairwise correlations and drop one feature from each highly correlated pair. Next, I'd rank the remaining features with L1 regularization or tree-based feature importance, and finally validate the selected subset with cross-validation."
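
The steps in that answer could look roughly like this with pandas and scikit-learn. It is a sketch under stated assumptions: the data is synthetic, the helper names (`drop_zero_variance`, `drop_correlated`), the 0.9 correlation threshold, and the `"median"` importance cutoff are illustrative choices, and the tree-importance option is used for ranking (an L1-penalized model would be the alternative).

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def drop_zero_variance(X: pd.DataFrame) -> pd.DataFrame:
    # A column with a single distinct value has zero variance.
    return X.loc[:, X.nunique() > 1]


def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    # Look only at the upper triangle so each pair is checked once,
    # then drop the later column of every pair above the threshold.
    corr = X.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)


# Synthetic data (an assumption for illustration): the label depends on f0 and f1.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(300, 5)), columns=[f"f{i}" for i in range(5)])
y = (X["f0"] + X["f1"] > 0).astype(int)
X["constant"] = 1.0            # zero-variance feature
X["f0_copy"] = 2.0 * X["f0"]   # perfectly correlated with f0

# Steps 1-2: unsupervised filtering.
X_reduced = drop_correlated(drop_zero_variance(X))

# Step 3: rank the remaining features by tree-based importance.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_reduced, y)
ranking = pd.Series(
    forest.feature_importances_, index=X_reduced.columns
).sort_values(ascending=False)

# Step 4: cross-validate with selection inside the pipeline, so each fold
# chooses its features from its own training split only.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0), threshold="median"
)
model = make_pipeline(selector, LogisticRegression(max_iter=1000))
scores = cross_val_score(model, X_reduced, y, cv=5)
```

Keeping the supervised selection step inside the pipeline matters: ranking features on the full dataset and then cross-validating on the same data tends to give optimistic scores.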