🟠 ML: Logistic Regression — Classification, Not Regression
How it works: Takes a linear combination z = β₀ + β₁x₁ + ... + βₙxₙ and passes it through the sigmoid function:
P(Y=1) = 1 / (1 + e^(-z))
Sigmoid squashes any real number to (0, 1) → interpreted as probability.
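A minimal sketch of the sigmoid, just to make the squashing behavior concrete (the function name and sample inputs are illustrative):

```python
import math

def sigmoid(z):
    """Map any real number z to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# z = 0 lands exactly on 0.5; large |z| saturates toward 0 or 1
print(sigmoid(0))    # 0.5
print(sigmoid(5))    # ~0.9933
print(sigmoid(-5))   # ~0.0067
```

Note the symmetry sigmoid(−z) = 1 − sigmoid(z), which is why swapping the class labels just flips the sign of z.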
Interpreting coefficients: A one-unit increase in xᵢ multiplies the odds P/(1−P) by e^βᵢ. If β = 0.7, the odds multiply by e^0.7 ≈ 2.01 → odds roughly double (the probability itself does not double).
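A quick numeric check of the odds-ratio interpretation, using β = 0.7 from above and an assumed starting probability of 0.3:

```python
import math

beta = 0.7
p_before = 0.3                             # assumed P(Y=1) before the one-unit increase
odds_before = p_before / (1 - p_before)    # odds = P / (1 - P)
odds_after = odds_before * math.exp(beta)  # one-unit increase multiplies odds by e^beta
p_after = odds_after / (1 + odds_after)    # convert odds back to a probability

print(round(math.exp(beta), 2))  # ~2.01: the odds roughly double
print(round(p_after, 3))         # but the probability moves from 0.3 to ~0.463, not to 0.6
```

This is why "odds double" must not be read as "probability doubles": the odds-to-probability map is nonlinear.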
Q: "Why not use linear regression for classification?" A: Linear regression can predict values outside [0,1], which don't work as probabilities. It also minimizes squared error, which isn't the right objective for classification. Logistic regression constrains output to (0,1) and uses log loss (binary cross-entropy).
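A sketch of the log loss mentioned in the answer, showing why it is a better classification objective than squared error: it punishes confident wrong predictions much harder (function name and sample values are illustrative):

```python
import math

def log_loss(y, p, eps=1e-15):
    """Binary cross-entropy for one example with true label y in {0, 1}
    and predicted probability p; eps clips p away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Confidently right -> small loss; confidently wrong -> loss blows up
print(round(log_loss(1, 0.9), 3))   # ~0.105
print(round(log_loss(1, 0.1), 3))   # ~2.303
```

As p → 0 with y = 1 the loss grows without bound, whereas squared error would cap the penalty at 1 — that unbounded penalty is what drives the model toward well-calibrated probabilities.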