🟣 Terminology: Deep Learning (When They Ask)
Neural Networks: Layers of connected nodes. Data flows forward through the layers; weights are adjusted via backpropagation to minimize a loss function. "Deep" = many layers.
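To make backpropagation concrete, here is a minimal NumPy sketch: one hidden layer trained on XOR with plain gradient descent. All names (`W1`, `lr`, the layer sizes) are illustrative choices, not from any library.

```python
import numpy as np

# Minimal 2-layer network trained on XOR with plain gradient descent.
# All names (W1, W2, lr, ...) are illustrative, not from any library.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)
lr = 0.5

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

for step in range(5000):
    # Forward pass: data flows through the layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backpropagation: chain rule from the MSE loss, output layer
    # back toward the input layer.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    dW2 = h.T @ d_out
    db2 = d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * h * (1 - h)
    dW1 = X.T @ d_h
    db1 = d_h.sum(axis=0)

    # Gradient step: adjust weights to reduce the loss.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(out.round(3))  # should approach [[0], [1], [1], [0]]
```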
CNNs (Convolutional Neural Networks): For spatial data (images). Learnable filters slide across the input, detecting patterns at increasing levels of abstraction: edges → textures → objects. Parameter sharing (the same filter reused at every location) makes them efficient.
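A toy example of what "filters slide across input" means, assuming a hard-coded 3×3 vertical-edge kernel in place of the learned weights a real CNN would use:

```python
import numpy as np

# A single 3x3 filter sliding over a tiny grayscale "image" (valid padding).
# In a real CNN the filter values are learned; this hard-coded
# vertical-edge detector is purely for illustration.
image = np.zeros((6, 6))
image[:, 3:] = 1.0                      # left half dark, right half bright

kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

H, W = image.shape
k = kernel.shape[0]
feature_map = np.zeros((H - k + 1, W - k + 1))
for i in range(H - k + 1):
    for j in range(W - k + 1):
        patch = image[i:i + k, j:j + k]
        # Parameter sharing: the same 9 weights score every location.
        feature_map[i, j] = np.sum(patch * kernel)
print(feature_map)  # strong response only where the vertical edge is
```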
RNN / LSTM: For sequential data. RNNs maintain a hidden state but suffer from vanishing gradients (they forget long sequences). LSTMs add gating to selectively remember and forget.
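A single LSTM cell step, sketched in NumPy from the standard gate equations; the weight names (`Wf`, `Wi`, `Wo`, `Wc`) and random initialization are assumptions made just to keep the sketch runnable. The additive memory update is what eases the vanishing-gradient problem.

```python
import numpy as np

# One LSTM cell step, assuming input size m and hidden size n.
# Weights are random placeholders; in practice they are learned.
rng = np.random.default_rng(0)
m, n = 4, 3
Wf, Wi, Wo, Wc = (rng.normal(size=(n, m + n)) for _ in range(4))
bf = bi = bo = bc = np.zeros(n)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x, h_prev, c_prev):
    z = np.concatenate([x, h_prev])
    f = sigmoid(Wf @ z + bf)          # forget gate: what to drop from memory
    i = sigmoid(Wi @ z + bi)          # input gate: what new info to store
    o = sigmoid(Wo @ z + bo)          # output gate: what to expose
    c_tilde = np.tanh(Wc @ z + bc)    # candidate memory content
    c = f * c_prev + i * c_tilde      # additive update: remember/forget
    h = o * np.tanh(c)
    return h, c

h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, m)):     # step through a length-5 sequence
    h, c = lstm_step(x, h, c)
print(h)
```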
Transformers: Process entire sequences in parallel via self-attention (each element attends to all others). Much faster to train than RNNs because there is no step-by-step recurrence. Foundation of BERT and GPT.
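Self-attention itself is a few lines of linear algebra. Here is a sketch of single-head scaled dot-product attention, with random projection matrices standing in for learned ones:

```python
import numpy as np

# Scaled dot-product self-attention over a whole sequence at once.
# X is a (seq_len, d) matrix of token embeddings; Wq, Wk, Wv would be
# learned in practice and are random placeholders here.
rng = np.random.default_rng(0)
seq_len, d = 5, 8
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv
scores = Q @ K.T / np.sqrt(d)                     # every token vs. every token
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)    # softmax over each row
out = weights @ V                                 # weighted mix of all positions
print(out.shape)  # (5, 8): all positions computed in parallel, no recurrence
```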
The key interview answer
"When do you use deep learning vs gradient boosting?"
For tabular data (most DS work): gradient-boosted trees almost always win. They train faster, are more interpretable, and need less data (see the sketch after this list).
For unstructured data (images, text, audio): deep learning wins. Hand-crafting features is impractical, so let the network learn them.
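A minimal sketch of the tabular baseline using scikit-learn's `HistGradientBoostingClassifier`; the dataset and split here are illustrative stand-ins for real project data.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

# A small tabular dataset as a stand-in for real project data.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Trains in seconds on CPU and is usually a very strong tabular
# baseline without heavy tuning.
model = HistGradientBoostingClassifier(random_state=0).fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.3f}")
```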