Round 5: Python/Pandas (What's Wrong?)
21. What's wrong with: df[df['x'] > 5]['y'] = 10
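Chained indexing: df[df['x'] > 5] may return a copy, so the assignment can land on a temporary object and leave df unchanged (pandas usually flags this with SettingWithCopyWarning). Fix: df.loc[df['x'] > 5, 'y'] = 10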
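A minimal sketch of the fix, assuming a toy DataFrame with columns x and y (the data is made up for illustration):

```python
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({"x": [1, 6, 9], "y": [0, 0, 0]})

# One .loc call takes the row mask and the column label together,
# so the assignment is applied to df itself rather than to a temporary.
df.loc[df["x"] > 5, "y"] = 10

result = df["y"].tolist()
print(result)  # [0, 10, 10]
```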
22. You need each employee's department average as a new column. agg(), transform(), or apply()?
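transform(): returns one value per original row, aligned to the original index, so the result can be assigned directly as a new column. agg() collapses each group to a single row; apply() can do it but is more general than needed and usually slower.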
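A short sketch of the difference, assuming a hypothetical employees table with dept and salary columns (names and values are illustrative only):

```python
import pandas as pd

# Hypothetical data for illustration.
employees = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "dept": ["eng", "eng", "ops", "ops"],
    "salary": [100, 120, 80, 90],
})

# agg() collapses to one row per department (2 rows here).
dept_avg_agg = employees.groupby("dept")["salary"].agg("mean")

# transform() broadcasts the group mean back to every original row (4 rows),
# so it can be assigned straight into a new column.
employees["dept_avg"] = employees.groupby("dept")["salary"].transform("mean")

print(len(dept_avg_agg))                # 2
print(employees["dept_avg"].tolist())   # [110.0, 110.0, 85.0, 85.0]
```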
23. df.loc[0:3] returns how many rows?
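4 rows (labels 0, 1, 2, 3), assuming those labels exist in the index. loc slices by label and is inclusive on both ends; iloc[0:3] slices by position and excludes the end, returning 3 rows.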
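A quick check of the row counts, assuming a DataFrame with the default integer index (the column v is a placeholder):

```python
import pandas as pd

# Default RangeIndex with labels 0..9; the column name is arbitrary.
df = pd.DataFrame({"v": range(10)})

loc_rows = len(df.loc[0:3])    # label-based, end label included
iloc_rows = len(df.iloc[0:3])  # position-based, end position excluded

print(loc_rows, iloc_rows)  # 4 3
```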
24. Why is NumPy faster than Python lists?
Contiguous memory (cache-friendly), homogeneous types (no per-element type checking), vectorized C operations (no Python loop overhead). Often 10-100x faster for element-wise numeric work, depending on the operation and array size.
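A rough way to see the gap on your own machine, sketched with timeit. The array size and repeat count are arbitrary choices, and the measured ratio will vary by hardware and operation, so no timings are quoted here:

```python
import timeit
import numpy as np

n = 200_000  # arbitrary size chosen for a quick run
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

def square_list():
    # Python-level loop: one interpreter step and one int object per element.
    return [v * v for v in py_list]

def square_array():
    # Single vectorized call: the loop runs in compiled code over a typed buffer.
    return np_array * np_array

list_time = timeit.timeit(square_list, number=5)
array_time = timeit.timeit(square_array, number=5)
print(f"list: {list_time:.4f}s  ndarray: {array_time:.4f}s")
```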
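25. List vs generator for 10M items?
Generator, if you only need to iterate once: O(1) memory (yields one item at a time) vs O(n) memory for a list (stores all items at once). Use a list if you need indexing, len(), or multiple passes.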
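A small sketch of the memory difference, scaled down to 1M items so it runs quickly. Note that sys.getsizeof on a list counts only the list's own pointer array, not the integer objects it refers to, so the real gap is larger than shown:

```python
import sys

n = 1_000_000  # scaled down from 10M for a quick run

squares_list = [i * i for i in range(n)]  # all items held in memory at once
squares_gen = (i * i for i in range(n))   # items produced one at a time

list_size = sys.getsizeof(squares_list)   # grows with n
gen_size = sys.getsizeof(squares_gen)     # small and independent of n

total = sum(squares_gen)                  # consumes the generator
leftover = next(squares_gen, None)        # None: a generator is single-pass

print(list_size, gen_size)
```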
Practice Questions
Q: What's wrong with: df[df['x'] > 5]['y'] = 10
Q: You need each employee's department average as a new column. agg(), transform(), or apply()?
Q: df.loc[0:3] returns how many rows?
Q: Why is NumPy faster than Python lists?
Q: List vs generator for 10M items?