🟣 Terminology: Normalization vs Denormalization

Normalization = split data into related tables to reduce redundancy.
- Example: instead of storing customer_name in every order row, store customer_id and look up the name in a separate customers table.
- Good for: OLTP/write-heavy systems (update customer name in one place)
- Downside: queries need many JOINs
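A minimal sketch of the normalized layout, using Python's built-in sqlite3 with an in-memory database. The amount column, the sample rows, and the variable names are assumptions made for illustration; only customers, orders, customer_id, and customer_name come from the example above.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL
    );
    -- orders stores only customer_id, never the name itself
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL  -- hypothetical column for illustration
    );
    INSERT INTO customers VALUES (1, 'Alice');
    INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0);
""")

# Renaming the customer touches one row, no matter how many orders exist.
cur = conn.execute(
    "UPDATE customers SET customer_name = 'Alice Smith' WHERE customer_id = 1"
)
rows_updated = cur.rowcount

# Reading orders together with names requires a JOIN.
normalized_rows = conn.execute("""
    SELECT o.order_id, c.customer_name, o.amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

print(rows_updated)
print(normalized_rows)
```

The single-row UPDATE is the write-side benefit; the JOIN in the SELECT is the read-side cost.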

Denormalization = intentionally add redundancy by pre-joining tables.
- Example: store customer_name directly in the orders table
- Good for: OLAP/analytics (fewer JOINs = faster reads)
- Downside: update customer name → must update it everywhere
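A minimal sketch of the denormalized layout, again with Python's built-in sqlite3. The table name orders_denorm, the amount column, the sample rows, and the choice to keep customer_id next to the copied name are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- customer_name is copied into every order row (intentional redundancy)
    CREATE TABLE orders_denorm (
        order_id      INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL,
        customer_name TEXT NOT NULL,
        amount        REAL NOT NULL  -- hypothetical column for illustration
    );
    INSERT INTO orders_denorm VALUES
        (100, 1, 'Alice', 25.0),
        (101, 1, 'Alice', 40.0),
        (102, 2, 'Bob',   10.0);
""")

# Reads need no JOIN: the name is already on each row.
denorm_rows = conn.execute(
    "SELECT order_id, customer_name, amount FROM orders_denorm ORDER BY order_id"
).fetchall()

# Renaming the customer must rewrite every one of their order rows.
cur = conn.execute(
    "UPDATE orders_denorm SET customer_name = 'Alice Smith' WHERE customer_id = 1"
)
rows_rewritten = cur.rowcount

print(denorm_rows)
print(rows_rewritten)
```

Here the SELECT is a single-table scan, but the rename rewrites one row per order, and any row the UPDATE misses would be left with a stale name.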

Practice Questions

Q: You're designing a data warehouse for analytics. Normalized or denormalized? Why?