7. All Big Picture Concepts

7.1. From Chapter 1 - Introduction to Data Science

  • The importance of Learning on Your Own

  • The importance of communication

7.2. From Chapter 2 - Mathematical Foundations

  • Functions and relations

  • Every table represents a relation

7.3. From Chapter 3 - Jupyter

  • The structure of Jupyter

  • How to shut down Jupyter

7.4. From Chapter 4 - Review of Python and pandas

  • Writing to a slice of a DataFrame
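
A minimal sketch of this idea, assuming a small made-up DataFrame (the column names and the threshold of 60 are illustrative, not taken from the chapter). Assigning through a single `.loc` call writes into the original DataFrame, whereas chained indexing such as `df[mask]["score"] = 60` may write to a temporary copy and leave `df` unchanged.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [50, 80, 95]})

# Select rows and column in one .loc call so the write targets df itself.
mask = df["score"] < 60
df.loc[mask, "score"] = 60
```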

7.5. From Chapter 5 - Before and After

  • Explanations before and after code

7.6. From Chapter 6 - Single-Table Verbs

  • The relationship between tall and wide data
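
One way to picture the relationship, sketched with invented city and population values: `melt` turns a wide table (one column per year) into a tall one (one row per city-year pair), and `pivot` reverses the step.

```python
import pandas as pd

# Hypothetical wide table: one column per year.
wide = pd.DataFrame({
    "city": ["Boston", "Denver"],
    "2020": [10, 20],
    "2021": [11, 21],
})

# Wide -> tall: each (city, year) pair becomes its own row.
tall = wide.melt(id_vars="city", var_name="year", value_name="population")

# Tall -> wide: years become columns again, indexed by city.
back = tall.pivot(index="city", columns="year", values="population")
```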

7.7. From Chapter 7 - Abstraction

  • The value of abstraction in programming

7.8. From Chapter 8 - Version Control

  • Why people use tools like git

7.9. From Chapter 9 - Mathematics and Statistics in Python

  • Vectorization and its benefits

  • Models vs. fit models
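
As a rough illustration of vectorization (the prices and the 5% markup are made-up values), the same arithmetic can be written as an explicit Python loop or as a single NumPy array expression. The vectorized form is shorter and, on large arrays, typically much faster because the loop runs in compiled code.

```python
import numpy as np

# Hypothetical data for illustration only.
prices = np.array([10.0, 20.0, 30.0])

# Element-by-element loop in Python.
looped = [p * 1.05 for p in prices]

# Vectorized: one expression applied to the whole array at once.
vectorized = prices * 1.05
```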

7.10. From Chapter 10 - Visualization

  • Visualizing relations vs. functions

7.11. From Chapter 11 - Processing the Rows of a DataFrame

  • Informally, map is the same as apply

  • Important phrases: map-reduce and split-apply-combine
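
The two bullets above can be sketched on a toy table (team names, points, and the `double` helper are invented for this example). For a plain function applied to a Series, `map` and `apply` give the same result. Summing the mapped values is a small map-reduce, and `groupby` followed by an aggregation is split-apply-combine.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "team": ["red", "blue", "red", "blue"],
    "points": [3, 5, 4, 1],
})

def double(x):
    """Hypothetical helper: the function we map over each value."""
    return 2 * x

# map and apply agree when given an ordinary function on a Series.
mapped = df["points"].map(double)
applied = df["points"].apply(double)
same = mapped.equals(applied)

# Map-reduce: transform every value, then reduce to a single number.
total_doubled = mapped.sum()

# Split-apply-combine: split by team, sum each group, combine the results.
totals = df.groupby("team")["points"].sum()
```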

7.12. From Chapter 12 - Concatenating and Merging DataFrames

  • Concat adds rows and merge adds columns (usually!)
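
A small sketch of the rule of thumb, using invented sales tables. With default settings, `pd.concat` stacks rows and `merge` attaches columns by matching on a key. The "usually" matters: `pd.concat(..., axis=1)` places tables side by side, and a merge can change the row count when keys repeat or fail to match.

```python
import pandas as pd

# Hypothetical data for illustration only.
jan = pd.DataFrame({"id": [1, 2], "sales": [100, 150]})
feb = pd.DataFrame({"id": [1, 2], "sales": [90, 120]})
regions = pd.DataFrame({"id": [1, 2], "region": ["East", "West"]})

# concat: same columns, more rows.
stacked = pd.concat([jan, feb], ignore_index=True)

# merge: same rows (here), one more column, matched on "id".
merged = jan.merge(regions, on="id")
```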

7.13. From Chapter 13 - Miscellaneous Munging Methods (ETL)

  • Munging/ETL is a large portion of data work

  • Information = Data + Context

  • Summary of key points about missing values

7.14. From Chapter 14 - Dashboards

  • Uses for data dashboards

7.15. From Chapter 15 - Relations as Graphs - Network Analysis

  • A graph depicts a binary relation of a set with itself

  • How pivoting/melting impacts graph data

7.16. From Chapter 16 - Relations as Matrices

  • What is a recommender system?

  • The SVD and approximation
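
To make the approximation idea concrete, here is a sketch using a made-up user-by-item ratings matrix (the `low_rank` helper is hypothetical, not a library function). Keeping only the largest `k` singular values yields a rank-`k` approximation, and the error can only shrink or stay the same as `k` grows.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items.
ratings = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Thin SVD: ratings == U @ diag(s) @ Vt, with s sorted largest first.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

def low_rank(k):
    """Rebuild the matrix from only the k largest singular values."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Frobenius-norm error of the rank-1 and rank-2 approximations.
err1 = np.linalg.norm(ratings - low_rank(1))
err2 = np.linalg.norm(ratings - low_rank(2))
```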

7.17. From Chapter 17 - Introduction to Machine Learning

  • Supervised vs. unsupervised machine learning

  • A central issue: overfitting vs. underfitting

  • Why we split data into train and test sets
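
A minimal sketch of a train/test split using only pandas (the data, the 80/20 ratio, and the random seed are assumptions for illustration; libraries such as scikit-learn offer their own helpers). The held-out test rows are never shown to the model during fitting, so its performance on them gives a fairer estimate of how it handles unseen data and helps reveal overfitting.

```python
import pandas as pd

# Hypothetical data for illustration only.
data = pd.DataFrame({"x": range(10), "y": [2 * v + 1 for v in range(10)]})

# Randomly pick 80% of the rows for training; fix the seed for repeatability.
train = data.sample(frac=0.8, random_state=0)

# Every row not chosen for training becomes test data.
test = data.drop(train.index)
```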