STA 199: Introduction to Data Science and Statistical Thinking
This page outlines the topics, content, and assignments for the semester. The schedule will be updated as the semester progresses, so the timing of topics and assignments may change.
| WEEK | DATE | TOPIC | PREPARE | MATERIALS | DUE |
|---|---|---|---|---|---|
| 1 | Mon, Aug 25 | Lab 0: Mise en place | | 💻 lab 0 | Lab 0 due at the end of lab session (not graded) |
| | Tue, Aug 26 | Hello World and Hello STA 199! | 📖 Syllabus | 🖥️ slides 01 🗒️ notes 01 | |
| | Thu, Aug 28 | Meet the toolkit | 📖 r4ds - intro 📖 ims - chp 1 🎥 Meet the toolkit :: R and RStudio 🎥 Meet the toolkit :: Quarto 🎥 Code along :: First data viz with UN Votes | 🖥️ slides 02 🗒️ notes 02 ⌨️ ae 01 ✅ ae 01 | |
| 2 | Mon, Sep 1 | No lab - Labor Day | | | |
| | Tue, Sep 2 | ARC Presentation; Grammar of data visualization | 📖 r4ds - chp 1 📖 ims - chp 4 🎥 Visualizing data 🎥 Building a plot step-by-step with ggplot2 🎥 Grammar of graphics 🎥 Code along :: First look at Palmer Penguins | 🖥️ slides 03 🗒️ notes 03 | |
| | Thu, Sep 4 | Grammar of data transformation | 📖 r4ds - chp 2 📖 r4ds - chp 3.1-3.5 🎥 Grammar of data transformation 🎥 Code along :: Flights and pipes | 🖥️ slides 04 🗒️ notes 04 ⌨️ ae 02 ✅ ae 02 | |
| 3 | Mon, Sep 8 | Lab 1: Exploring NC Counties | | 💻 lab 1 📝 hw 1 | Lab 1 due at the end of the lab session |
| | Tue, Sep 9 | Exploratory data analysis I | 📖 r4ds - chp 3.6-3.7 🎥 Visualizing and summarizing categorical data 🎥 Visualizing and summarizing numerical data 🎥 Visualizing and summarizing relationships 🎥 Code along :: Star Wars characters | 🖥️ slides 05 🗒️ notes 05 ⌨️ ae 03 ✅ ae 03 | |
| | Thu, Sep 11 | Exploratory data analysis II | 📖 ims - chp 5 📖 ims - chp 6 🎥 Code along :: Diving deeper with Palmer Penguins | 🖥️ slides 06 🗒️ notes 06 ⌨️ ae 04 ✅ ae 04 | |
| | Sun, Sep 14 | | | | HW 1 due at 11:59 pm |
| 4 | Mon, Sep 15 | Lab 2: Get in teams then group_by() | 📖 r4ds - chp 4 | 💻 lab 2 📝 hw 2 | Lab 2 due at the end of the lab session |
| | Tue, Sep 16 | Tidying data | 🎥 Tidy data 🎥 Tidying data 🎥 Code along :: Country populations over time 📖 r4ds - chp 5 | 🖥️ slides 07 🗒️ notes 07 ⌨️ ae 05 ✅ ae 05 | |
| | Thu, Sep 18 | Joining data | 🎥 Joining data 🎥 Code along :: Continent populations 📖 r4ds - chp 19.1-19.3 | 🖥️ slides 08 🗒️ notes 08 ⌨️ ae 06 ✅ ae 06 | |
| | Sun, Sep 21 | | | | HW 2 due at 11:59 pm |
| 5 | Mon, Sep 22 | Lab 3: Inflation everywhere | | 💻 lab 3 📝 hw 3 | Lab 3 due at the end of the lab session |
| | Tue, Sep 23 | Data types and classes | 🎥 Data types 🎥 Data classes 🎥 Code along :: That's my type 📖 r4ds - chp 16 | 🖥️ slides 09 🗒️ notes 09 ⌨️ ae 07 ✅ ae 07 | |
| | Thu, Sep 25 | Importing and recoding data | 🎥 Importing data 🎥 Code along :: Halving CO2 emissions 🎥 Code along :: Student survey 📖 r4ds - chp 7 📖 r4ds - chp 17.1 - 17.3 | 🖥️ slides 10 🗒️ notes 10 ⌨️ ae 08 ✅ ae 08 | |
| | Sun, Sep 28 | | | | HW 3 due at 11:59 pm |
| 6 | Mon, Sep 29 | Lab 4: Changes in college athletics | | 💻 lab 4 | Lab 4 due at the end of lab session |
| | Tue, Sep 30 | Exam 1 review | | 🖥️ slides 11 🗒️ notes 11 📝 exam 1 review ✅ exam 1 review | |
| | Thu, Oct 2 | Exam 1 - In-class + take-home released | | | |
| | Sat, Oct 4 | | | | Exam 1 take-home due at 12 pm |
| 7 | Mon, Oct 6 | Project milestone 1 - Working collaboratively [45 mins]; Project milestone 2 - Project proposals [30 mins] | 📖 Pre-read: Merge conflicts 📖 Project description | 📝 project milestone 1 📝 project milestone 2 | Project milestone 1 due at the end of lab session |
| | Tue, Oct 7 | Web scraping a single page | 🎥 Web scraping basics 🎥 Code along :: Scraping an eCommerce page 📖 r4ds - chp 24.1 - 24.6 | 🖥️ slides 12 🗒️ notes 12 ⌨️ ae 09 ✅ ae 09 | |
| | Thu, Oct 9 | Web scraping many pages | 🎥 Code along :: Scraping many eCommerce pages 🎥 Web scraping considerations 📖 r4ds - chp 25.1 - 25.2 | 🖥️ slides 13 🗒️ notes 13 ⌨️ ae 09 ✅ ae 09 | Midterm course evaluation due at 11:59 pm (optional) |
| 8 | Mon, Oct 13 | No lab - Fall Break | | | |
| | Tue, Oct 14 | No lecture - Fall Break | | | |
| | Thu, Oct 16 | Data science ethics | 🎥 Misrepresentation 🎥 Data privacy 🎥 Algorithmic bias 🎥 Code along :: Sectors and services | 🖥️ slides 14 🗒️ notes 14 | Project milestone 2 due at 11:59 pm; Peer evaluation 1 due at 11:59 pm |
| 9 | Mon, Oct 20 | Project milestone 3 - Improve and progress | 📖 Tidyverse style guide - Chp 1-5 | 📝 project milestone 3 | Project milestone 3 due at the end of lab session |
| | Tue, Oct 21 | The language of models | 🎥 The language of models 🎥 Linear regression with a numerical predictor 📖 ims - chp 7.1 | 🖥️ slides 15 🗒️ notes 15 ⌨️ ae 10 ✅ ae 10 | |
| | Thu, Oct 23 | Linear regression with a single predictor | 🎥 Linear regression with a categorical predictor 🎥 Outliers in linear regression 🎥 Code along :: Modeling fish 📖 ims - chp 7.2 | | Peer evaluation 2 due at 11:59 pm |
| 10 | Mon, Oct 27 | Project milestone 4 - Peer review [30 minutes]; Lab 5: Visualize, model, interpret [45 minutes] | | 📝 project milestone 4 | Project milestone 4 due at the end of lab session; Lab 5 due at the end of the lab session |
| | Tue, Oct 28 | Linear regression with multiple predictors | 🎥 Linear regression with multiple predictors 🎥 Main and interaction effects 📖 ims - chp 8.1-8.2 📖 ims - chp 8.3-8.5 | | |
| | Thu, Oct 30 | Model selection and overfitting | 🎥 Code along :: Modeling interest rates | | Peer evaluation 3 due at 11:59 pm |
| | Sun, Nov 2 | | | | HW 4 due at 11:59 pm |
| 11 | Mon, Nov 3 | Project milestone 5 - Work on writeup and presentations | | 📝 project milestone 5 | Project milestone 5 due at the end of lab session |
| | Tue, Nov 4 | Essential data science skills potpourri | 📖 ims - chp 6 📖 r4ds - chp 10 | | |
| | Thu, Nov 6 | Logistic regression I | 🎥 Logistic regression 🎥 Code along :: Building a spam filter 📖 ims - chp 9 | | |
| 12 | Mon, Nov 10 | Project milestone 6 - Present and turn in write-up! | | 📝 project milestone 6 | |
| | Tue, Nov 11 | Logistic regression II | 🎥 Classification and decision errors 🎥 Overfitting and spending your data | | |
| | Thu, Nov 13 | Evaluating models | 🎥 Code along :: Forest classification | | Peer evaluation 4 due at 11:59 pm |
| 13 | Mon, Nov 17 | Lab 6: Everything so far II | | | Lab 6 due at the end of lab session |
| | Tue, Nov 18 | Exam 2 review | | 📝 exam 2 review ✅ exam 2 review | |
| | Thu, Nov 20 | Exam 2 - In-class + take-home released | | | |
| | Sat, Nov 22 | | | | Exam 2 take-home due at 12 pm |
| 14 | Mon, Nov 24 | Lab 7: Explore and classify | | | Lab 7 due at the end of lab session |
| | Tue, Nov 25 | Quantifying uncertainty with bootstrap intervals | 🎥 Quantifying uncertainty 🎥 Bootstrapping 🎥 Code along :: Bootstrapping Duke Forest houses 📖 ims - chp 11 📖 ims - chp 12 | | |
| | Thu, Nov 27 | No lecture - Thanksgiving | | | |
| | Sun, Nov 30 | | | | HW 5 due at 11:59 pm |
| 15 | Mon, Dec 1 | Lab 8: Inference | | To be posted | Lab 8 due at the end of lab session |
| | Tue, Dec 2 | Making decisions with randomization tests | To be posted | | |
| | Thu, Dec 4 | Looking further | To be posted | | |
| | Fri, Dec 5 | | | | HW 6 due at 11:59 pm (will be accepted until Sun, Dec 7 at 11:59 pm without penalty) |
| | TBD | Final review (time TBD, location TBD) | | 📝 final review ✅ final review | |
| 16 | Fri, Dec 12 | Final (2 pm - 5 pm) | To be posted | | |