Lab 5
Make up your other half
Introduction
In this lab you’ll “make up” data for an outcome variable to match certain criteria. You’ll then visualize and summarize the relationship between your outcome and predictor variables, and fit a model to evaluate whether the made-up data meet the criteria you set out to achieve.
Make sure to upload your completed lab to Gradescope by the end of your lab session and commit and push your final version to GitHub.
Getting started
By now you should be familiar with how to get started with a lab assignment by cloning the GitHub repo for the assignment. If you’re not sure how, refer back to an earlier lab.
Open the lab-5.qmd template Quarto file and update the authors field to add your name first (first and last) and then your teammates’ names (first and last). Render the document. Examine the rendered document and make sure your and your teammates’ names are updated in the document. Commit your changes with a meaningful commit message and push to GitHub.
Refresher on assignment guidelines
Code
Code should follow the tidyverse style. Particularly,
- there should be spaces before and line breaks after each + when building a ggplot,
 - there should also be spaces before and line breaks after each |> in a data transformation pipeline,
 - code should be properly indented,
 - there should be spaces around = signs and spaces after commas.
Additionally, all code should be visible in the PDF output, i.e., it should not run off the page in the PDF. Long lines that run off the page should be split across multiple lines with line breaks.[1]
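As a rough illustration of the style points above, here is a short sketch. The data frame style_demo, its values, and the object names are hypothetical and exist only to show spacing, line breaks, and indentation.

```r
library(tidyverse)

# Hypothetical data, used only to illustrate code style
style_demo <- tibble(
  x = c(1, 2, 3, 4),
  y = c(3.1, 0.8, -1.2, -2.9)
)

# Pipeline: a space before and a line break after each |>
style_summary <- style_demo |>
  filter(x > 1) |>
  summarize(mean_y = mean(y))

# Plot: a space before and a line break after each +,
# spaces around = and after commas
style_plot <- ggplot(style_demo, aes(x = x, y = y)) +
  geom_point() +
  labs(
    title = "Style demo",
    x = "x",
    y = "y"
  )
```

Breaking the labs() call across several lines is one way to keep long lines from running off the page in the PDF.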
Plots
- Plots should have an informative title and, if needed, also a subtitle.
 - Axes and legends should be labeled with both the variable name and its units (if applicable).
 - Careful consideration should be given to aesthetic choices.
 
Workflow
Continuing to develop a sound workflow for reproducible data analysis is important as you complete the lab and other assignments in this course.
- You should have at least 3 commits with meaningful commit messages by the end of the assignment.
 - Final versions of both your .qmd file and the rendered PDF should be pushed to GitHub.
Packages
In this lab we will work with the tidyverse package.
Data
You have half the data you need, just your predictor variable (x), in the following data frame:
Goal
Your goal is to “make up” the outcome variable y such that when you fit a linear regression model predicting y from x, the following are true about the model:
- For each additional unit in x, y is expected to be lower, on average, by (approximately) 2 units.
 - For observations with x equal to 0, y is expected to be (approximately) 5.
 - The model for predicting y from x explains (approximately) 60% of the variability in y.
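In symbols, the three criteria can be read as statements about the fitted line and its R-squared. The notation below (b_0 for the intercept, b_1 for the slope) is an assumption made here for illustration and does not come from the lab itself.

```latex
\[
\hat{y} = b_0 + b_1 x,
\qquad b_1 \approx -2,
\qquad b_0 \approx 5,
\qquad R^2 \approx 0.60
\]
```

The slope criterion fixes b_1, the criterion at x equal to 0 fixes b_0, and the variability criterion fixes R-squared.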
Questions
Question 1
Make up the values for y in the data frame above so that the linear regression model predicting y from x meets the criteria specified in the Goal section. Add these values to the data frame our_data and display the updated data frame.
Solving Question 1 will require some trial and error, creativity, and patience. Specifically, it will require writing and testing the code for Questions 2 and 3 multiple times until you get the desired results.
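One possible starting point for the trial and error is sketched below; it is not the intended or only approach. The x values, the seed, and the names sketch_data and noise_sd are all hypothetical, so substitute the x column you were given. The idea is to start from the line described in the Goal section and add random noise, then adjust the amount of noise after checking the fitted model.

```r
library(tidyverse)

set.seed(199) # arbitrary seed so the made-up values are reproducible

# Hypothetical predictor values; replace with the x you were given
sketch_data <- tibble(x = 1:20)

# Hypothetical tuning knob: more noise generally lowers R-squared
noise_sd <- 9

# Start from intercept 5 and slope -2, then add random noise
sketch_data <- sketch_data |>
  mutate(y = 5 - 2 * x + rnorm(n(), mean = 0, sd = noise_sd))

sketch_data
```

Because the noise is random, the fitted slope, intercept, and R-squared will only be near the targets, so expect to change noise_sd (or the seed) and refit a few times.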
Question 2
Visualize the relationship between x and y using a scatter plot with a regression line. Comment on how the plot supports that the criteria specified in the Goal section are met.
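A minimal sketch of such a plot follows, assuming ggplot2 from the tidyverse. The data frame plot_data and its values are hypothetical stand-ins for our_data, and the title and labels are placeholders to replace with informative ones.

```r
library(tidyverse)

# Hypothetical stand-in for our_data; the values are illustrative only
plot_data <- tibble(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(4.0, -1.5, 0.5, -4.0, -3.0, -8.5)
)

# Scatter plot with a least-squares regression line overlaid
scatter_plot <- ggplot(plot_data, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Made-up outcome vs. predictor",
    subtitle = "With least-squares regression line",
    x = "x",
    y = "y"
  )

scatter_plot
```

The direction and steepness of the line and the spread of points around it are the features to compare against the Goal criteria.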
Question 3
Fit a linear regression model predicting y from x. Display the model coefficients and the R-squared value from the model summary. Comment on how the model coefficients and R-squared value support that the criteria specified in the Goal section are met.
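One way to do this is sketched below with base R's lm(); the course may prefer a different modeling interface, so treat this as an assumption. The data frame fit_data and its values are hypothetical stand-ins for our_data and are not tuned to meet the Goal criteria.

```r
library(tidyverse)

# Hypothetical stand-in for our_data; the values are illustrative only
fit_data <- tibble(
  x = c(1, 2, 3, 4, 5, 6),
  y = c(4.0, -1.5, 0.5, -4.0, -3.0, -8.5)
)

# Fit a simple linear regression predicting y from x
model_fit <- lm(y ~ x, data = fit_data)

# Intercept and slope estimates
model_coefs <- coef(model_fit)
model_coefs

# Proportion of variability in y explained by the model
model_r_squared <- summary(model_fit)$r.squared
model_r_squared
```

Compare the slope to -2, the intercept to 5, and R-squared to 0.60; if any is far off, revise the made-up y values and refit.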
Wrap-up
Before you wrap up the assignment, make sure that you render, commit, and push one final time so that the final versions of both your .qmd file and the rendered PDF are pushed to GitHub and your Git pane is empty. We will be checking these to make sure you have been practicing how to commit and push changes.
Submission
By now you should also be familiar with how to submit your assignment in Gradescope.
Refresher on how to submit your assignment in Gradescope
Submit your PDF document to Gradescope by the end of the lab to be considered “on time”:
- Go to http://www.gradescope.com and click Log in in the top right corner.
 - Click School Credentials → Duke NetID and log in using your NetID credentials.
 - Click on your STA 199 course.
 - Click on the assignment, and you’ll be prompted to submit it.
 - Mark all the pages associated with each question. All the pages of your lab should be associated with at least one question (i.e., should be “checked”).
 
Make sure you have:
- attempted all questions
 - rendered your Quarto document
 - committed and pushed everything to your GitHub repository such that the Git pane in RStudio is empty
 - uploaded your PDF to Gradescope
 
Grading and feedback
- This lab is worth 30 points:
  - 10 points for being in lab and turning in something – no partial credit for this part.
  - 20 points for:
    - answering the questions correctly – there is partial credit for this part.
    - following the workflow – there is partial credit for this part.
- The workflow points are for:
  - committing at least three times as you work through your lab,
  - having your final version of .qmd and .pdf files in your GitHub repository, and
  - overall organization.
- You’ll receive feedback on your lab on Gradescope within a week.
 
Good luck, and have fun with it!
Footnotes
[1] Remember, haikus not novellas when writing code!
