| | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Tar (mg) | 25.0 | 27.0 | 20.0 | 24.0 | 20.0 | 20.0 | 21.0 | 24.0 |
| Nicotine (mg) | 1.5 | 1.7 | 1.1 | 1.6 | 1.1 | 1.0 | 1.2 | 1.4 |
Homework 8
Correlation and Linear Regression
Due Friday, Dec 5, 11:59 PM
Please submit your work as a single PDF file to D2L > Assessments > Dropbox. Multiple files, or a file that is not in PDF format, will not be accepted.
In your homework, please number questions in order.
Handwritten tables and figures receive no credit.
You do not need to attach your code, whatever language you use for the homework. However, if you fail to complete your calculations or produce your table or figure, you receive partial credit if your code is attached.
Questions starting with (MSSC) are required for MSSC 5720 students and optional for MATH 4720 students.
Homework Questions
- Maru and Lulu are both collecting data on the number of rainy days in a year and the total rainfall for the year. Maru records rainfall in inches and Lulu in centimeters. How will their correlation coefficients compare?
- Suppose we fit a regression line to predict the shelf life of an apple based on its weight. For a particular apple, we predict the shelf life to be 4.6 days. The apple’s residual is -0.6 days. Did we overestimate or underestimate the shelf life of the apple? Explain your reasoning.
- Construct a scatterplot using tar for the \(x\) axis and nicotine for the \(y\) axis. Does the scatterplot suggest a linear relationship between the two variables? Are they positively or negatively related?
- Let \(y\) be the amount of nicotine and let \(x\) be the amount of tar. Fit a simple linear regression to the data and identify the sample regression equation.
- What percentage of the variation in nicotine can be explained by the linear correlation between nicotine and tar?
- The Raleigh brand king size cigarette is not included in the table, and it has 23 mg of tar. What is the best predicted amount of nicotine? How does the predicted amount compare to the actual amount of 1.3 mg of nicotine? What is the value of the residual?
- Perform the test \(H_0: \beta_1 = 0\) vs. \(H_1: \beta_1 \ne 0\).
- Provide a 95% confidence interval for \(\beta_1\).
- Generate the ANOVA table for the linear regression.
- (MSSC) We have been doing a variety of hypothesis tests, for example two-sample pooled \(t\) tests and the \(F\)-test for comparing two variances. However, the null hypothesis significance testing (NHST) paradigm and the use of p-values have been heavily criticized, shown to be problematic, and are often misused in data analysis. In fact, in his research, Dr. Yu never does NHST or uses the p-value as taught in MATH 4720. Please read the articles
  Write a one-page summary discussing the problems of the NHST paradigm and the p-value. You are welcome to share what does not make sense to you about hypothesis testing (or confidence intervals), which was proposed about 100 years ago.
- AI Usage Declaration. Using GenAI is permitted for this course. If you choose to use GenAI to assist with your homework, you must include a brief statement documenting your use. Please provide the following information:
  - Why/How I Used AI: Why do you need to use GenAI? Which tool did you use? Describe your prompts or questions. What and how did you ask the AI to help you?
  - Generated Output: Include a screenshot or excerpt (copy and paste) of the AI’s response.
  - How I Used the Output: Did you revise it? Did you use it directly, or compare it with your answers? What decisions did you make based on the output?
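
For the tar and nicotine questions above, the following is a minimal sketch of how the calculations could be set up in plain Python. It is one possible approach and not a required method. The function name `fit_simple_regression` and the constant `T_CRIT_975_DF6` are hypothetical names introduced here; the critical value 2.447 is read from a \(t\) table for 6 degrees of freedom and is hard-coded as an assumption. The scatterplot is omitted because it needs a plotting library.

```python
from math import sqrt

# Data copied from the tar/nicotine table in this assignment.
tar = [25.0, 27.0, 20.0, 24.0, 20.0, 20.0, 21.0, 24.0]
nicotine = [1.5, 1.7, 1.1, 1.6, 1.1, 1.0, 1.2, 1.4]

# Assumption: two-sided 95% critical value of the t distribution with
# n - 2 = 6 degrees of freedom, taken from a t table rather than computed.
T_CRIT_975_DF6 = 2.447


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x plus the usual summary pieces."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)  # total sum of squares

    b1 = sxy / sxx              # slope estimate
    b0 = y_bar - b1 * x_bar     # intercept estimate

    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = sst - sse             # regression sum of squares (1 df)
    mse = sse / (n - 2)         # error mean square (n - 2 df)
    se_b1 = sqrt(mse / sxx)     # standard error of the slope
    return {
        "n": n, "b0": b0, "b1": b1,
        "r": sxy / sqrt(sxx * sst),
        "r_squared": ssr / sst,
        "ssr": ssr, "sse": sse, "sst": sst,
        "mse": mse, "f_stat": ssr / mse,
        "se_b1": se_b1, "t_stat": b1 / se_b1,
    }


fit = fit_simple_regression(tar, nicotine)

# Prediction and residual for a cigarette with 23 mg of tar and an
# observed 1.3 mg of nicotine (residual = observed - predicted).
predicted = fit["b0"] + fit["b1"] * 23.0
residual = 1.3 - predicted

# 95% confidence interval for the slope, using the tabulated critical value.
ci_low = fit["b1"] - T_CRIT_975_DF6 * fit["se_b1"]
ci_high = fit["b1"] + T_CRIT_975_DF6 * fit["se_b1"]

# Reject H0: beta_1 = 0 at the 5% level when |t| exceeds the critical value.
reject_h0 = abs(fit["t_stat"]) > T_CRIT_975_DF6

print(f"fitted line: y = {fit['b0']:.4f} + {fit['b1']:.4f} x")
print(f"r^2 = {fit['r_squared']:.4f}")
print(f"prediction at 23 mg: {predicted:.4f}, residual: {residual:.4f}")
print(f"t = {fit['t_stat']:.3f}, reject H0: {reject_h0}")
print(f"95% CI for slope: ({ci_low:.4f}, {ci_high:.4f})")
print("ANOVA  source      df  SS       MS")
print(f"       regression  1   {fit['ssr']:.5f}  {fit['ssr']:.5f}")
print(f"       error       {fit['n'] - 2}   {fit['sse']:.5f}  {fit['mse']:.5f}")
print(f"       total       {fit['n'] - 1}   {fit['sst']:.5f}")
print(f"       F = {fit['f_stat']:.3f}")
```

In simple linear regression the \(F\) statistic from the ANOVA table equals the square of the slope's \(t\) statistic, which gives a quick consistency check on the output. A statistics package can be used instead to obtain an exact p-value, which this sketch does not compute.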
