Take-home Exam 1

Take-home Exam for Weeks 1 to 6

Important

Due Wednesday, Oct 8, 11:59 PM

I recognize the importance of personal integrity in all aspects of life and work. I commit myself to truthfulness, honor and responsibility, by which I earn the respect of others. I support the development of good character and commit myself to uphold the highest standards of academic integrity as an important aspect of personal integrity. My commitment obliges me to conduct myself according to the Marquette University Honor Code.

Exam Problems (50 points)

  1. Normal Approximation to Binomial. Suppose that 60% of Americans have had chickenpox by the time they reach adulthood.
     (a) Suppose we take a random sample of 6 American adults. Let \(X\) be the number of people out of the 6 who had chickenpox during childhood. If \(X \sim binomial(n, \pi)\), determine the values of \(n\) and \(\pi\).

     (b) Using the distribution in (a), compute the probability that at least 3 adults have had chickenpox.

     (c) Using the distribution in (a), what are the values of the mean \(\mu\) and variance \(\sigma^2\) of \(X \sim binomial(n, \pi)\)?

     (d) Now treat \(X\) as a normal random variable, i.e., \(X \sim N(\mu, \sigma^2)\), where \(\mu\) and \(\sigma^2\) are the values obtained in (c). Compute the probability that at least 3 adults have had chickenpox.

     (e) Suppose now we take a random sample of 60 American adults. Use \(binomial(n, \pi)\) and \(N(\mu, \sigma^2)\), with the new parameter values \(n\), \(\mu\), and \(\sigma^2\), to compute the probability that at least 30 adults have had chickenpox.

     (f) Compare the binomial and normal probabilities when the sample sizes are 6 and 60. In which case are the binomial and normal probabilities closer?

     (g) (MSSC) Is there any issue when we use a normal distribution to approximate a binomial distribution? Apply a continuity correction to (e). Do you get a better approximation?
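The workflow in this problem (exact binomial probability, normal approximation, continuity correction) can be sketched in R as follows. The values `n = 20`, `p = 0.3`, and `k = 8` are illustrative assumptions chosen for this sketch, not the exam's parameters, and the variable names are hypothetical.

```r
# Illustrative values only (assumptions, not the exam's parameters)
n <- 20   # number of trials
p <- 0.3  # success probability
k <- 8    # threshold: we want P(X >= k)

mu    <- n * p                   # binomial mean
sigma <- sqrt(n * p * (1 - p))   # binomial standard deviation

# Exact binomial probability: P(X >= k) = P(X > k - 1)
p_binom <- pbinom(k - 1, size = n, prob = p, lower.tail = FALSE)

# Normal approximation without continuity correction: P(Y >= k)
p_norm <- pnorm(k, mean = mu, sd = sigma, lower.tail = FALSE)

# Normal approximation with continuity correction: P(Y >= k - 0.5)
p_norm_cc <- pnorm(k - 0.5, mean = mu, sd = sigma, lower.tail = FALSE)

round(c(binomial = p_binom, normal = p_norm, normal_cc = p_norm_cc), 4)
```

Note that `pbinom()` with `lower.tail = FALSE` returns \(P(X > q)\), which is why the exact calculation passes `k - 1` rather than `k`.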


  2. As part of a study to determine factors that may explain differences in animal species relative to their size, the following body masses (in grams) of 50 different bird species were reported in the paper "Temperature and the Northern Distributions of Wintering Birds," by Richard Repasky (1991). The data are provided in exam1data.csv. First download the data and upload it to RStudio. Then import the data using a command like

data <- read.csv("./exam1data.csv")

where "./exam1data.csv" is the file path and ./ means your current working directory.
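If the import fails, the usual cause is that the file is not in the working directory. A minimal sketch for checking this is shown below; the file name comes from the text above, while the existence check and the `path` variable are additions for illustration.

```r
# Show the directory that "./" refers to
getwd()

# Hypothetical helper variable; the file name is the one given in the exam text
path <- "./exam1data.csv"

if (file.exists(path)) {
  data <- read.csv(path)
  str(data)          # column names and types
  print(head(data))  # first few rows
} else {
  message("File not found in ", getwd(), "; check the upload location or the path.")
}
```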

     (a) Make a histogram of bodymass.

     (b) Construct a boxplot. Label the X-axis or Y-axis (depending on whether the plot is horizontal or vertical) as Body Mass (in grams) and give the plot the title Boxplot of Body Mass (g). Are there any outliers in the data set? If yes, identify them.

     (c) Find the 30th, 60th, and 90th percentiles (or 0.3, 0.6, and 0.9 quantiles) as well as the sample mean and median. Based on (a), (b), and (c), comment on the skewness of the data and on which is the better measure of center.
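The base R functions involved in these steps can be sketched on simulated data. The sample below is generated with `rlnorm()` purely for illustration; it is not the exam data, and the plot titles and labels are placeholders rather than the ones the problem asks for.

```r
set.seed(1)                               # reproducible simulated data (not the exam data)
x <- rlnorm(50, meanlog = 3, sdlog = 1)   # right-skewed sample of size 50

hist(x, main = "Histogram of simulated data", xlab = "x")

# Horizontal boxplot, so the variable label goes on the x-axis
boxplot(x, horizontal = TRUE, main = "Boxplot of simulated data", xlab = "x")

# Points farther than 1.5 times the box length from the hinges,
# which is the default rule boxplot() uses to flag outliers
outliers <- boxplot.stats(x)$out
outliers

# Selected quantiles, then the sample mean and median
quantile(x, probs = c(0.3, 0.6, 0.9))
c(mean = mean(x), median = median(x))
```

With a vertical boxplot (the default), the same label would be passed as `ylab` instead of `xlab`.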

Many statistical methods assume data are normally distributed, and the Quantile-Quantile plot or QQ plot helps us check the normality assumption. If the data are approximately normally distributed, the points on the QQ plot will lie close to a straight line.

     (d) Remove the outliers you identified in (b) and construct a QQ plot for the data using the functions qqnorm() and qqline(). Does the distribution without outliers look approximately normal? Please comment.

     (e) Transform the data with outliers removed by taking the natural logarithm (base \(e = 2.71828...\)). Construct a QQ plot for the transformed data and comment on its normality.
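As a rough illustration of how `qqnorm()` and `qqline()` are used before and after a log transformation, here is a sketch on simulated data. The sample is log-normal by construction (an assumption made for this example), so it does not indicate what the exam data will look like.

```r
set.seed(2)                               # reproducible simulated data (not the exam data)
y <- rlnorm(50, meanlog = 3, sdlog = 1)   # right-skewed by construction

par(mfrow = c(1, 2))   # two plots side by side

qqnorm(y, main = "QQ plot: original scale")
qqline(y)              # reference line through the first and third quartiles

log_y <- log(y)        # log() computes the natural logarithm by default
qqnorm(log_y, main = "QQ plot: log scale")
qqline(log_y)

par(mfrow = c(1, 1))   # reset the plotting layout
```

Points that bend away from the reference line at one or both ends suggest skewness or heavy tails, while points that track the line closely are consistent with approximate normality.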


  3. AI Usage Declaration. Using GenAI is permitted for this course. If you choose to use GenAI to assist with your exam, you must include a brief statement documenting your use. Please provide the following information:
     (a) Why/How I Used AI: Why did you need to use GenAI? Which tool did you use? Describe your prompts or questions. What did you ask the AI to help with, and how?

     (b) Generated Output: Include a screenshot or excerpt (copy and paste) of the AI’s response.

     (c) How I Used the Output: Did you revise it? Did you use it directly, or compare it with your answers? What decisions did you make based on the output?