Exercise 2

Exercises for Exam 2

Exercise

An independent random sample is selected from an approximately normal population with an unknown standard deviation. Find the p-value for the given sample size and test statistic. Also determine if the null hypothesis of a two-tailed test would be rejected at \(\alpha = 0.05\).
a) \(n = 11\), \(t_{test} = 1.91\)
b) \(n = 17\), \(t_{test} = -3.45\)

a) p-value = 0.08520488 > 0.05, so do not reject \(H_0\).

b) p-value = 0.003293571 < 0.05, so reject \(H_0\).
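The p-values above can be reproduced in R. The following is a minimal sketch, assuming the usual two-tailed one-sample t test with \(n - 1\) degrees of freedom; the helper name `two_tailed_p` is an illustrative choice and not part of the original solution.

```r
# Two-tailed p-value for a t statistic with n - 1 degrees of freedom.
# `two_tailed_p` is a hypothetical helper name used only for illustration.
two_tailed_p <- function(t_test, n) {
  2 * pt(abs(t_test), df = n - 1, lower.tail = FALSE)
}

alpha <- 0.05
p_a <- two_tailed_p(t_test = 1.91, n = 11)
p_b <- two_tailed_p(t_test = -3.45, n = 17)

# Reject H0 when the p-value falls below alpha
c(a = p_a < alpha, b = p_b < alpha)
```

Taking the absolute value of the statistic lets the same helper handle positive and negative test statistics.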

Georgianna claims that in a small city renowned for its music school, the average child takes less than 5 years of piano lessons. We have a random sample of 20 children from the city, with a mean of 4.6 years of piano lessons and a standard deviation of 2.2 years.
a) Evaluate Georgianna’s claim (or that the opposite might be true) using a hypothesis test.
b) Construct a 95% confidence interval for the number of years students in this city take piano lessons, and interpret it in the context of the data.
c) Do your results from the hypothesis test and the confidence interval agree? Explain your reasoning.

## a) One sample t test with alpha 0.05
## H0: mu >= 5; H1: mu < 5
(t_test <- (4.6 - 5) / (2.2/sqrt(20)))

[1] -0.8131156

(t_cri <- qt(p = 0.05, df = 20 - 1)) ## t_test > t_cri, so do not reject H0

[1] -1.729133

## b) 
4.6 + c(-1, 1) * qt(p = 0.975, df = 20 - 1) * (2.2 / sqrt(20))

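[1] 3.570368 5.629632

# We are 95% confident that the average number of years children in this
# city take piano lessons is between about 3.57 and 5.63 years.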

## c)
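# The conclusions agree: the test does not reject H0, and the interval contains 5.
# Strictly, though, the two are not matched: the test is one-sided, so at
# alpha = 0.05 it pairs with a one-sided 95% bound (or a 90% two-sided
# interval), not with the two-sided 95% interval.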
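As a cross-check on parts a) and c), the one-sided p-value and a one-sided 95% upper confidence bound can be computed from the same summary statistics. This is a sketch under the assumptions already used above (one-sample t procedure, \(H_1: \mu < 5\)); the variable names are illustrative.

```r
# Summary statistics from the exercise
n <- 20; x_bar <- 4.6; s <- 2.2; mu0 <- 5
se <- s / sqrt(n)

# One-sided (lower-tail) p-value for H1: mu < 5
t_test <- (x_bar - mu0) / se
p_value <- pt(t_test, df = n - 1)

# One-sided 95% upper confidence bound for mu
upper_bound <- x_bar + qt(p = 0.95, df = n - 1) * se

# H0 is not rejected at alpha = 0.05 exactly when mu0 does not exceed the bound
c(p_value = p_value, upper_bound = upper_bound)
```

The one-sided bound is the interval that corresponds directly to the one-sided test, so the two always lead to the same decision.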

In each of the following scenarios, determine if the data are paired.
a) Compare pre- (beginning of semester) and post-test (end of semester) scores of students.
b) Assess gender-related salary gap by comparing salaries of randomly sampled men and women.
c) Compare artery thicknesses at the beginning of a study and after 2 years of taking Vitamin E for the same group of patients.
d) Assess effectiveness of a diet regimen by comparing the before and after weights of subjects.

a) paired.

b) independent.

c) paired.

d) paired.
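The distinction matters because paired data are analyzed through within-subject differences. The sketch below uses made-up scores (not data from any of the scenarios) to show that a paired t test is the same as a one-sample t test on the differences, and that it differs from an independent-samples analysis.

```r
# Hypothetical pre- and post-test scores for the same six students
pre  <- c(62, 70, 55, 81, 74, 66)
post <- c(68, 75, 54, 88, 79, 70)

# Paired analysis: each post score is matched with the same student's pre score
paired_fit <- t.test(post, pre, paired = TRUE)

# Equivalent one-sample t test on the within-student differences
diff_fit <- t.test(post - pre, mu = 0)

# Treating the samples as independent (Welch) ignores the pairing
indep_fit <- t.test(post, pre)

c(paired = unname(paired_fit$statistic),
  differences = unname(diff_fit$statistic),
  independent = unname(indep_fit$statistic))
```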

Dr. Yu decided to run two slight variations of the same exam. Prior to passing out the exams, he shuffled the exams together to ensure each student received a random version. Summary statistics for how students performed on these two exams are shown below. Anticipating complaints from students who took Version B, he would like to evaluate whether the difference observed in the groups is so large that it provides convincing evidence that Version B was more difficult (on average) than Version A. Test the claim with \(\alpha = 0.01\).

Version	\(n\)	\(\bar{x}\)	\(s\)	min	max
A	30	79.4	14	45	100
B	27	74.1	20	32	100

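n1 <- 30; x1_bar <- 79.4; s1 <- 14  # Version A
n2 <- 27; x2_bar <- 74.1; s2 <- 20  # Version B
## H0: mu_A <= mu_B; H1: mu_A > mu_B (Version B is more difficult)
## Welch-Satterthwaite degrees of freedom, rounded down
A <- s1^2 / n1; B <- s2^2 / n2
df <- (A + B)^2 / (A^2/(n1-1) + B^2/(n2-1))
(df <- floor(df))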

[1] 45

## t_test
(t_test <- (x1_bar - x2_bar) / sqrt(s1^2/n1 + s2^2/n2))

[1] 1.147085

## t_cv
qt(p = 0.01, df = df, lower.tail = FALSE)

[1] 2.412116

## p_value
pt(q = t_test, df = df, lower.tail = FALSE)

[1] 0.1287044

# t_test < t_cv (equivalently, p-value > 0.01), so do not reject H0: the data
# do not provide convincing evidence that Version B is more difficult than Version A
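The same calculation can be wrapped in a small function that works from summary statistics. This is a sketch only: `welch_upper` is a hypothetical helper name, and unlike the solution above it leaves the Welch-Satterthwaite degrees of freedom unrounded (as `t.test()` does), so its p-value differs slightly from the one printed above.

```r
# Welch two-sample t test from summary statistics (upper-tailed: H1: mu1 > mu2).
# `welch_upper` is a hypothetical helper; df is left unrounded here.
welch_upper <- function(x1_bar, s1, n1, x2_bar, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  t_test <- (x1_bar - x2_bar) / sqrt(v1 + v2)
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  list(t_test = t_test, df = df,
       p_value = pt(t_test, df = df, lower.tail = FALSE))
}

res <- welch_upper(x1_bar = 79.4, s1 = 14, n1 = 30,
                   x2_bar = 74.1, s2 = 20, n2 = 27)
res$p_value > 0.01  # TRUE means H0 is not rejected at alpha = 0.01
```

Rounding the degrees of freedom down, as in the solution above, is the slightly more conservative choice; either way the decision at \(\alpha = 0.01\) is the same here.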

Undergraduate students taking an introductory statistics course at Marquette University conducted a survey about GPA and major. The ANOVA output is provided.
(a) Write the hypotheses for testing for a difference between average GPA across majors.
(b) What is the conclusion of the hypothesis test?
(c) How many students answered these questions on the survey, i.e., what is the sample size?

Source	Df	Sum Sq	Mean Sq	F value	Pr(>F)
major	2	0.03	0.015	0.185	0.8313
Residuals	195	15.77	0.081

## (a) 
# H0: mu1 = mu2 = mu3; H1: not all mus are equal

## (b)
# p-value = 0.8313 > 0.05, so do not reject H0. The data do not provide
# convincing evidence of a difference between the average GPAs across the
# three groups of majors.

## (c)
# The total degrees of freedom is 195 + 2 = 197, so the sample size is 197 + 1 = 198.
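The entries of the ANOVA table can be cross-checked against each other. The sketch below recomputes the mean squares, the F statistic, the p-value, and the sample size from the printed degrees of freedom and sums of squares; because the table values are rounded, the results agree with the table only up to rounding. Variable names are illustrative.

```r
# Values read from the ANOVA table (rounded as printed)
df_major <- 2;   ss_major <- 0.03
df_resid <- 195; ss_resid <- 15.77

# Mean squares and F statistic
ms_major <- ss_major / df_major
ms_resid <- ss_resid / df_resid
f_stat <- ms_major / ms_resid

# Upper-tail p-value from the F distribution
p_value <- pf(f_stat, df1 = df_major, df2 = df_resid, lower.tail = FALSE)

# Sample size: total df (groups + residuals) plus one
n_total <- df_major + df_resid + 1

c(f_stat = f_stat, p_value = p_value, n_total = n_total)
```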

Find the test statistic, critical value(s) of \(\chi^2\), and p-value, then determine whether there is sufficient evidence to support the given alternative hypothesis.
(a) \(H_1: \sigma \ne 15\), \(\alpha = 0.05\), \(n = 20\), \(s = 10\).
(b) \(H_1: \sigma > 12\), \(\alpha = 0.01\), \(n = 5\), \(s = 18\).

## (a)
## test statistic
(20 - 1)*10^2/(15^2)

[1] 8.444444

## critical values
qchisq(p = 0.05/2, df = 19, lower.tail = TRUE)

[1] 8.906516

qchisq(p = 0.05/2, df = 19, lower.tail = FALSE)

[1] 32.85233

## p-value (two-sided: the statistic lies in the lower tail, so double the lower-tail area)
2*pchisq((20 - 1)*10^2/(15^2), df = 19)

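[1] 0.03677387

## conclusion
# The test statistic 8.444 is below the lower critical value 8.907
# (p-value < 0.05), so reject H0. There is sufficient evidence to support
# H1: sigma != 15.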

## (b)
## test statistic
(5 - 1)*18^2/(12^2)

[1] 9

## critical value
qchisq(p = 0.01, df = 4, lower.tail = FALSE)

[1] 13.2767

## p-value
pchisq((5 - 1)*18^2/(12^2), df = 4, lower.tail = FALSE)

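[1] 0.06109948

## conclusion
# The test statistic 9 is below the critical value 13.277
# (p-value > 0.01), so do not reject H0. There is not sufficient evidence
# to support H1: sigma > 12.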
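Both parts follow the same recipe, so they can be sketched as one function. This assumes the standard chi-square test for a single standard deviation with statistic \((n - 1)s^2/\sigma_0^2\) and \(n - 1\) degrees of freedom; `chisq_sd_test` is a hypothetical helper name, not a base R function.

```r
# Chi-square test for a standard deviation from summary statistics.
# `chisq_sd_test` is a hypothetical helper name used only for illustration.
chisq_sd_test <- function(n, s, sigma0,
                          alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  df <- n - 1
  stat <- df * s^2 / sigma0^2
  lower <- pchisq(stat, df = df)                      # P(X <= stat)
  upper <- pchisq(stat, df = df, lower.tail = FALSE)  # P(X > stat)
  p_value <- switch(alternative,
                    two.sided = min(1, 2 * min(lower, upper)),
                    greater   = upper,
                    less      = lower)
  list(statistic = stat, df = df, p_value = p_value)
}

res_a <- chisq_sd_test(n = 20, s = 10, sigma0 = 15, alternative = "two.sided")
res_b <- chisq_sd_test(n = 5,  s = 18, sigma0 = 12, alternative = "greater")
c(a = res_a$p_value, b = res_b$p_value)
```

For the two-sided case the helper doubles the smaller tail area, which matches doubling the lower tail in part (a), where the statistic falls below the center of the distribution.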