
MATH 4720/MSSC 5720 Introduction to Statistics
Which test that we learned requires \(\sigma_1 = \sigma_2\)?
In some situations, we care about variation!


The sample variance \(S^2 = \frac{\sum_{i=1}^n(X_i - \overline{X})^2}{n-1}\) is our point estimator for the population variance \(\sigma^2\).
The inference for \(\sigma^2\) needs the population to be normal.
❗ The methods can work poorly if normality is violated, even if the sample is large.
The inference for \(\sigma^2\) involves the \(\chi^2\) distribution.
Defined over positive numbers
Parameter: degrees of freedom \(df\)
Right skewed
More symmetric as \(df\) gets larger

\(\chi^2_{\frac{\alpha}{2},\, df}\) has area \(\alpha/2\) to its right.
\(\chi^2_{1-\frac{\alpha}{2},\, df}\) has area \(\alpha/2\) to its left.
In \(N(0, 1)\), \(z_{1-\frac{\alpha}{2}} = -z_{\frac{\alpha}{2}}\). But \(\chi^2_{1-\frac{\alpha}{2},\,df} \ne -\chi^2_{\frac{\alpha}{2},\,df}\) because the \(\chi^2\) distribution is not symmetric.
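A minimal R sketch of how these critical values could be looked up. It assumes the right-tail subscript convention used above and picks \(\alpha = 0.05\) and \(df = 15\) purely for illustration; the variable names are hypothetical.

```r
alpha <- 0.05   # illustrative significance level
df <- 15        # illustrative degrees of freedom

# chi^2_{alpha/2, df}: area alpha/2 to its right, i.e. 1 - alpha/2 to its left
chi_upper <- qchisq(1 - alpha / 2, df = df)
# chi^2_{1 - alpha/2, df}: area alpha/2 to its left
chi_lower <- qchisq(alpha / 2, df = df)
round(c(lower = chi_lower, upper = chi_upper), 2)

# Standard normal critical values mirror each other around 0 ...
z_upper <- qnorm(1 - alpha / 2)
z_lower <- qnorm(alpha / 2)
z_upper + z_lower            # essentially 0

# ... but the chi-square critical values do not: both are positive
chi_upper + chi_lower
```

Because `qchisq()` takes a left-tail probability, the right-tail subscript \(\alpha/2\) becomes `1 - alpha / 2` in the call.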
When a random sample of size \(n\) is from \(\color{red}{N(\mu, \sigma^2)}\), \[ \frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \]
The inference method for \(\sigma^2\) introduced here can work poorly if the normality assumption is violated, even for large samples!
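The sampling distribution above can be checked informally by simulation. The sketch below is an illustration only: the sample size, population mean and variance, number of replications, and seed are all arbitrary assumptions, and `pivot()` is a hypothetical helper.

```r
set.seed(4720)   # arbitrary seed, only for reproducibility
n <- 16          # assumed sample size
sigma2 <- 4      # assumed true population variance
n_sim <- 10000   # assumed number of simulated samples

# (n - 1) S^2 / sigma^2 computed from one sample x
pivot <- function(x, sigma2) (length(x) - 1) * var(x) / sigma2

# Repeatedly sample from a normal population and compute the pivot
piv_normal <- replicate(
  n_sim,
  pivot(rnorm(n, mean = 10, sd = sqrt(sigma2)), sigma2)
)

# A chi^2_{n-1} variable has mean n - 1 and variance 2(n - 1)
rbind(
  simulated = c(mean = mean(piv_normal), var = var(piv_normal)),
  theory    = c(mean = n - 1,            var = 2 * (n - 1))
)
```

If the simulated mean and variance sit close to \(n-1\) and \(2(n-1)\), that is consistent with the \(\chi^2_{n-1}\) result for normal data.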
\((1-\alpha)100\%\) CI for \(\sigma^2\) is \[\color{blue}{\left( \frac{(n-1)S^2}{\chi^2_{\frac{\alpha}{2}, \, n-1}}, \frac{(n-1)S^2}{\chi^2_{1-\frac{\alpha}{2}, \, n-1}} \right)}\]

❗ The CI for \(\sigma^2\) cannot be expressed as \((S^2-m, S^2+m)\) anymore!
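To see why the normality warning matters even for large samples, one could compare how often the interval covers the true variance under a normal population and under a skewed one. This is a rough simulation sketch: the exponential population, \(n = 200\), the replication count, and the helper `covers()` are all assumptions made for illustration.

```r
set.seed(5720)   # arbitrary seed, only for reproducibility
alpha <- 0.05
n <- 200         # a "large" sample, chosen for illustration
n_sim <- 5000    # assumed number of simulated samples

# Does the chi-square interval for sigma^2 contain the true variance?
covers <- function(x, sigma2, alpha) {
  n <- length(x)
  lower <- (n - 1) * var(x) / qchisq(1 - alpha / 2, df = n - 1)
  upper <- (n - 1) * var(x) / qchisq(alpha / 2, df = n - 1)
  lower < sigma2 && sigma2 < upper
}

# Both populations have true variance 1
cover_normal <- mean(replicate(n_sim, covers(rnorm(n), 1, alpha)))
cover_exp <- mean(replicate(n_sim, covers(rexp(n, rate = 1), 1, alpha)))
c(normal = cover_normal, exponential = cover_exp)
```

Under the normal population the coverage should land near the nominal 95%, while under the exponential population it should fall well short, which is the behavior the warning describes.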
Listed below are heights (cm) for a simple random sample of 16 female supermodels:
heights <- c(178, 177, 176, 174, 175, 178, 175, 178,
178, 177, 180, 176, 180, 178, 180, 176)
\(n = 16\), \(s^2 = 3.4\), \(\alpha = 0.05\).
\(\chi^2_{\alpha/2, n-1} = \chi^2_{0.025, 15} = 27.49\)
\(\chi^2_{1-\alpha/2, n-1} = \chi^2_{0.975, 15} = 6.26\)
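Putting these pieces together, the 95% CI for \(\sigma^2\) could be computed in R as sketched below. The names `chi_upper`, `chi_lower`, and `ci_var` are hypothetical; only `heights` comes from the example.

```r
heights <- c(178, 177, 176, 174, 175, 178, 175, 178,
             178, 177, 180, 176, 180, 178, 180, 176)
alpha <- 0.05
n <- length(heights)   # 16
s2 <- var(heights)     # sample variance, 3.4

# Critical values (qchisq() takes left-tail probabilities)
chi_upper <- qchisq(1 - alpha / 2, df = n - 1)  # chi^2_{0.025, 15}
chi_lower <- qchisq(alpha / 2, df = n - 1)      # chi^2_{0.975, 15}

# CI for sigma^2: dividing by the larger critical value gives the lower bound
ci_var <- c(lower = (n - 1) * s2 / chi_upper,
            upper = (n - 1) * s2 / chi_lower)
round(ci_var, 2)

# CI for sigma: take square roots of both bounds
round(sqrt(ci_var), 2)
```

With the values above this gives roughly \((1.86, 8.14)\) for \(\sigma^2\), which is not centered at \(s^2 = 3.4\).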
Use \(\alpha = 0.05\) to test the claim that “supermodels have heights with a standard deviation that is less than \(\sigma = 7.5\) cm for the population of women”.

Step 1: \(H_0: \sigma = \sigma_0\) vs. \(H_1: \sigma < \sigma_0\). Here \(\sigma_0 = 7.5\) cm
Step 2: \(\alpha = 0.05\)
Step 3: Under \(H_0\), \(\chi_{test}^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{(16-1)(3.4)}{7.5^2} = 0.91\), a statistic drawn from \(\chi^2_{n-1}\).
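Because \(\chi_{test}^2 = 0.91\) falls below the left-tail critical value \(\chi^2_{1-\alpha, \, n-1} = \chi^2_{0.95, \, 15} = 7.26\), we reject \(H_0\). There is sufficient evidence that heights of supermodels vary less than heights of women in the general population.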
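The left-tailed test can be sketched in R as follows. The critical-value and p-value computations are one plausible way to carry out the remaining steps, not necessarily the one used in the course; the variable names are hypothetical.

```r
n <- 16
s2 <- 3.4        # sample variance of the heights
sigma0 <- 7.5    # hypothesized SD under H0
alpha <- 0.05

# Test statistic under H0
chi2_test <- (n - 1) * s2 / sigma0^2

# Left-tailed test: reject H0 when the statistic is below the value
# that has area alpha to its left
crit <- qchisq(alpha, df = n - 1)
reject <- chi2_test < crit

# p-value: left-tail probability of the observed statistic
p_value <- pchisq(chi2_test, df = n - 1)

round(c(chi2_test = chi2_test, critical = crit), 2)
reject
p_value
```

The critical-value rule and the p-value rule lead to the same decision here, since the statistic lies far in the left tail.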

In a pooled t-test, we assume
both samples are of large size or drawn from a normal population.
\(\sigma_1 = \sigma_2\)
Use a QQ-plot (and normality tests such as Anderson-Darling or Shapiro-Wilk) to check the normality assumption.
Now we learn how to check the assumption \(\sigma_1 = \sigma_2\).
We use the \(F\) distribution for inference about two population variances.
Two parameters: \(df_1\), \(df_2\)
Right skewed
Defined over positive numbers
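A short R sketch of \(F\) critical values, using the same right-tail subscript convention as for \(\chi^2\). The degrees of freedom \(9\) and \(9\) are chosen only for illustration, and the variable names are hypothetical.

```r
alpha <- 0.05
df1 <- 9   # illustrative numerator degrees of freedom
df2 <- 9   # illustrative denominator degrees of freedom

# F_{alpha/2, df1, df2}: area alpha/2 to its right
f_upper <- qf(1 - alpha / 2, df1, df2)
# F_{1 - alpha/2, df1, df2}: area alpha/2 to its left
f_lower <- qf(alpha / 2, df1, df2)
round(c(lower = f_lower, upper = f_upper), 2)

# Like chi-square, F is not symmetric, but the lower critical value is the
# reciprocal of the upper one with the degrees of freedom swapped
all.equal(f_lower, 1 / qf(1 - alpha / 2, df2, df1))
```

The reciprocal relationship is why many printed \(F\) tables list only right-tail critical values.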

Independent random samples of sizes \(n_1\) and \(n_2\) are drawn from two normal populations, \(N(\mu_1, \sigma_1^2)\) and \(N(\mu_2, \sigma_2^2)\).
The ratio \[\frac{S_1^2/S_2^2}{\sigma_1^2/\sigma_2^2} \sim F_{n_1-1, \, n_2-1}\]
\((1-\alpha)100\%\) CI for \(\sigma_1^2 / \sigma_2^2\) is \[\color{blue}{\left( \frac{s_1^2/s_2^2}{F_{\alpha/2, \, n_1 - 1, \, n_2 - 1}}, \frac{s_1^2/s_2^2}{F_{1-\alpha/2, \, \, n_1 - 1, \, n_2 - 1}} \right)}\]

❗ The CI for \(\sigma_1^2 / \sigma_2^2\) cannot be expressed as \(\left(\frac{s_1^2}{s_2^2}-m, \frac{s_1^2}{s_2^2} + m\right)\) anymore!
Under \(H_0: \sigma_1 = \sigma_2\), \[\small F_{test} = \frac{s_1^2/s_2^2}{\sigma_1^2/\sigma_2^2} = \frac{s_1^2}{s_2^2} \sim F_{n_1-1, \, n_2-1}\]
A study was conducted to see the effectiveness of a weight loss program.
Two groups (Control and Experimental) of 10 subjects were selected.
The two populations are normally distributed and have the same SD.

The data on weight loss were collected at the end of six months.
Assumptions:
\(\sigma_1 = \sigma_2\)
The weight loss for both groups is normally distributed.
\(n_1 = 10\), \(s_1 = 0.5 \, lb\)
\(n_2 = 10\), \(s_2 = 0.7 \, lb\)
Step 1: \(\begin{align} &H_0: \sigma_1 = \sigma_2 \\ &H_1: \sigma_1 \ne \sigma_2 \end{align}\)
Step 2: \(\alpha = 0.05\)
Step 3: \(F_{test} = \frac{s_1^2}{s_2^2} = \frac{0.5^2}{0.7^2} = 0.51\).
Step 4-c: Two-tailed test. The critical values are \(F_{0.05/2, \, 10-1, \, 10-1} = 4.03\) and \(F_{1-0.05/2, \, 10-1, \, 10-1} = 0.25\).

Step 5-c: Is \(F_{test} > 4.03\) or \(F_{test} < 0.25\)? No.
Step 6: The evidence is not sufficient to reject the claim that \(\sigma_1 = \sigma_2\).
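The steps above can be sketched in R from the summary statistics. The p-value line is an addition that the steps do not use: it doubles the smaller tail probability, which is one common convention for a two-sided \(F\) test. The variable names are assumptions.

```r
n1 <- 10; s1 <- 0.5   # group 1 summary statistics
n2 <- 10; s2 <- 0.7   # group 2 summary statistics
alpha <- 0.05

# Step 3: test statistic under H0: sigma_1 = sigma_2
f_test <- s1^2 / s2^2

# Step 4-c: two-tailed critical values
f_small <- qf(alpha / 2, n1 - 1, n2 - 1)       # about 0.25
f_big <- qf(1 - alpha / 2, n1 - 1, n2 - 1)     # about 4.03

# Step 5-c: reject H0 if the statistic falls in either tail
reject <- f_test < f_small || f_test > f_big

# Two-sided p-value: twice the smaller tail area (assumed convention)
p_value <- 2 * min(pf(f_test, n1 - 1, n2 - 1),
                   pf(f_test, n1 - 1, n2 - 1, lower.tail = FALSE))

round(c(f_test = f_test, f_small = f_small, f_big = f_big), 2)
reject
p_value
```

With raw data rather than summary statistics, base R's `var.test()` performs the same kind of \(F\) test directly.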

## lower bound
(s1 ^ 2 / s2 ^ 2) / f_big
[1] 0.127
## upper bound
(s1 ^ 2 / s2 ^ 2) / f_small
[1] 2.05
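The snippet above uses `s1`, `s2`, `f_big`, and `f_small` without showing where they come from. A self-contained sketch follows, assuming they are the sample SDs from the weight loss example and the two \(F\) critical values at \(\alpha = 0.05\); these definitions are inferred from the example, not taken from the original code.

```r
# Assumed definitions (inferred from the weight loss example)
n1 <- 10; s1 <- 0.5
n2 <- 10; s2 <- 0.7
alpha <- 0.05
f_small <- qf(alpha / 2, n1 - 1, n2 - 1)      # F_{1 - alpha/2, 9, 9}
f_big <- qf(1 - alpha / 2, n1 - 1, n2 - 1)    # F_{alpha/2, 9, 9}

# 95% CI for sigma_1^2 / sigma_2^2
ci_ratio <- c(lower = (s1^2 / s2^2) / f_big,
              upper = (s1^2 / s2^2) / f_small)
round(ci_ratio, 3)

# The interval contains 1, in line with not rejecting sigma_1 = sigma_2
ci_ratio[["lower"]] < 1 && 1 < ci_ratio[["upper"]]
```

Dividing by the larger critical value gives the lower bound, which matches the order of the bounds in the CI formula.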