Define type I & II errors
Define power
Describe the responsible use and reporting of p-values from hypothesis tests
Discuss how these errors are linked to a “reproducibility crisis”
Measure how these errors amplify when performing multiple hypothesis testing in the context of multiple comparisons
Hypotheses: Null (\(H_0\)) and Alternative (\(H_A\)) hypotheses.
Test statistic: function of the data on which the decision is based
p-value: assuming \(H_0\) is true, it is the probability of observing a test statistic value as or more extreme towards \(H_A\) than what we observed
significance level (\(\alpha\)): the probability of wrongly rejecting \(H_0\), i.e., rejecting \(H_0\) when it is true.
Decision\Reality | \(H_0\) is true | \(H_0\) is false |
---|---|---|
Reject \(H_0\) | Type I error | Correct decision |
Do not reject \(H_0\) | Correct decision | Type II error |
We want a test that minimizes both types of errors
Type II error occurs when \(H_0\) is false, but we do not reject it
Note that if \(H_0\) is false, the data comes from the pink curve, not the orange.
A doctor tests a patient for a disease. The null hypothesis is that the patient is healthy.
Type I error: the test shows the patient has the disease when in fact they do not
Type II error: the test shows the patient does not have the disease when in fact they do
power: the probability of correctly rejecting the null hypothesis \(H_0\), when \(H_0\) is false
\(\text{Pr(Reject } H_0 \text{ when } H_0 \text{ is false}) = 1 - \beta\), where \(\beta\) is the probability of a type II error
We want the power to be large
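Both error rates can be estimated by simulation. The sketch below is illustrative only: the sample size, the true means (0 under \(H_0\), 0.8 otherwise), the standard deviation of 1, the number of replications, and the helper name `rejects` are all assumptions chosen for the demonstration, not values from the course material.

```r
set.seed(1)     # arbitrary seed, for reproducibility only
n_rep <- 2000   # number of simulated samples (assumption)
n     <- 16     # sample size (assumption)
alpha <- 0.05   # significance level

# TRUE when a one-sided, one-sample t-test rejects H0: mu = 0
rejects <- function(true_mean) {
  x <- rnorm(n, mean = true_mean, sd = 1)
  t.test(x, mu = 0, alternative = "greater")$p.value < alpha
}

# Proportion of rejections when H0 is true: estimated type I error rate
type1_rate <- mean(replicate(n_rep, rejects(0)))

# Proportion of rejections when H0 is false: estimated power
power_est <- mean(replicate(n_rep, rejects(0.8)))

c(type1_rate = type1_rate, power = power_est, type2_rate = 1 - power_est)
```

The estimated type I error rate should land close to \(\alpha\), while the estimated power depends on how far the true mean is from the null value.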
When to use power:
Source: https://online.stat.psu.edu/stat415/lesson/25/25.2
Suppose the IQ of adults follows a Normal distribution. We take a random sample of \(n = 16\) people. The sample mean is \(\bar{X} = 101\) and the sample standard deviation is \(s = 10\). We set \(\alpha = 0.05\) and want to test: \(H_0:\mu = 100 \; vs. \; H_A: \mu > 100\).
Test statistic: \(T = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{\bar{X} - 100}{10/\sqrt{16}}\)
Critical value: \(\text{Pr}(T \geq t^*) = 0.05\) \(\Rightarrow\) \(t^* =\) `qt(0.05, df = 15, lower.tail = FALSE)` \(= 1.75\)
We reject \(H_0\) when \(T \geq\) 1.75
For what value of \(\bar{X}\) do we reject \(H_0\)?
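The rejection rule on \(T\) can be turned into a rule on \(\bar{X}\) by solving \(T \geq t^*\) for \(\bar{X}\). A minimal sketch in R, using only the numbers from the example (the variable names are illustrative):

```r
# Values from the IQ example
n     <- 16    # sample size
s     <- 10    # sample standard deviation
mu0   <- 100   # mean under H0
alpha <- 0.05  # significance level

# Critical value of the t distribution with n - 1 degrees of freedom
t_crit <- qt(alpha, df = n - 1, lower.tail = FALSE)

# T >= t_crit  <=>  X-bar >= mu0 + t_crit * s / sqrt(n)
xbar_crit <- mu0 + t_crit * s / sqrt(n)

round(c(t_crit = t_crit, xbar_crit = xbar_crit), 2)
```

The threshold comes out at about 104.38. The observed \(\bar{X} = 101\) is below it, so \(H_0\) is not rejected in this sample.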
We want to find the type II error rate if the true mean is 108.
We reject \(H_0\) when \(\bar{X} \geq\) 104.38
\[\begin{align} P(\text{Type II error}) &= P(\bar{X} < 104.38 \text{ when } \mu = 108) \\ &= P(T < \frac{104.38 - 108}{10/\sqrt{16}}) \\ &= P(T < -1.448) \end{align}\]
`pt(-1.448, df = 15)` \(= 0.08\)
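The calculation above can be reproduced in R. This sketch follows the same approach: it treats \(s = 10\) as fixed and uses a central t distribution with 15 degrees of freedom around the assumed true mean. An exact calculation would use a noncentral t distribution, so the result is best read as an approximation. Variable names are illustrative.

```r
n         <- 16
s         <- 10
mu_true   <- 108     # assumed true mean under H_A
xbar_crit <- 104.38  # rejection threshold for the sample mean

# Standardize the threshold around the assumed true mean
t_val <- (xbar_crit - mu_true) / (s / sqrt(n))

# P(Type II error) = P(do not reject H0 when mu = 108)
beta <- pt(t_val, df = n - 1)

round(c(t_val = t_val, beta = beta), 3)
```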
What is the power of the test if the true mean is 108?
A. 0.05
B. 0.45
C. 0.64
D. 0.92
E. 0.95
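Since power is \(1 - \beta\), the type II error rate computed for \(\mu = 108\) gives the power directly:

\[\begin{align} \text{Power} &= 1 - P(\text{Type II error when } \mu = 108) \\ &= 1 - 0.08 \\ &= 0.92 \end{align}\]

This corresponds to option D.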
In reality, we do not know that the true mean is 108.
Instead, we calculate the power for a range of values of \(\mu\) under \(H_A\).
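One way to do this is to evaluate the power over a grid of candidate means. The sketch below reuses the IQ example and the same central-t approximation as the calculation for \(\mu = 108\). The grid of means and the helper name `power_at` are illustrative choices, not part of the original material.

```r
n     <- 16
s     <- 10
mu0   <- 100
alpha <- 0.05

# Rejection threshold for the sample mean (about 104.38)
xbar_crit <- mu0 + qt(alpha, df = n - 1, lower.tail = FALSE) * s / sqrt(n)

# Approximate power if the true mean is mu: P(X-bar >= xbar_crit)
power_at <- function(mu) {
  pt((xbar_crit - mu) / (s / sqrt(n)), df = n - 1, lower.tail = FALSE)
}

mu_grid     <- seq(100, 112, by = 2)  # hypothetical true means
power_curve <- data.frame(mu = mu_grid, power = power_at(mu_grid))
print(power_curve)
```

At \(\mu = 100\) the power equals \(\alpha\), and it rises towards 1 as the true mean moves further above 100.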
Type I error: rejecting \(H_0\) when it is true
Type II error: not rejecting \(H_0\) when it is false
Affected by:
- effect size (i.e., the difference between the null hypothesis and reality)
- sample size
- significance level
- using a left- or right-tailed test instead of a two-tailed test
- the test itself
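Several of these factors can be explored with `power.t.test()` from R's built-in stats package. The sketch below starts from the IQ example (n = 16, difference of 8, standard deviation 10, one-sided test) and changes one factor at a time. The alternative settings and the wrapper `pw` are hypothetical choices for illustration. `power.t.test()` uses a noncentral t distribution, so its baseline value can differ slightly from the hand calculation.

```r
# Wrapper with the IQ example as defaults; sd is fixed at 10
pw <- function(n = 16, delta = 8, sig.level = 0.05,
               alternative = "one.sided") {
  power.t.test(n = n, delta = delta, sd = 10, sig.level = sig.level,
               type = "one.sample", alternative = alternative)$power
}

base         <- pw()                           # baseline settings
small_effect <- pw(delta = 4)                  # smaller effect size
large_n      <- pw(n = 64)                     # larger sample
strict_alpha <- pw(sig.level = 0.01)           # smaller significance level
two_sided    <- pw(alternative = "two.sided")  # two-tailed test

round(c(base = base, small_effect = small_effect, large_n = large_n,
        strict_alpha = strict_alpha, two_sided = two_sided), 3)
```

Power drops with a smaller effect size, a smaller significance level, or a two-tailed test, and increases with a larger sample.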
Power: probability of rejecting \(H_0\) when it is false
We want a test that minimizes type I error rate and has high power
worksheet_08
We are here to help!
© 2024 Rodolfo Lourenzutti, Melissa Lee, Marie Auger-Méthé – Material Licensed under CC By-SA 4.0