Recall from the previous notes that a hypothesis test involves two statistical hypotheses: the null hypothesis, H0, and the alternative hypothesis, Ha.
When we conduct a hypothesis test, it leads us to one of two decisions based on the evidence of our sample: we either reject H0 or fail to reject H0. However, a hypothesis test is not foolproof—our decision might be incorrect!
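A hypothesis test can result in one of two decision errors:
Type I error: rejecting H0 when H0 is actually true.
Type II error: failing to reject H0 when H0 is actually false.
Conversely, a hypothesis test can result in one of two correct decisions as well:
Rejecting H0 when H0 is actually false.
Failing to reject H0 when H0 is actually true.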
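The relationships among these correct decisions and decision errors are often easiest to see in a table:
| | H0 True | H0 False |
|---|---|---|
| Reject H0 | Type I error (probability α) | Correct decision (probability 1 – β, the power) |
| Fail to Reject H0 | Correct decision (probability 1 – α) | Type II error (probability β) |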
Most store-bought pregnancy tests produce a binary outcome: they indicate either that a woman is pregnant or that she is not pregnant. Consider the following hypotheses:
H0: a woman taking the test is pregnant
Ha: a woman taking the test is not pregnant
In Aesop’s fable “The Boy Who Cried Wolf,” a shepherd boy repeatedly runs to a nearby village to claim that a wolf is attacking his flock when in fact there is no wolf at all. Suppose the villagers were all statisticians. What type of error would they say the boy was committing when doing this? (Assume H0: no wolf present)
What Influences α and β?
Notice that the probabilities for correct decisions and for decision errors can be defined in terms of the values of α and β. In general, there are three factors that can affect these values:
effect size = true value – hypothesized value under H0
Increasing effect size…
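To make the influence of these factors concrete, here is a minimal sketch for an upper-tailed one-sample z-test. Only effect size is spelled out above; treating sample size and α as the other two factors is an assumption of this sketch, and the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tail(effect_size, sigma, n, alpha):
    """Power of an upper-tailed one-sample z-test (Ha: mu > mu0).

    effect_size is (true mean - hypothesized mean under H0).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)     # rejection cutoff under H0
    shift = effect_size / (sigma / sqrt(n))    # effect size in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)  # P(reject H0 | true mean)


# Hypothetical baseline: effect size 0.5, sigma 2, n = 30, alpha = 0.05.
baseline = power_upper_tail(0.5, 2.0, 30, 0.05)
larger_effect = power_upper_tail(1.0, 2.0, 30, 0.05)   # effect size doubled
larger_sample = power_upper_tail(0.5, 2.0, 120, 0.05)  # sample size quadrupled
larger_alpha = power_upper_tail(0.5, 2.0, 30, 0.10)    # alpha doubled

for label, value in [
    ("baseline", baseline),
    ("larger effect size", larger_effect),
    ("larger sample size", larger_sample),
    ("larger alpha", larger_alpha),
]:
    print(f"{label:<20} power = {value:.3f}, beta = {1 - value:.3f}")
```

In this setup each change raises power and therefore lowers β, while only the last one also raises the Type I error probability.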
Since the Type I error probability (α) and its complement (1 – α) depend only on α, a value set by the researcher, their computations are simple. The Type II error probability (β) and power (1 – β) require a little more work; thus, in these notes, we will focus on the computations for Type II error (β) and power.
Note: To calculate these probabilities, we need to know the true value of the parameter in the population. Since this is often not something that we actually know, most questions involving error and power calculations rely on us assuming a true value for the population parameter.
Recall from above that:
Type II error = β = P(fail to reject H0|H0 false)
Power = 1 – β = P(reject H0|H0 false).
So how can we go about calculating these values?
Recommended Steps for Calculating Power (1 – β)
Step 1: Set up H0 and Ha based on the scenario.
Step 2: Identify the critical value for the rejection region under H0 (you can usually find this based on α, or sometimes this value is given to you directly).
Step 3: Draw the sampling distribution based on H0.
Step 4: Draw the sampling distribution based on the true parameter value.
Step 5: Locate (and draw) the critical value in both the H0 distribution and the true parameter distribution.
Step 6: To compute power (1 – β):
Note that if the question asks for the Type II error probability (β), you can follow the steps above to find power, and then just take 1 – power to obtain β.
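The steps above can be sketched in code for a one-tailed, one-sample z-test on a mean. This is a minimal sketch, not a definitive implementation: the function name, its arguments, and the numbers in the usage lines are all hypothetical, and the two "drawn" sampling distributions are represented as normal distribution objects.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sample_z(mu0, mu_true, sigma, n, alpha, tail):
    """Return (critical value, power, beta) for a one-tailed z-test on a mean.

    tail is "upper" for Ha: mu > mu0 or "lower" for Ha: mu < mu0.
    """
    se = sigma / sqrt(n)
    null_dist = NormalDist(mu0, se)      # Step 3: sampling distribution under H0
    true_dist = NormalDist(mu_true, se)  # Step 4: sampling distribution under the true value

    if tail == "upper":
        critical = null_dist.inv_cdf(1 - alpha)  # Step 2: cutoff with area alpha to its right
        power = 1 - true_dist.cdf(critical)      # Steps 5-6: rejection area under the true curve
    elif tail == "lower":
        critical = null_dist.inv_cdf(alpha)      # Step 2: cutoff with area alpha to its left
        power = true_dist.cdf(critical)          # Steps 5-6: rejection area under the true curve
    else:
        raise ValueError("tail must be 'upper' or 'lower'")

    return critical, power, 1 - power


# Hypothetical scenario: H0: mu = 100 vs Ha: mu > 100, true mean 104,
# sigma = 10, n = 25, alpha = 0.05.
critical, power, beta = power_one_sample_z(100, 104, 10, 25, 0.05, "upper")
print(f"critical value = {critical:.2f}, power = {power:.4f}, beta = {beta:.4f}")
```

The key design point is that the critical value is always found from the H0 distribution, while power is the area of the rejection region measured under the true-parameter distribution.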
Hints and Tips:
Examples will really help with all of this! Let’s do a few.
A drug company that manufactures a sleeping aid drug claims that more than 70% of the people that use their drug report an improvement in their sleep quality (compared to before they were taking the drug). Suppose a competing company wishes to test this claim by sampling 200 individuals who take the sleeping aid drug and asking them whether or not they experience an improvement in their sleep quality while on the drug.
In Minitab: Stat → Power and Sample Size → 1 Proportion… Fill in Sample sizes, Comparison proportions, Power values, and Hypothesized proportion as needed. Click Options… to set the Alternative hypothesis and Significance level.
Employees at multiple levels of a large company claim that they receive, on average, fewer annual paid vacation days than the national average. Suppose the national average number of annual paid vacation days in 2014 is known to be 7.5 days, with a standard deviation of 1.6 days. To assess the validity of the employees’ claims, the company randomly samples 55 employees and finds that the average number of annual vacation days for this sample is 7.22 days.
In Minitab: Graph → Probability Distribution Plot → View Probability
| Distribution tab | Shaded Area tab |
|---|---|
| Distribution: Normal | Define Shaded Area by: X Value |
| Mean: | Select: |
| Std. Deviation: | X value: |
7.41 days in 2018. Find the power of the test outlined above by hand and using Minitab.
In Minitab: Stat → Power and Sample Size → 1-Sample Z… Fill in Sample sizes, Differences, Power values, and Standard Deviation as needed. Click Options… to set the Alternative hypothesis and Significance level.
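As a rough check on the by-hand calculation for the vacation-days example, here is a minimal sketch. It takes H0: μ = 7.5 against Ha: μ < 7.5 with σ = 1.6 and n = 55 from the scenario, and reads 7.41 days as the assumed true mean. The significance level is not stated for this example, so α = 0.05 is an assumption; change it if a different level is intended.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 7.5       # hypothesized mean under H0 (national average)
mu_true = 7.41  # assumed true mean used for the power calculation
sigma = 1.6     # population standard deviation from the scenario
n = 55          # sample size from the scenario
alpha = 0.05    # ASSUMPTION: not given in the example

se = sigma / sqrt(n)

# Lower-tailed test: reject H0 when the sample mean falls below this cutoff.
critical_mean = NormalDist(mu0, se).inv_cdf(alpha)

# Power is the chance of landing in the rejection region when mu = mu_true.
power = NormalDist(mu_true, se).cdf(critical_mean)
beta = 1 - power

print(f"standard error  = {se:.4f}")
print(f"critical mean   = {critical_mean:.3f}")
print(f"power (1 - beta) = {power:.4f}")
print(f"beta             = {beta:.4f}")
```

Under these assumptions the power comes out low (roughly 0.11), which is what one would expect when the true mean sits less than half a standard error below the hypothesized value.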
Consider the following hypotheses about a proportion, p, in a certain population:
H0: p = 0.50
Ha: p ≠ 0.50
Suppose the decision rule for a test of H0 is given as:
“Reject H0 if p̂ < 0.408 or if p̂ > 0.592.”
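A decision rule like this one fixes the Type I error probability once the sample size is known. The sketch below uses the normal approximation to the sampling distribution of p̂; the sample size is not stated with the rule above, so n = 196 is purely an illustrative assumption, as are the function name and the true proportion 0.60 used for the power line.

```python
from math import sqrt
from statistics import NormalDist


def rejection_probability(p, n, lower, upper):
    """P(p-hat < lower or p-hat > upper) when the population proportion is p.

    Uses the normal approximation: p-hat ~ Normal(p, sqrt(p(1-p)/n)).
    """
    sampling_dist = NormalDist(p, sqrt(p * (1 - p) / n))
    return sampling_dist.cdf(lower) + (1 - sampling_dist.cdf(upper))


n = 196  # ASSUMPTION: illustrative sample size, not given with the rule

# Evaluated at the H0 value, the rejection probability is alpha.
alpha = rejection_probability(0.50, n, 0.408, 0.592)

# Evaluated at a hypothetical true proportion, it is the power.
power = rejection_probability(0.60, n, 0.408, 0.592)

print(f"alpha = {alpha:.4f}")
print(f"power at p = 0.60: {power:.4f}, beta = {1 - power:.4f}")
```

The same function gives both quantities because α and power are each the probability of landing in the rejection region; they differ only in which value of p is taken to be true.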
In Minitab: Graph → Probability Distribution Plot → View Probability
| Distribution tab | Shaded Area tab |
|---|---|
| Distribution: Normal | Define Shaded Area by: X Value |
| Mean: | Select: |
| Std. Deviation: | X value: |
Type I Error, Type II Error, and Power –
In Minitab: Stat → Power and Sample Size → 1 Proportion… Fill in Sample sizes, Comparison proportions, Power values, and Hypothesized proportion as needed. Click Options… to set the Alternative hypothesis and Significance level.
The American Heart Association (AHA) recommends that adults aim to get an average of at least 150 minutes of moderate exercise per week to maintain cardiovascular health. An employer at a moderately sized company wonders whether his employees are achieving this minimum. He takes a random sample of 40 employees and finds their average weekly minutes of moderate exercise to be 133 minutes. Suppose it is known that the standard deviation in the population is 57.9 minutes.
In Minitab: Stat → Power and Sample Size → 1-Sample Z… Fill in Sample sizes, Differences, Power values, and Standard Deviation as needed. Click Options… to set the Alternative hypothesis and Significance level. Click Graph… and check Display power curve.
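A power curve like the one Minitab can display may also be tabulated directly. The following is a minimal sketch for the exercise example, taking H0: μ = 150 against Ha: μ < 150 with σ = 57.9 and n = 40 from the scenario. The significance level α = 0.05 and the list of candidate true means are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 150.0   # hypothesized mean under H0 (recommended weekly minutes)
sigma = 57.9  # population standard deviation from the scenario
n = 40        # sample size from the scenario
alpha = 0.05  # ASSUMPTION: not given in the example

se = sigma / sqrt(n)

# Lower-tailed test: reject H0 when the sample mean is below this cutoff.
critical_mean = NormalDist(mu0, se).inv_cdf(alpha)


def power_at(mu_true):
    """Probability of rejecting H0 when the true mean is mu_true."""
    return NormalDist(mu_true, se).cdf(critical_mean)


# Hypothetical true means, from no difference down to a 30-minute shortfall.
true_means = [150, 145, 140, 135, 130, 125, 120]
curve = [(m, power_at(m)) for m in true_means]

print(f"critical mean = {critical_mean:.2f}")
for m, pw in curve:
    print(f"true mean {m:>3}  difference {m - mu0:>6.1f}  power {pw:.3f}")
```

At a difference of zero the power equals α, and it climbs toward 1 as the true mean moves further below 150, which is the shape a power curve for a lower-tailed test should have.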