Hypothesis - A supposition; a proposition or principle which is supposed or taken for granted, in order to draw a conclusion or inference for proof of the point in question; something not proved, but assumed for the purpose of argument. (Webster’s Unabridged Dictionary, 1913)
Null hypothesis – a hypothesis that specifies a parameter and a (null) value for that parameter, abbreviated \(H_0\) (pronounced “H-naught”; a handy way to remember it is “H” – “not”, as in, “there’s not anything interesting here”).
Alternative hypothesis – specifies a range of plausible values for the parameter should we reject the null. Sometimes abbreviated as \(H_A\) or \(H_1\) (pronounced “H” – “A” or “H” – “One”). The alternative hypothesis can be one- or two-sided but this must be determined for each problem separately.
Test Statistic (abbreviated T.S.) - There are many different kinds of test statistics but the ones we will be using rely on the CLT and therefore assume the test statistic follows a Normal (or Student’s t) distribution. Such test statistics typically take this form:
\[T.S. = \frac{\text{<estimate>} - \text{<hypothesized parameter value>}}{SE(\text{<estimate>})} \quad\text{OR}\quad T.S. = \frac{\text{<estimate>} - \text{<hypothesized parameter value>}}{st.dev(\text{<estimate>})} .\]
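As a rough illustration of the formula above, the following sketch computes a test statistic for a single proportion. The data (58 successes in 100 trials), the null value of 0.5, and the helper name `standardized_statistic` are all made up for this example.

```python
import math

def standardized_statistic(estimate, hypothesized_value, standard_error):
    """Distance between the estimate and the null value, in standard-error units."""
    return (estimate - hypothesized_value) / standard_error

# Hypothetical data: 58 successes in 100 trials, testing H0: p = 0.5.
n = 100
p_hat = 58 / n
p0 = 0.5

# Standard deviation of the sample proportion, computed under the null hypothesis.
se_under_null = math.sqrt(p0 * (1 - p0) / n)

ts = standardized_statistic(p_hat, p0, se_under_null)
print(f"T.S. = {ts:.2f}")  # prints T.S. = 1.60
```

Here the estimate sits 1.6 standard errors above the hypothesized value.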
Significance level – the probability that, out of many (typically hypothetical) repeated experiments in which the null hypothesis is true, we make a Type I error. This is really just another way of expressing our confidence level:
\[1 - \text{confidence level} = \text{significance level}.\]
Remember that the confidence level (and hence the significance level) has to do with the idea of repeating a hypothetical experiment again and again. You must decide what your significance level (or confidence level) is before you calculate a CI or perform a hypothesis test. Your statistical conclusions are not valid if you go back and change it afterwards.
Type I error – a false positive, i.e. incorrectly rejecting a null hypothesis that is actually true. The probability of making this error is the significance level.
Type II error – a false negative, i.e. incorrectly failing to reject a null hypothesis that is actually false.
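One way to see the "repeated hypothetical experiments" idea is to simulate it. The sketch below assumes a two-sided z-test for a mean with known \(\sigma\); the sample size, number of experiments, seed, and function name are arbitrary choices made for illustration. Because every simulated data set is generated with the null hypothesis true, each rejection is a Type I error.

```python
import math
import random
from statistics import NormalDist, fmean

def type_one_error_rate(n_experiments=2000, n=30, mu0=0.0, sigma=1.0,
                        alpha=0.05, seed=1):
    """Fraction of simulated experiments that reject H0 when H0 is true."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided cutoff
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(n_experiments):
        # Data are generated with the true mean equal to mu0, so H0 holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1  # a false positive (Type I error)
    return rejections / n_experiments

rate = type_one_error_rate()
print(f"Simulated Type I error rate: {rate:.3f}")
```

The simulated rate should land close to the chosen significance level of 0.05, though it will vary a little with the seed.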
Power – In contrast to the significance level of a hypothesis test, the power of a test is the probability that we correctly reject a false null hypothesis (out of many repeated, hypothetical experiments).
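Power can be computed directly in simple settings. The sketch below assumes a one-sided z-test (\(H_A\): mean greater than the null value) with known \(\sigma\); the numbers plugged in are hypothetical. It also uses the standard fact that the probability of a Type II error equals one minus the power.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Probability of rejecting H0: mu = mu0 in favor of mu > mu0
    when the true mean is mu_true (known sigma, Normal test statistic)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # How far the true mean sits from the null value, in standard-error units.
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    return 1 - std_normal.cdf(z_crit - shift)

# Hypothetical setting: null mean 0, true mean 0.5, sigma 1, n = 25.
power = power_one_sided_z(mu0=0.0, mu_true=0.5, sigma=1.0, n=25)
type_two_probability = 1 - power
print(f"power = {power:.3f}, P(Type II error) = {type_two_probability:.3f}")
```

In this made-up setting the test correctly rejects the false null in roughly 80% of repeated experiments.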
P-value – the probability, given that the null hypothesis is true, of observing a T.S. equal to the one we observed or a larger/smaller/more extreme value (which of these depends on the alternative hypothesis). In other words, a p-value is the probability of observing a statistic value at least as far from the (null) hypothesized value as the one we have actually observed.
A small p-value (relative to your significance level) indicates that the statistic we have observed would be unlikely were the null hypothesis true. That leads us to doubt the null.
A large p-value (relative to your significance level) just tells us that we have insufficient evidence to doubt the null hypothesis. This does NOT prove the null hypothesis to be true.
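To make the "larger/smaller/more extreme" wording concrete, here is a minimal sketch that turns a test statistic into a p-value, assuming the test statistic follows a standard Normal distribution under the null. The function name, the `alternative` labels, and the value 1.6 are illustrative choices, not something fixed by these notes.

```python
from statistics import NormalDist

def p_value(ts, alternative="two-sided"):
    """p-value for a test statistic that is standard Normal under H0."""
    std_normal = NormalDist()
    if alternative == "greater":    # H_A: parameter > hypothesized value
        return 1 - std_normal.cdf(ts)
    if alternative == "less":       # H_A: parameter < hypothesized value
        return std_normal.cdf(ts)
    if alternative == "two-sided":  # H_A: parameter != hypothesized value
        return 2 * (1 - std_normal.cdf(abs(ts)))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

ts = 1.6  # hypothetical observed test statistic
p_two_sided = p_value(ts, "two-sided")
p_greater = p_value(ts, "greater")
p_less = p_value(ts, "less")
print(f"{p_two_sided:.4f} {p_greater:.4f} {p_less:.4f}")
```

With a significance level of 0.05, none of these p-values is small enough to cast doubt on the null, which is not the same as showing the null to be true.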
Effect size – There are many uses of this phrase in applied statistics. As defined in your textbook, however, it is the difference between the null hypothesized value of your parameter and the true value of the parameter.
Statistically significant result – This phrase is reserved for reporting that we observed a p-value that is smaller than our pre-determined significance level in a hypothesis test. An alternative phrase is statistically detectable. It is very important to keep in mind that a statistically significant/detectable result is NOT the same thing as a practically significant result.
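The gap between statistical and practical significance shows up clearly when the sample size grows. The sketch below assumes a two-sided z-test with known \(\sigma = 1\) and a hypothetical observed sample mean only 0.02 away from the null value; all numbers are invented for illustration.

```python
import math
from statistics import NormalDist

def two_sided_p(x_bar, mu0, sigma, n):
    """Two-sided p-value for a z-test of H0: mu = mu0 with known sigma."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

alpha = 0.05
tiny_difference = 0.02  # hypothetical observed distance from the null value

p_small_n = two_sided_p(x_bar=tiny_difference, mu0=0.0, sigma=1.0, n=100)
p_large_n = two_sided_p(x_bar=tiny_difference, mu0=0.0, sigma=1.0, n=1_000_000)

print(f"n = 100:       p = {p_small_n:.4f}, significant: {p_small_n < alpha}")
print(f"n = 1,000,000: p = {p_large_n:.4g}, significant: {p_large_n < alpha}")
```

The same tiny difference is undetectable with 100 observations and overwhelmingly detectable with a million, yet whether a difference of 0.02 matters in practice is a subject-matter question the p-value cannot answer.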
The data were collected without bias, each observation is independent of the others, and we can apply the CLT.
When we have some value for the population parameter in mind and we want to find evidence that this value is incorrect. A hypothesis test can help us make a yes/no decision about the plausibility of the value of a parameter.
(Note: Anywhere you see \({\color{red}<}{\color{red}>}\)’s you should replace the inside with problem-specific words or symbols.)
\[\begin{eqnarray} H_0:{\color{red}<}\text{parameter}{\color{red}>} = {\color{red}<}\text{hypothesized value}{\color{red}>}\quad &\text{ and }&\quad H_A: {\color{red}<}\text{parameter}{\color{red}>} \neq {\color{red}<}\text{hypothesized value}{\color{red}>} \\ &\text{OR}& \\ H_0:{\color{red}<}\text{parameter}{\color{red}>} = {\color{red}<}\text{hypothesized value}{\color{red}>}\quad &\text{ and }&\quad H_A: {\color{red}<}\text{parameter}{\color{red}>} > {\color{red}<}\text{hypothesized value}{\color{red}>} \\ &\text{OR}&\\ H_0:{\color{red}<}\text{parameter}{\color{red}>} = {\color{red}<}\text{hypothesized value}{\color{red}>}\quad &\text{ and }&\quad H_A: {\color{red}<}\text{parameter}{\color{red}>} < {\color{red}<} \text{hypothesized value}{\color{red}>} \end{eqnarray}\]
Calculate the test statistic and find its distribution under the null hypothesis. That is, find \[T.S. \sim W\] where \(W\) is typically
a Normal random variable with mean \(p_0\) and variance \((p_0 (1-p_0))/n\); or
a Normal random variable with mean \(\mu_0\) and variance \(\sigma^2/n\) (if \(\sigma^2\) is known); or
a Student’s-t distributed random variable (with \(n-1\) degrees of freedom).
Calculate the p-value for the test statistic in part 2 using statistical software.
State the full conclusion of your test. Do you fail to reject the null hypothesis or do you have enough evidence to reject the null?
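The steps above can be sketched end to end for one simple case: a one-sided test of a mean with known \(\sigma\), where the standardized test statistic is standard Normal under the null. Everything specific here (the sample mean of 52.5, \(\mu_0 = 50\), \(\sigma = 6\), \(n = 36\), the significance level, and the function name) is a made-up example, and cases that call for a Student's t distribution would need statistical software beyond this sketch.

```python
import math
from statistics import NormalDist

def one_sided_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Test H0: mu = mu0 against H_A: mu > mu0, assuming sigma is known."""
    # Step 2: test statistic; under H0 it is standard Normal.
    se = sigma / math.sqrt(n)
    ts = (x_bar - mu0) / se
    # Step 3: p-value = P(Z >= ts) for the "greater than" alternative.
    p = 1 - NormalDist().cdf(ts)
    # Step 4: compare with the significance level chosen in advance.
    decision = "reject H0" if p < alpha else "fail to reject H0"
    return ts, p, decision

# Step 1 (hypotheses): H0: mu = 50 and H_A: mu > 50, with alpha = 0.05.
ts, p, decision = one_sided_z_test(x_bar=52.5, mu0=50.0, sigma=6.0, n=36)
print(f"T.S. = {ts:.2f}, p-value = {p:.4f}, conclusion: {decision}")
```

A full written conclusion would also restate the result in context, for example that the data provide evidence the mean exceeds 50 at the 0.05 significance level.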