Statistics as a science is concerned with efficient methods of collecting and interpreting quantitative data so that the value of conclusions can be assessed using the mathematics of probability. Under the scientific method, a hypothesis [a contingent statement based upon preliminary observations] is set up. This hypothesis is tested by comparing it against a collected set of data. Using statistics, a definite probability value can be attached to the conclusion that the hypothesis is rejected or accepted. This probability is called the level of significance [α].
The level of significance is the probability of committing a Type I error, that is, of rejecting a hypothesis that is true. The meaning of this can be seen in the following table.
| | FAIL TO REJECT | REJECT |
| HYPOTHESIS TRUE | Correct | Type I error |
| HYPOTHESIS FALSE | Type II error | Correct |
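The reading of the level of significance as a Type I error rate can be checked by simulation. The sketch below is only an illustration under stated assumptions: it repeatedly samples from a Normal population for which the hypothesis "population mean = 0" is true, applies a two-sided z-test with known standard deviation 1, and counts how often the true hypothesis is rejected. The function name `type_i_error_rate` and its parameters are hypothetical, chosen for this example.

```python
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of TRUE hypotheses rejected by a two-sided z-test."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    rejections = 0
    for _ in range(trials):
        # The hypothesis "population mean = 0" is true for every sample here.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) / (1.0 / n ** 0.5)  # assumed known sigma = 1
        if abs(z) > z_crit:
            rejections += 1  # Type I error: a true hypothesis was rejected
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

With enough trials the observed rate should settle close to the chosen level of significance, which is the sense in which that level is the probability of a Type I error.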
The statistical hypothesis testing procedure is as follows.
Establish a hypothesis [H_{o}] and then an alternate hypothesis [H_{a}].
Choose a value for α, which is the probability of rejecting a hypothesis that is true. This value is usually .05 or .01 for most statistical analyses. An α of .01 means we are 99% confident we would not reject a true hypothesis.
List the assumptions that apply to the hypothesis. These are usually concerned with questions about random sampling.
Draw a sample [collect the data] and calculate the estimator of the parameter that will be used to test the hypothesis. The selection of the statistic used to test the hypothesis is controlled to a large extent by the type of distribution the population shows. It is generally assumed that the population being sampled has a Normal [Gaussian] distribution, and Normal statistics are used. If the distribution is not normal, then α is increased.
Determine from a set of tables the critical region, i.e., based on the probability value selected, the most extreme value that we can have by chance if the hypothesis is true.
Reject or fail to reject the hypothesis.
Draw conclusions.
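The steps above can be sketched in a few lines of code. This is a minimal illustration, not a definitive implementation: it assumes a two-sided one-sample z-test with a known population standard deviation, it computes the critical value from the standard normal distribution instead of reading it from printed tables, and the data, the hypothesized mean `mu0`, and `sigma` are made-up values used only for the example.

```python
from statistics import NormalDist, fmean

def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test, assuming the population sigma is known."""
    n = len(sample)
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    # Critical value: the most extreme |z| expected by chance at level alpha.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

# Hypothetical measurements; the hypothesis is "population mean = 5.0".
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.2, 5.5, 5.4, 5.9]
z, z_crit, reject = z_test(sample, mu0=5.0, sigma=0.5, alpha=0.05)

print(f"z = {z:.2f}, critical value = {z_crit:.2f}")
print("reject the hypothesis" if reject else "fail to reject the hypothesis")
```

The function returns the test statistic and the critical value alongside the decision so that the conclusion can be reported together with the evidence behind it.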
The basic procedure then either rejects a hypothesis or fails to reject a hypothesis. Failure to reject the hypothesis is scientific truth. If a hypothesis is tested over and over again by numerous different scientists and continues not to be rejected, the probability of a Type II error is reduced to a very small value. Under such constraints the hypothesis becomes a scientific Theory. A theory in science is one step below reality; in non-scientific terminology it is a FACT. It is for this reason that the Theory of Evolution is true: it has stood the scientific rigor of continuous testing, and the hypothesis has consistently failed to be rejected. With so many attempts made to reject the Theory, the probability of a Type II error is vanishingly small.
In statistical testing, if the variance of the sampled population is known, then the normal population can be converted to a standard normal population, which gives the well-known bell-shaped curve. This is done by using the Z-statistic to convert the normal data into the standard normal distribution. The details are not important for the present discussion, and a beginning text on Experimental Statistics can be consulted for additional information. An important theorem related to the distribution of sample means, the central limit theorem, is that as the size of the sample increases, the distribution of the means of all possible samples of the same size drawn from the same population becomes more and more like a normal distribution, provided that the population has a finite variance. This theorem is true regardless of the shape of the original parent population. This population of the means is often called the derived population, and its importance is that it allows Normal statistics to be applied even to data that originally have a non-normal distribution. Of course, if the parent population is normal, the distribution of the sample means in the derived population follows exactly a normal distribution.
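The behavior of the derived population can be illustrated with a short simulation. The sketch below assumes a strongly skewed parent population (an exponential distribution, chosen only for illustration) and measures the skewness of the sample means for several sample sizes. A normal distribution has skewness 0, so values shrinking toward 0 indicate that the derived population is becoming more normal. The helper names `skewness` and `sample_means` are hypothetical.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Standardized third moment; 0 for a symmetric, normal-like shape."""
    m = fmean(values)
    s = pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, trials=5000, seed=1):
    """Means of `trials` samples of size n from a skewed parent population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]

# Skewness of the derived population for increasing sample sizes.
skews = {n: skewness(sample_means(n)) for n in (1, 5, 50)}
for n, g in skews.items():
    print(f"sample size {n:>2}: skewness of the sample means = {g:.2f}")
```

The skewness should fall as the sample size grows, which is why Normal statistics can reasonably be applied to means of moderately large samples even when the parent population is far from normal.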