.
Calculating the probabilities can be aided by creating a visual, like the one in Figure 11. For distributions that are normal but not standard normal, the random variable X can be standardized by
Z = (X − μ) / σ,
where μ is the mean and σ is the standard deviation of X.
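As a rough illustration of standardizing a normal random variable, the following Python sketch converts the endpoints of an interval to z-scores and evaluates the standard normal cumulative distribution function with the error function. The parameter values (mean 100, standard deviation 15) and the helper names standardize and normal_cdf are assumptions chosen for the example; they do not come from the text.

```python
import math


def standardize(x: float, mu: float, sigma: float) -> float:
    """Convert a value of X ~ Normal(mu, sigma) to a standard normal z-score."""
    return (x - mu) / sigma


def normal_cdf(z: float) -> float:
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: X ~ Normal(mu=100, sigma=15); find P(85 <= X <= 130).
mu, sigma = 100.0, 15.0
z_low = standardize(85.0, mu, sigma)    # -1.0
z_high = standardize(130.0, mu, sigma)  # 2.0

# The probability of the interval is the difference of the two CDF values.
prob = normal_cdf(z_high) - normal_cdf(z_low)
print(f"P(85 <= X <= 130) = {prob:.4f}")
```

The same probability could be read from a standard normal table by looking up the two z-scores and subtracting.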
But how can one discover the approximate underlying distribution of the data set [2]?
Figure 11. Normal Curve Probabilities [5]
DETERMINING UNDERLYING DISTRIBUTIONS
In order to provide some insight into the underlying distribution of a data set, Karl Pearson suggested a method using the chi-square statistic and hypothesis testing, which tests the suitability of a probabilistic model [3]. The chi-square goodness-of-fit test can determine whether the underlying distribution is what is assumed in the null hypothesis, or whether it is another distribution [2].
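A minimal sketch of the chi-square goodness-of-fit calculation is given here, assuming the usual Pearson statistic: the sum over categories of (observed − expected)² / expected. The die-roll counts are invented for illustration, and the critical value 11.070 is the standard upper 5% point of the chi-square distribution with 5 degrees of freedom; neither is taken from the text.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 120 rolls of a die.
# Null hypothesis: the die is fair, so each face is expected 120 / 6 = 20 times.
observed = [25, 17, 15, 23, 24, 16]
n = sum(observed)
expected = [n / len(observed)] * len(observed)

stat = chi_square_statistic(observed, expected)

# Degrees of freedom = number of categories - 1 (no parameters were estimated).
dof = len(observed) - 1
critical_value = 11.070  # upper 5% point of chi-square with 5 degrees of freedom

reject_null = stat > critical_value
print(f"chi-square = {stat:.3f}, dof = {dof}, reject null: {reject_null}")
```

With these made-up counts the statistic falls well below the critical value, so the assumed distribution would not be rejected at the 5% level.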
HYPOTHESIS TESTING AND SIGNIFICANCE LEVELS
From prior studies, a hypothesis is simply an educated guess regarding the problem statement. With this in mind, we start by creating a null hypothesis, denoted H₀. The null hypothesis is the initial assumption about one parameter in the sample set. The alternative hypothesis, Hₐ, is an assertion that contradicts H₀. Based on a specified significance level, α, one either rejects the null hypothesis or fails to reject the null hypothesis [2].
Once the null and alternative hypotheses are determined, a test statistic is calculated from the data. If the test statistic falls in the critical region, then one rejects the null hypothesis [3]. The critical regions are defined in Tables 2 and 3. Rejecting the null hypothesis does not, by itself, prove that the alternative hypothesis is true.
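The decision rule can be sketched in a few lines of Python. This example assumes a test statistic with a symmetric reference distribution (a z statistic) and uses the standard normal critical values for a 5% significance level: 1.645 for a one-tailed test and 1.960 for a two-tailed test. The function name in_critical_region and the sample statistic 1.80 are hypothetical, and the sketch is not a reproduction of Tables 2 and 3.

```python
def in_critical_region(stat: float, critical: float, tail: str) -> bool:
    """Return True if the test statistic falls in the critical region.

    tail: "upper", "lower", or "two"; critical is a positive critical value
    of a distribution that is symmetric about zero.
    """
    if tail == "upper":
        return stat >= critical
    if tail == "lower":
        return stat <= -critical
    if tail == "two":
        return abs(stat) >= critical
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


def decide(stat: float, critical: float, tail: str) -> str:
    """Translate the critical-region check into the usual wording."""
    if in_critical_region(stat, critical, tail):
        return "reject H0"
    return "fail to reject H0"


# Hypothetical z statistic, evaluated at a 5% significance level.
z = 1.80
upper_decision = decide(z, 1.645, "upper")  # one-tailed critical value
two_decision = decide(z, 1.960, "two")      # two-tailed critical value
print(upper_decision)
print(two_decision)
```

The same statistic leads to rejection in the upper-tailed test but not in the two-tailed test, which is why the form of the alternative hypothesis has to be fixed before the data are examined.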