U.S. Department of Transportation

Statistical Concepts

  1. Two-way Chi-Squared Tests


The difference between relative frequency and overall frequency raises the need to test whether the two differ statistically. This is where a (two-way) Chi-Squared test[103] can be useful.

The Chi-Squared test compares the observed values of an n by k table to their expected values. The observed values are the observed frequencies of the intersection of two categories (represented by the row and column labels). The expected value for a cell is the marginal percentage for the cell's column applied to its row total.[104] For example, in Table 1, the marginal percentage for the OE column is approximately 14.4% (1,268/8,812). The row total for category A incursions is 132. Thus, the expected value for category A OE incursions is approximately 19 (0.144 x 132). A generalized way to calculate the expected value (see the sketch after the definitions below) is:



$$E_{i,j} = \frac{\left(\sum_{r=1}^{n} O_{r,j}\right)\left(\sum_{c=1}^{k} O_{i,c}\right)}{N}$$

where:


$E_{i,j}$ = expected value for cell $i, j$
$O_{i,j}$ = observed value for cell $i, j$
$N$ = total number of observations
$n$ = number of rows
$k$ = number of columns
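As a concrete illustration of this formula, the following is a minimal sketch in Python using NumPy. The table of counts is purely hypothetical (Table 1 is not reproduced here):

```python
import numpy as np

# Hypothetical 4 x 3 table of observed counts
# (rows = categories, columns = types); values are illustrative only.
observed = np.array([
    [ 19,  60,  53],
    [150, 400, 350],
    [500, 980, 760],
    [610, 890, 700],
])

N = observed.sum()                  # total observations
row_totals = observed.sum(axis=1)   # sum across columns for each row i
col_totals = observed.sum(axis=0)   # sum across rows for each column j

# E[i, j] = (row total of i) * (column total of j) / N
expected = np.outer(row_totals, col_totals) / N
print(expected.round(1))
```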

Constructing the expected values in this way yields a test of independence between the rows and columns; that is, it tests for an association between them. The test statistic is calculated by squaring the difference between the observed and expected values for each cell, dividing by the expected value, and summing over all cells, shown formulaically as:



$$\chi^2 = \sum_{i=1}^{n} \sum_{j=1}^{k} \frac{\left(O_{i,j} - E_{i,j}\right)^2}{E_{i,j}}$$

This test statistic is distributed Chi-Squared with degrees of freedom (n – 1)*(k – 1). In Table 1, this results in 6 degrees of freedom. Similar tests will be applied in the following sections regarding other combinations of variables.
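For the full test, SciPy's chi2_contingency computes the expected table, the test statistic, the p-value, and the degrees of freedom in one call. A minimal, self-contained sketch, again with hypothetical counts:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 4 x 3 table (values are illustrative only)
observed = np.array([
    [ 19,  60,  53],
    [150, 400, 350],
    [500, 980, 760],
    [610, 890, 700],
])

chi2, p_value, dof, expected = chi2_contingency(observed)
# For this 4 x 3 table, dof = (4 - 1) * (3 - 1) = 6
print(chi2, p_value, dof)
```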


  2. Box and Whisker Plots


The box and whisker plot concisely presents the percentiles of a distribution along with its outliers. The core of the plot is the box, which represents the middle 50% of the distribution: the lower bound of the box marks the 25th percentile, the middle line marks the 50th percentile (the median), and the top of the box marks the 75th percentile. The second component is the whiskers, which represent a “reasonable” range of the data. Specifically, the whiskers encompass the data within 1.5 times the interquartile range (IQR) of the 25th and 75th percentiles. Data outside the whiskers are represented by dots and are considered outliers. An annotated example follows.

[Figure 66 - Annotated Box and Whisker Plot: the top, middle line, and bottom of the box mark the 75th percentile, 50th percentile (median), and 25th percentile; whiskers extend above and below the box to values within 1.5 times the interquartile range (IQR); a single point beyond the upper whisker is labeled an outlier.]
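A plot in this style can be produced with matplotlib, whose boxplot uses the same 1.5 x IQR whisker convention by default. A minimal sketch with simulated (hypothetical) data:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.lognormal(mean=0.0, sigma=0.5, size=200)  # skewed sample

# whis=1.5 places each whisker at the most extreme data point within
# 1.5 * IQR of the box; points beyond the whiskers are drawn as dots.
plt.boxplot(data, whis=1.5)
plt.ylabel("value")
plt.show()
```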


  3. Kruskal-Wallis Tests


The Kruskal-Wallis test is an extension of the Mann-Whitney (or Wilcoxon) rank-sum test to two or more categories. The procedure replaces each observation with its rank in the overall dataset and then calculates the mean rank for each category. It jointly tests whether the categories have statistically different mean ranks (i.e., whether the ranks are distributed non-randomly among the categories). In other words, a significant test statistic indicates that the categories have different distributions of the continuous variable. This test is particularly useful for small samples, as it requires no asymptotic distributional assumptions. Because the test examines ranks rather than observed values, the exact distribution of the test statistic can be calculated. However, for data with several groups and a moderate number of observations in each group, the distribution is well approximated by the Chi-Squared distribution.[105] More information on the calculations underlying the Kruskal-Wallis rank test can be found in Siegel & Castellan (1988).
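As an illustration, SciPy implements the test (relying on the Chi-Squared approximation noted above). A minimal sketch with hypothetical groups:

```python
from scipy.stats import kruskal

# Three hypothetical groups of a continuous measure (illustrative values)
group_a = [1.2, 3.4, 2.2, 4.1, 2.8]
group_b = [2.9, 5.1, 4.6, 6.0, 3.8]
group_c = [1.0, 1.9, 2.5, 1.4, 2.0]

h_stat, p_value = kruskal(group_a, group_b, group_c)
# Under the null, H is approximately Chi-Squared distributed with
# (number of groups - 1) = 2 degrees of freedom.
print(h_stat, p_value)
```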

If the Kruskal-Wallis test indicates that the groups jointly differ, it may be of interest to determine which groups in fact differ. The mean ranks can be compared in a pairwise fashion to determine this. However, doing so introduces a significant statistical problem: multiple comparisons.

For example, if there are four groups to compare, there are 6 pairwise comparisons in total (the number of ways to choose 2 groups from 4). Suppose further that the standard significance level of 5% is assumed (i.e., the null hypothesis is rejected incorrectly 5% of the time). Lastly, for this example, suppose that none of the groups actually differ (i.e., the null hypothesis is true for all comparisons). Thus:

$$P(\text{at least one rejection}) = 1 - P(\text{no rejections}) = 1 - (1 - \alpha)^{k} = 1 - (0.95)^{6} \approx 0.265$$

Thus, for six comparisons, the likelihood of rejecting at least one null hypothesis when all are known to be true is greater than 25%. Put simply, even if all 4 groups are the same, there is a greater than 25% probability of falsely identifying at least one difference as statistically significant. Therefore, a correction to the statistical significance criterion is required to compare the groups pairwise without falsely identifying groups as different.
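The arithmetic is easy to verify directly:

```python
alpha, k = 0.05, 6
print(1 - (1 - alpha) ** k)  # 0.2649..., i.e., greater than 25%
```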

A simple correction is to conduct each test at a smaller significance level. The one employed in this analysis (referred to as the Bonferroni method) uses a pairwise significance level of α/k, where α is the desired significance level for the overall set of tests and k is the number of tests. This ensures that the false rejection rate among all of the tests combined is no greater than the desired overall rate. Thus, in the above example, a pairwise significance level of 0.0083 (0.05 / 6) ensures that the overall false rejection rate is less than or equal to 0.05.[106]
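A minimal sketch of the Bonferroni-corrected pairwise comparisons follows. The group values are hypothetical, and Mann-Whitney rank-sum tests stand in as the pairwise follow-up (the report's exact pairwise procedure is not reproduced in this section):

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

# Four hypothetical groups (illustrative values only)
groups = {
    "A": [1.2, 3.4, 2.2, 4.1, 2.8],
    "B": [2.9, 5.1, 4.6, 6.0, 3.8],
    "C": [1.0, 1.9, 2.5, 1.4, 2.0],
    "D": [3.3, 2.7, 4.8, 3.9, 4.4],
}

alpha = 0.05
pairs = list(combinations(groups.items(), 2))
alpha_pairwise = alpha / len(pairs)  # Bonferroni: 0.05 / 6 ~ 0.0083

for (name_i, x_i), (name_j, x_j) in pairs:
    stat, p = mannwhitneyu(x_i, x_j)
    flag = "significant" if p < alpha_pairwise else "not significant"
    print(f"{name_i} vs {name_j}: p = {p:.4f} ({flag})")
```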


