s. A smooth, invertible link function relates the conditional expectation of Y to the set of predictors.
$$G(E(Y)) = \eta = f(X) + \varepsilon = X\beta^{T} + \varepsilon \tag{1}$$
Here G(.) is the link function, X is the set of predictors or independent variables, E(Y) is the expected value of the response variable, and ε is the error. In a linear model, G(.) is the identity function. Depending on the assumed distribution of Y, there exist appropriate link functions (see McCullagh and Nelder 1989). The model parameters, β, are estimated by iteratively reweighted least squares, which maximizes the likelihood function, rather than by ordinary least squares as in linear modeling. Here we fit a linear regression model, i.e., Y is assumed to be normally distributed and the link function is the identity.
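As a minimal sketch of the special case used here (Gaussian response, identity link), the likelihood-maximizing coefficients coincide with the ordinary least squares solution, so a single least-squares solve is enough. The function name `fit_identity_gaussian` and the synthetic data are illustrative assumptions and do not come from the study.

```python
import numpy as np

def fit_identity_gaussian(X, y):
    """Fit E(Y) = b0 + X @ b for a Gaussian response with identity link.

    With these choices the reweighting step uses constant weights, so the
    maximum likelihood estimate equals the ordinary least squares solution.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta  # beta[0] is the intercept, beta[1:] are the slopes

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 0.5 * X_demo[:, 1] + rng.normal(scale=0.1, size=50)
beta_demo = fit_identity_gaussian(X_demo, y_demo)
```

For a non-Gaussian response or a non-identity link, the weights change at each iteration and a dedicated GLM routine would be needed instead.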
Models were fitted with different combinations of predictors, and the Akaike Information Criterion (AIC) was calculated for each. The best model was selected as the one that minimizes the AIC.
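The selection step can be sketched as an exhaustive search over predictor subsets, scoring each least-squares fit with AIC = 2k - 2L, where L is the maximized Gaussian log-likelihood. This is only an illustration under stated assumptions: the text does not say whether the search was exhaustive or stepwise, the function names are hypothetical, and counting the error variance in k is one common convention (it shifts every model's AIC by the same constant, so the ranking is unaffected).

```python
import itertools
import numpy as np

def gaussian_aic(X, y):
    """AIC = 2k - 2L for a least-squares fit with intercept.

    L is the maximized Gaussian log-likelihood; k counts the regression
    coefficients (including the intercept) plus the error variance
    (an assumed convention).
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    sigma2 = rss / n  # maximum likelihood estimate of the error variance
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    k = design.shape[1] + 1
    return 2.0 * k - 2.0 * loglik

def best_subset_by_aic(X, y):
    """Score every non-empty predictor subset and return the AIC minimizer."""
    best_cols, best_aic = None, np.inf
    for size in range(1, X.shape[1] + 1):
        for cols in itertools.combinations(range(X.shape[1]), size):
            aic = gaussian_aic(X[:, list(cols)], y)
            if aic < best_aic:
                best_cols, best_aic = cols, aic
    return best_cols, best_aic

# Synthetic data (hypothetical): only columns 0 and 2 carry signal
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(80, 4))
y_demo = 0.5 + 1.5 * X_demo[:, 0] - 2.0 * X_demo[:, 2] + rng.normal(scale=0.3, size=80)
cols_demo, aic_demo = best_subset_by_aic(X_demo, y_demo)
```

An exhaustive search fits 2^p - 1 models for p candidate predictors, so it is practical only for small predictor sets.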
The AIC is calculated as follows:
$$\mathrm{AIC} = 2k - 2L \tag{2}$$
where L is the logarithm of the likelihood function of the model with the predictor subset under consideration and k is the number of parameters to be estimated in this model, which serves as a penalty. The AIC penalizes models with more predictors, thus favoring parsimony. For the selected best model, two performance metrics were computed: the fitting R2, which measures the variance captured by the model, and the cross-validated R2, in which an observation is dropped, the model is fitted using the rest of the observations, and the dropped point is predicted.
The latter indicates the variance explained by the model in a ‘predictive’ mode. To further assess the predictive capability of the models, we performed leave-10%-out cross-validation, in which 10% of the observations are dropped at random and predicted using the model fitted on the rest of the data. A final measure of model fit, the root mean squared error (RMSE), is computed using the same drop-10% method. This is repeated 1,000 times and the median RMSE is selected, providing a robust assessment of the predictive skill.
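The two cross-validation measures described above can be sketched as follows. This is a hedged illustration rather than the study's code: the function names are hypothetical, the data are synthetic, and the cross-validated R2 is computed here as one minus the ratio of the squared leave-one-out prediction errors to the total sum of squares, which is one common definition and an assumption on our part.

```python
import numpy as np

def _fit_predict(X_train, y_train, X_test):
    """Least-squares fit with intercept on the training rows, then predict."""
    A = np.column_stack([np.ones(len(y_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def loo_cv_r2(X, y):
    """Cross-validated R2: each point is predicted by a model fitted without it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # drop observation i
        preds[i] = _fit_predict(X[keep], y[keep], X[i:i + 1])[0]
    return 1.0 - np.sum((y - preds) ** 2) / np.sum((y - y.mean()) ** 2)

def median_drop10_rmse(X, y, n_repeats=1000, frac=0.10, seed=0):
    """Median RMSE over repeated random drops of a fraction of the observations."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_drop = max(1, int(round(frac * n)))
    rmses = np.empty(n_repeats)
    for r in range(n_repeats):
        perm = rng.permutation(n)
        test, train = perm[:n_drop], perm[n_drop:]
        err = y[test] - _fit_predict(X[train], y[train], X[test])
        rmses[r] = np.sqrt(np.mean(err ** 2))
    return float(np.median(rmses))

# Synthetic data (hypothetical, for illustration only)
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(60, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 1.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=60)
r2_cv_demo = loo_cv_r2(X_demo, y_demo)
rmse_demo = median_drop10_rmse(X_demo, y_demo)
```

Taking the median over the repeats, rather than the mean, keeps a few unlucky random splits from dominating the reported RMSE.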
Table 3: Onset Period GLM model statistics and predictor sets. Predictors from Table 1 above.