Kalman Filter
Kalman Filters (and their cousins, Extended Kalman Filters) are approaches to the fundamental statistical method of least squares optimisation that use the previous analysis to propagate a matrix of system error covariances forward in time. This makes the method computationally expensive unless major simplifying assumptions are made. The gain matrix applied by the filter at a given time depends on the error covariances of the observations and of the model forcing. These error covariances are recalculated at each time step and the Kalman gain updated from them. The Kalman Filter may therefore be considered a statistically optimal method of sequential data assimilation for linear models. Optimal Interpolation can be regarded as an approximation to the Kalman Filter in which the error covariances are assumed constant. In oceanography, the simplifying assumptions most commonly concern the form of the covariance matrix. Examples of efficient schemes based on reduced-rank approximations in 2D shallow water models are given by Canizares et al. (1998) and Heemink et al. (1997).
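As an illustration, the sketch below implements one forecast/analysis cycle of the standard linear Kalman filter in Python/NumPy. The model operator M, observation operator H and the covariances Q and R are hypothetical placeholders, not taken from any of the schemes cited above; propagating the full covariance P is the expensive step that motivates the reduced-rank and steady-state approximations discussed here.

```python
import numpy as np

def kalman_step(x, P, y, M, H, Q, R):
    """One forecast/analysis cycle of the standard (linear) Kalman filter.

    x : state estimate (n,)          P : state error covariance (n, n)
    y : observations (m,)            M : linear model operator (n, n)
    H : observation operator (m, n)
    Q : model error covariance       R : observation error covariance
    """
    # Forecast step: propagate the state and its error covariance in time
    x_f = M @ x
    P_f = M @ P @ M.T + Q

    # Analysis step: the Kalman gain weights the observations against the forecast
    S = H @ P_f @ H.T + R                    # innovation covariance
    K = P_f @ H.T @ np.linalg.inv(S)         # Kalman gain
    x_a = x_f + K @ (y - H @ x_f)            # corrected state
    P_a = (np.eye(len(x)) - K @ H) @ P_f     # corrected covariance
    return x_a, P_a
```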
Steady-state Kalman filters based on Heemink et al. (1997) can significantly improve operational forecasts in the North Sea at little extra cost or delay. With a fixed set of observations at each time step, a fixed Kalman gain proves to be a good approximation, and the time-consuming step of calculating it can be done off-line. The prerequisites are that observations are available upstream of the locations for which the forecasts are needed, and that they arrive quickly enough to be assimilated.
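A minimal sketch of how such a fixed filter might be prepared off-line is given below: the covariance recursion is iterated until the gain converges, after which only the cheap correction step remains in the operational forecast loop. The operators and covariances are again hypothetical placeholders.

```python
import numpy as np

def steady_state_gain(M, H, Q, R, tol=1e-8, max_iter=10000):
    """Iterate the covariance recursion off-line until the Kalman gain converges.
    Valid when the observation network (and hence H) is fixed in time."""
    n = M.shape[0]
    P = np.eye(n)
    K_prev = np.zeros((n, H.shape[0]))
    for _ in range(max_iter):
        P_f = M @ P @ M.T + Q
        K = P_f @ H.T @ np.linalg.inv(H @ P_f @ H.T + R)
        P = (np.eye(n) - K @ H) @ P_f
        if np.max(np.abs(K - K_prev)) < tol:
            break
        K_prev = K
    return K

# On-line, only the inexpensive correction step is left in the forecast loop:
#     x_a = x_f + K @ (y - H @ x_f)
```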
In-situ observations from tide gauges or platforms will in general be the best choice for data assimilation. Satellite altimeter products can in principle be used, but these data are generally too infrequent to be relied upon heavily, and the timely availability of appropriately processed products may also be an issue.
Quality control of the observations is of crucial importance. A few erroneous observations can do more harm to the forecasts than many good observations can ever improve them.
Usually, quality control is easy for the human eye, but if the model and the data assimilation have to run automatically, without human interference, quality control has to be automated as well. Tide gauges are mostly too far apart for spatial correlations to be of much practical value, but as the sea level fluctuates on a time scale of minutes, observations are available at high frequency and correlations in time can be exploited fully. The combination of the absolute values of the water level, the surge and the difference with previous model forecasts, together with their first and second derivatives in time, proves to be a powerful tool for this. All of these quantities should be restricted to ‘sensible’ limits, which can be determined from historical data. In addition, the water level itself should be required to show a minimum variability, in order to identify a hanging tide gauge. Missing data should also be dealt with sensibly: if too many observations are missing in a certain period, the data that are present may have to be rejected because the time derivatives cannot be determined.
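The sketch below illustrates what such automated checks might look like for a single tide gauge record. The threshold values and the function itself are purely illustrative; in practice the limits would be derived from historical data for each station, and analogous checks would be applied to the surge, the model-minus-observation difference and the second derivative.

```python
import numpy as np

def qc_tide_gauge(levels, dt=60.0, max_level=5.0, max_rate=0.02,
                  min_std=0.01, max_gap_fraction=0.3):
    """Flag a window of tide-gauge readings (levels in metres, dt in seconds).
    Returns a boolean mask of observations accepted for assimilation."""
    levels = np.asarray(levels, dtype=float)
    ok = np.isfinite(levels)

    # Too many missing values: reject the whole window, since the
    # time derivatives below cannot be evaluated reliably.
    if 1.0 - ok.mean() > max_gap_fraction:
        return np.zeros_like(levels, dtype=bool)

    good = ok.copy()
    # Range check on the absolute water level
    good &= np.abs(levels) <= max_level

    # First-derivative (rate-of-change) check between consecutive samples
    rate = np.abs(np.diff(np.where(ok, levels, np.nan))) / dt
    jump = np.zeros_like(levels, dtype=bool)
    jump[1:] |= rate > max_rate
    jump[:-1] |= rate > max_rate
    good &= ~jump

    # A 'hanging' gauge shows almost no variability at all
    if np.nanstd(levels[ok]) < min_std:
        good[:] = False
    return good
```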
Variational analysis (3Dvar and 4Dvar)
Variational analysis is a method of data assimilation using the calculus of variations. It was first devised by Sasaki (1958). 3Dvar considers the analysis to be an approximate solution to a minimisation problem. A cost function is constructed from quadratic expressions of the differences between observations and model variables, weighted by their error covariances. Minimisation methods (from optimisation theory) are applied iteratively using an appropriate descent algorithm; methods typically used are Preconditioned Conjugate Gradient or quasi-Newton techniques. An initial guess is used, which is normally the uncorrected model fields. The major difficulty with 3Dvar is specifying the model error covariance matrix, since it is this matrix that determines how assimilated increments spread out through the grid. 4Dvar is a generalisation for observations that are distributed on either side of an analysis step in time. It has the additional advantage that dynamical constraints (based on the governing equations) can be placed on the sequence of model states that result. For more details refer to Talagrand (1981).
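As an illustration, the sketch below minimises the standard 3Dvar cost function J(x) = 1/2 (x - x_b)^T B^-1 (x - x_b) + 1/2 (y - Hx)^T R^-1 (y - Hx) with a quasi-Newton (L-BFGS) descent, starting from the uncorrected background field. The background covariance B, observation covariance R and observation operator H are assumed inputs; no claim is made about any particular operational implementation.

```python
import numpy as np
from scipy.optimize import minimize

def three_dvar(x_b, y, H, B, R):
    """Minimise the 3Dvar cost function starting from the background field x_b."""
    B_inv = np.linalg.inv(B)
    R_inv = np.linalg.inv(R)

    def cost(x):
        # Quadratic penalties on departures from the background and the observations
        db = x - x_b
        do = y - H @ x
        return 0.5 * db @ B_inv @ db + 0.5 * do @ R_inv @ do

    def grad(x):
        # Analytic gradient of the cost function
        return B_inv @ (x - x_b) - H.T @ R_inv @ (y - H @ x)

    # Quasi-Newton descent (L-BFGS), one of the algorithms mentioned above
    res = minimize(cost, x_b, jac=grad, method='L-BFGS-B')
    return res.x
```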
The UK system
The atmospheric input to the surge ensemble in the UK is provided by the Met Office Global and Regional Ensemble Prediction System (MOGREPS), which is described by Bowler et al. (2007a, b). The perturbations are calculated using an Ensemble Transform Kalman Filter (ETKF). Twenty-four alternative fields of sea-level pressure and 10-metre winds are produced at hourly intervals. Surge ensemble runs are triggered by completion of the MOGREPS regional forecasts. The initial condition for all ensemble members is taken from the corresponding time step of the deterministic surge simulation.
The surge model output is post-processed to produce a variety of graphical outputs, based on the requirements of the UK’s operational Storm Tide Forecasting Service team. Some examples are shown below (see Flowerdew et al., 2009, 2010):
Figure 5.3: Postage stamps of residual surge elevation.
Figure 5.3 shows a set of ‘postage stamps’, which are simply maps of the surge residual predicted by each ensemble member. An animation of postage stamps would display all the information contained within the ensemble output, and be useful as a tool to study the evolution of a particular ensemble member (e.g. one that predicts an extreme event). In many cases, however, all the members look very similar, and this plot type can fail to highlight the important differences or support definitive decision making.
Figure 5.4: Mean (contours) and spread (colours) of residual surge elevation.
Figure 5.4 shows a ‘mean and spread’ chart for the same time step of the same forecast. The mean is depicted by the contour lines, and the spread by the colour shading. The colours in this example indicate that the greatest uncertainty (spread) is along the German coast, which also has the largest mean surge prediction. This plot type is good for indicating regions of uncertainty and how they relate to the mean surge prediction.
Figure 5.5: Forecast probability of surge residual exceeding 0.6m.
Figure 5.5 displays the fraction of members predicting a surge residual greater than 0.6m at a given time step. This example indicates the virtual certainty of such a surge along the German coast, risks of between 40 and 60% along the Norfolk coast of the UK, and negligible risk elsewhere. This type of plot allows a quick appreciation of the level of risk of the specified event across different sections of the coast.
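The quantities behind these three plot types are straightforward to compute from the raw ensemble output. A minimal sketch is shown below, assuming the surge residuals for one forecast time step are held in a single array with dimensions (member, y, x).

```python
import numpy as np

def ensemble_products(surge, threshold=0.6):
    """surge: array of shape (n_members, ny, nx) of surge residuals in metres
    at one forecast time step. Returns the fields behind the three plot types."""
    mean = surge.mean(axis=0)                        # contours in Figure 5.4
    spread = surge.std(axis=0, ddof=1)               # colour shading in Figure 5.4
    prob_exceed = (surge > threshold).mean(axis=0)   # fraction of members, Figure 5.5
    return mean, spread, prob_exceed
```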