NOAA NA96GP0463
ATLANTIC BASIN TROPICAL CYCLONES: RISK ASSESSMENT USING CLIMATE INDICATORS
Principal Investigators
Yochanan Kushnir, Rajagopalan Balaji and Upmanu Lall
Lamont-Doherty Earth Observatory of Columbia University
Palisades, NY
Final Report
(Prepared January 2002)
GOAL:
This project was initiated with the goal of developing a statistical model of the two-dimensional tropical Atlantic hurricane distribution throughout the basin, based on the long record of observations, and of attempting to condition it on existing indicators of large-scale climate variability such as ENSO and the NAO.
Our interest has been in developing a more spatially distributed understanding of Atlantic Basin tropical cyclones, their movement, and their landfall than is embodied in current forecasting methodologies. We sought to analyze existing long records of hurricane activity and to model storm movement within the Basin and across the land-sea boundary, so that realistic simulations of storm-distribution scenarios can be generated to assist in probabilistic assessment of storm impact and eventually allow a more spatially distributed, probabilistic prediction of seasonal hurricane activity, including landfall.
Project Achievements: 
Tropical Storm Distribution and Large-Scale Climate Variability
Throughout the second half of the 20th century, there has been ongoing interest in linking tropical cyclone climate (frequency, intensity, and location) to atmospheric and oceanic climate parameters. This interest has been motivated by the desire to predict, well in advance, the characteristics of an upcoming hurricane season based on more easily determined “external” factors such as sea surface temperature, sea level pressure, and other large-scale measures of the environment in which these storms develop and move. While there is some theoretical understanding of the dependence of hurricanes on large-scale conditions, this understanding is rather qualitative, particularly when it comes to determining the properties of an entire storm season from prior or even concomitant conditions. Thus, many investigators have resorted to statistical, albeit physically based, modeling of the storm climate. In many cases these statistical models have remained rather crude, aiming to determine (and predict) only the number of storms in the entire tropical Atlantic Basin and their intensity.
Our goal in this study was to model the dependence of the two-dimensional storm distribution on climate indicators such as ENSO and NAO indices. We applied a resampling technique to the existing record of Atlantic “named” tropical storms to find regions where significant differences exist between the number of storms in the high, neutral, and low states of the climatic index (the states being defined by dividing the index record into terciles). The results for the Niño3 and NAO indices are shown in Figures 1 and 2 below. In the case of the Niño3 index (Figure 1) the effect on Atlantic hurricanes is linear. A “warm event” in the eastern equatorial Pacific is associated with a reduction in the number of storms in the southeast US, the Gulf of Mexico, and the
Figure 1: Difference between years falling in the upper tercile (high values) of the June-August NINO3 index and years falling in the lower tercile (low values). The contours are the difference in storm days between the categories, converted to number per decade. Colors indicate significance as determined by repeated resampling of the data. Dark brown indicates that the 95% confidence interval of the median number of storms during one phase of the climate indicator is higher than the confidence limit of the comparison phase; dark purple is used when the confidence interval is lower. Yellow indicates that the 50th percentile of one phase of the climate indicator is greater than the 95% confidence interval of the comparison phase, and light purple represents the opposite difference.
Figure 2: Same as in Figure 1 but for the upper (high) and middle (normal) terciles of the NAO index.
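The tercile comparison behind Figures 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the procedure actually used in the project: the function names, the percentile bootstrap of the median, and the number of resamples are all assumptions made for the example.

```python
import random
import statistics

def tercile_groups(index_by_year):
    """Split years into (low, neutral, high) groups by tercile of the index."""
    years = sorted(index_by_year, key=index_by_year.get)
    n = len(years)
    return years[: n // 3], years[n // 3 : n - n // 3], years[n - n // 3 :]

def bootstrap_median_ci(values, n_boot=2000, level=0.95, rng=None):
    """Percentile bootstrap confidence interval for the median (assumed method)."""
    rng = rng or random.Random(0)
    medians = sorted(
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_boot)
    )
    lo = medians[int((1 - level) / 2 * n_boot)]
    hi = medians[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi

def significantly_higher(a, b, **kw):
    """True if the CI of median(a) lies entirely above the CI of median(b)."""
    return bootstrap_median_ci(a, **kw)[0] > bootstrap_median_ci(b, **kw)[1]
```

In use, `a` and `b` would be the per-year storm counts in one grid region for the high and low index years, and the test would be repeated for each region of the map.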

Markov Model
A spatial Markov model was developed and tested for the simulation of hurricane tracks. The model is intended for the assessment of hurricane risk in a sector of the ocean or in potential landfall regions. A spatial grid is imposed on the domain of interest. For each grid box, long-term hurricane risk is assessed through the annual probability distributions of the number of hurricane tracks passing through the box, the associated number of storm hours, and other statistics. Hurricane birth, movement of hurricanes between boxes in the grid, and death are described probabilistically as a Markov process. The relevant marginal and transition probabilities for each box are estimated from historical hurricane track data recorded at 6-hour intervals for the Atlantic sector. An assessment of the performance of the Markov model for hurricane track simulation and potential directions for improvement are discussed.
The data used for the model are historical hurricane track data from 1886 to 1998, with information on storm position (latitude/longitude) every 6 hours. The data were obtained from the National Hurricane Center, and the record contains so-called "named" storms, defined as storms whose maximum sustained (one-minute averaging period) surface winds exceeded 17 m s^{-1} (34 knots, 39 mph). A domain of 5N to 45N and 100W to 25W was chosen and segmented by a 5 by 5 degree grid to give 120 "boxes". During the 118-year record, a total of 971 hurricanes spent all or part of their lifetime in our domain.
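The gridding can be illustrated with a short sketch. The row-major box numbering and the names below are assumptions for illustration only; the report does not say how boxes were indexed.

```python
LAT_MIN, LAT_MAX = 5.0, 45.0        # degrees north
LON_MIN, LON_MAX = -100.0, -25.0    # degrees east (100W to 25W)
CELL = 5.0                          # box size in degrees
N_ROWS = int((LAT_MAX - LAT_MIN) / CELL)   # 8 rows of latitude
N_COLS = int((LON_MAX - LON_MIN) / CELL)   # 15 columns of longitude

def box_index(lat, lon):
    """Return a 0-based box index (hypothetical row-major numbering),
    or None if the position lies outside the domain."""
    if not (LAT_MIN <= lat < LAT_MAX and LON_MIN <= lon < LON_MAX):
        return None
    row = int((lat - LAT_MIN) // CELL)
    col = int((lon - LON_MIN) // CELL)
    return row * N_COLS + col
```

With these bounds, `N_ROWS * N_COLS` gives the 120 boxes quoted above.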
One can use the Markov simulator to generate a specified number of hurricane tracks (e.g., equal to the sample size of hurricane tracks over the period of record, or the expected number of hurricanes over a 100-year period). The location (grid box) of each hurricane birth (track start) is simulated by sampling from the discrete probability distribution (f_{0}(i), i=1…ngs) of birth in box i, given a total of ngs grid boxes under consideration. This distribution is estimated as:
$$f_{0}(i) = \frac{nb_{i}}{tb_{ngs}}, \qquad i = 1, \ldots, ngs \tag{1}$$
where nb_{i} is the number of hurricane starts in box i, and tb_{ngs} is the total number of hurricanes starting in the ngs grid boxes under consideration. Tracks that started outside of the domain were counted as "born" in the box they first appeared in.
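As a minimal sketch of this step, the birth distribution can be estimated by counting starts per box and then sampled directly. The function names and the representation of the data as a list of starting box indices are assumptions for illustration.

```python
import random
from collections import Counter

def birth_distribution(start_boxes):
    """Estimate f0(i) = nb_i / tb_ngs from a list of starting box indices."""
    counts = Counter(start_boxes)      # nb_i: starts observed in box i
    total = sum(counts.values())       # tb_ngs: total number of starts
    return {box: n / total for box, n in counts.items()}

def sample_birth_box(f0, rng):
    """Draw one starting box from the discrete distribution f0."""
    boxes = list(f0)
    return rng.choices(boxes, weights=[f0[b] for b in boxes], k=1)[0]
```

Boxes with no recorded births simply do not appear in `f0` and are never drawn.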
Using the time step of 6 hours, we recorded the progression of the 971 historical tracks through the boxes, calculating the probability of storm movement from each box to the neighboring ones. The possible states allowed were movement to the surrounding boxes (8 directions: N, NE, E, SE, S, SW, W, and NW), loitering (staying in the current box), and dying (disappearing). The probabilities for these states were computed for each box from the historical data. Subsequent movement of the simulated hurricane in 6-hour time steps is simulated by considering a 1-step Markov transition from the current box to one of the 10 possible states. The 1-step transition probability matrix is estimated from the historical data as:
$$p_{ij} = \frac{nt_{ij}}{tt_{i}}, \qquad i = 1, \ldots, ngt, \quad j \in J_{i} \tag{2}$$
where p_{ij} is the estimated probability of a transition from box i to state j, ngt is the total number of boxes used in the Atlantic (of which ngs, the number of grid boxes considered for starts, is a subset), j is an index for the 10 states to which a transition is possible, nt_{ij} is the number of transitions from box i to state j in the historical record, tt_{i} is the total number of hurricane observations or time steps spent in box i by all historical hurricanes, and J_{i} is an index set that identifies the 10 allowable transition states as hurricane death and the indices of the 9 neighbor grid boxes of i (including i itself).
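The estimation and one-step simulation can be sketched as below. This is an illustrative simplification under stated assumptions: each historical track is represented as a list of box indices at 6-hour steps, a transition state is identified by the next box index (or a `DEATH` marker), and the code does not check that consecutive boxes are neighbors. All names are hypothetical.

```python
import random
from collections import Counter, defaultdict

DEATH = "death"

def transition_probabilities(tracks):
    """Estimate nt_ij / tt_i for every box i seen in the tracks."""
    counts = defaultdict(Counter)                  # counts[i][j] = nt_ij
    for track in tracks:
        # Every 6-hour observation is followed by a next box or by death.
        for here, nxt in zip(track, list(track[1:]) + [DEATH]):
            counts[here][nxt] += 1
    probs = {}
    for box, c in counts.items():
        total = sum(c.values())                    # tt_i
        probs[box] = {state: n / total for state, n in c.items()}
    return probs

def simulate_track(start_box, probs, rng, max_steps=1000):
    """Follow 1-step Markov transitions from start_box until death."""
    track = [start_box]
    while len(track) < max_steps:
        row = probs[track[-1]]
        states = list(row)
        nxt = rng.choices(states, weights=[row[s] for s in states], k=1)[0]
        if nxt == DEATH:
            break
        track.append(nxt)
    return track
```

The `max_steps` cap is a safety limit added for the example and is not part of the model description.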
All statistics were computed using 1000 simulations of 971 tracks (i.e., 1000 simulations of the entire historical record). To reveal the spatial structure of the simulated tracks (upper panel) we counted the number of tracks in each box. The results are expressed in percentiles as compared to the historical record. The observed value in a box is placed among the 1000 simulated ones to see where it lies. If it is found in the lowest five percent we call that the lower fifth percentile; if it is found in the lowest quarter, the lower quartile; and so on for the upper fifth percentile and upper quartile. The color blue represents the lower fifth percentile and green the lower quartile. No boxes reached the upper quartile (yellow) or the upper fifth percentile (red). The spatial and temporal structure of the simulated tracks (lower panel) shows the percentiles of the number of 6-hour time steps in each box. This produced better results than the tracks per box because the model is based on the 6-hour step. However, for both plots, data-sparse regions over land and north of 35N fared poorly, vastly overestimating landfalls. Simulated storm starting location percentiles (not shown) for all the boxes fell between the lower and upper quartiles.
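The percentile classification described above can be sketched as follows. The handling of ties (counting only simulated values strictly below the observed one) and the label strings are assumptions for illustration.

```python
def percentile_class(observed, simulated):
    """Classify where an observed box count falls among the simulated counts."""
    # Fraction of simulations with a count strictly below the observed value.
    rank = sum(1 for s in simulated if s < observed) / len(simulated)
    if rank < 0.05:
        return "lower 5%"          # blue
    if rank < 0.25:
        return "lower quartile"    # green
    if rank > 0.95:
        return "upper 5%"          # red
    if rank > 0.75:
        return "upper quartile"    # yellow
    return "interquartile"
```

An observed value in a lower class means that most simulations produced more tracks (or time steps) in that box than were observed, which is the overestimation noted for data-sparse regions.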
Subsequent discussion is focused around the following questions:

Given the finite historical record, can the birth and transition probabilities be reliably estimated, and how does the sample size impact the simulated hurricane risk?

Do the estimated birth and transition probabilities reveal interesting tendencies for the Atlantic hurricanes in terms of clustering of birth, and movement direction?

Given the 5x5 degree spatial grid resolution and the 6-hour time step, is a lag-1 Markov model appropriate for describing hurricane movement?
Given 120 boxes in the Atlantic and 971 hurricanes, the data available to estimate the high-dimensional multinomial probability distribution for birth are rather limited. However, many of these boxes record no births, and births are clustered primarily in 4 regions covering approximately 6 boxes, each having as many as 52 births. The sample size available for estimating the 10 1-step transition probabilities is also highly variable, ranging from 0 to 825 observations for different boxes. Once again, the observations are heavily clustered along the curve of the parabolic sweep. A region may be sparsely sampled either because few tracks pass through it, or because hurricanes move fast in that region, leading to a small number of 6-hour observations. Consequently, the reliability of the simulator will be highly spatially variable if raw estimates of the birth and transition probabilities are used as indicated earlier. Spatial averaging of the raw probabilities may offer improvements. However, the averaging span and orientation of such a scheme will need to vary over the domain to be effective. Designing a scheme to provide consistent, unbiased, and minimum-variance estimates of the probabilities is a topic of future research. Here, only two modifications were employed. The first was to set all estimated probabilities less than 0.01 to zero and to renormalize the probability distribution. Note that this helps reduce the dimension of the probability matrix, but does not address problems with a very small total sample size used to develop the probability of interest. The second adaptation was made to the lifetime distribution. The distribution of the 971 historical tracks showed that no hurricane lasted less than five time steps (30 hours), so the simulation was forced to do the same.
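The first modification can be illustrated with a short sketch for a single box. The function name and the dictionary representation of the transition row are assumptions for illustration.

```python
def prune_and_renormalize(probs, threshold=0.01):
    """Zero out probabilities below the threshold, then rescale to sum to 1."""
    pruned = {s: (p if p >= threshold else 0.0) for s, p in probs.items()}
    total = sum(pruned.values())
    return {s: p / total for s, p in pruned.items()}
```

For example, a row `{"N": 0.5, "E": 0.495, "S": 0.005}` becomes `{"N": 0.5/0.995, "E": 0.495/0.995, "S": 0.0}`. With at most 10 states in a row, at least one probability is 0.1 or larger, so the total after pruning is never zero.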
Given the difficulties in estimating lag-1 probabilities, higher-order Markov models were not considered. However, we did consider changing the time increment for sampling the tracks to 12 hours or longer. This reduces the serial dependence and may be more appropriate for a 1-step model. However, the number of available observations is also reduced, and hurricanes can jump outside a 9-box neighborhood if the longer time step is employed. Consequently, we kept the 6-hour time step and focused instead on identifying potential problems with the simulations using the 1-step model.

Track Segment Model
Our Markov model of hurricane tracks has a realistic starting location generator and realistic movement in data-populated regions, but the method has numerous weaknesses. It had unreliable probabilities and diffusive properties in data-sparse regions. Additionally, the simulated tracks had overly populated tails of the life span distribution, leading to hurricane durations that were too short or too long. Because of these weaknesses we decided not to continue improving the 1-step Markov model, and instead modified our technique. Rather than calculating marginal and transition probabilities for each box, we compute them for each track, allowing a simulated track to jump probabilistically from historical track to historical track based on parameters such as distance, comparative angle, and position on track. This new method has drastically cut down on the diffusive nature of our old method, and improved our lifetime distribution, tracks per box, 6-hour time step plots, and landfall statistics. We have named it the track segment model, as segments of historical storm tracks are pieced together to make simulated tracks.
In addition to using a new model, we are also constraining the historical hurricane track data used. The Markov model used the entire record of historical hurricane track data from 1886 to 1998, with information on storm position (latitude/longitude) every 6 hours. The data were obtained from the National Hurricane Center, and the record contains so-called "named" storms, defined as storms whose maximum sustained (one-minute averaging period) surface winds exceeded 17 m s^{-1} (34 knots, 39 mph). A domain of 5N to 45N and 100W to 25W was chosen and segmented by a 5 by 5 degree grid to give 120 "boxes". During the 118-year record, a total of 971 hurricanes spent all or part of their lifetime in the domain. Since routine aircraft reconnaissance missions into tropical cyclones began in 1944, details on the position of the hurricane eye have been available. This has led to greater accuracy in the 6-hour position data. The distribution of storm duration in the earlier period (1886 to 1943) is substantially different from that in the subsequent period (1944 to 1998). Because of the coarse grid of the domain, this had little effect on the Markov model, but it had a considerable impact on the track segment model results. Therefore, only the run for the later period is shown. During the 55-year record used, 544 hurricanes spent all or part of their lifetime in the domain.
A table was created containing the 544 historical hurricane tracks listed in chronological order. For each track the table holds the position every 6 hours (pts), from storm start (pt=1) to end (pt=f), together with its "neighbors": all pts on other tracks within a 2.5 degree radius, each with a probability of moving to that location based on distance, comparative angle, and position on track. Distance between pts is computed using spherical coordinates. The angle is computed in a forward sense (between the current point and the point that would follow in time), and compared between staying on the same track and jumping to another. The number of 6-hour steps the hurricane has progressed indicates the position on the track. Probabilities for these three parameters are each given by a squared distribution
where D is the maximum allowed value of each parameter. In our simulations D is 2.5°, 60°, and 117 for distance, comparative angle, and position on track, respectively. The three probabilities are then multiplied and renormalized to give one probability. This probability makes it more likely that a simulated storm will jump to a neighbor that is close, has a similar heading, and is near the same point in its life cycle.
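The multiply-and-renormalize step can be sketched as below. The exact form of the squared distribution is not reproduced in this text, so the kernel used here, (1 - x/D)^2 inside the window and zero outside, is purely an assumed stand-in for illustration; the function names are likewise hypothetical. Only the three D values come from the report.

```python
D_DIST, D_ANGLE, D_POS = 2.5, 60.0, 117.0   # maximum allowed values from the text

def kernel(x, d_max):
    """ASSUMED weight: (1 - |x|/D)^2 for |x| < D, zero otherwise."""
    x = abs(x)
    if x >= d_max:
        return 0.0
    return (1.0 - x / d_max) ** 2

def jump_probabilities(candidates):
    """candidates: list of (distance, angle difference, position difference).
    Returns one renormalized probability per candidate."""
    weights = [
        kernel(d, D_DIST) * kernel(a, D_ANGLE) * kernel(p, D_POS)
        for d, a, p in candidates
    ]
    total = sum(weights)
    return [w / total for w in weights] if total > 0 else weights
```

Whatever the true kernel, the product form means a candidate is ruled out as soon as any one of the three differences reaches its maximum allowed value.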
To start a simulated track, a historical track is chosen randomly. The table is searched to find all "neighbors" at their pt=1 state. The overall lifetime (number of 6-hour time steps, denoted F) is chosen randomly from the subset of the neighbors' observed lifetimes. The number of initial steps to take on the starting storm is chosen randomly from 1 to f (where f is the lifetime of the observed storm). If the total number of steps at any time exceeds the chosen lifetime (F), the storm stops (dies). If the total number of steps at the stopping point is less than F, either another track is picked up or the storm continues on the current track, based on the table probabilities. Again, the number of steps to move on the historical track is chosen randomly between the point jumped to and f. This process repeats until the total lifetime selected (F) is fulfilled.
All statistics were computed using 1000 simulations of 544 tracks (i.e., 1000 simulations of the historical record used). To reveal the spatial structure of the simulated tracks (upper panel) we counted the number of tracks in each box. The results are expressed in percentiles as compared to the historical record. The observed value in a box is placed among the 1000 simulated ones to see where it lies. If it is found in the lowest five percent we call that the lower fifth percentile; if it is found in the lowest quarter, the lower quartile; and so on for the upper fifth percentile and upper quartile. The spatial and temporal structure of the simulated tracks (lower panel) shows the percentiles of the number of 6-hour time steps in each box. The color blue represents the lower fifth percentile, green the lower quartile, yellow the upper quartile, and red the upper fifth percentile.
With 120 boxes one could expect as many as 30 boxes in each of the lower and upper quartiles for a normal distribution. This simulation falls far below that with 28 in both the upper and lower quartiles for the tracks per box, and 26 in the 6-hour time steps. However, the upper and lower quartile boxes (colored) are clustered in space and not randomly distributed. They appear most frequently in data-sparse areas such as over land and north of 37.5N.
Conditioning the Track Segment Model on Climate Indices.
NAO (DJFM) and NINO3 (DJF) index values were computed for the years 1944 to 1998. Index values are included in the jump probability in the same way as the other parameters of distance, comparative angle, and position on track. A vector of differences in NAO and NINO3 between the current point and possible jump points was computed and entered into our squared distribution to compute the probability. Unfortunately, results did not show improvement when this new parameter was included.
In addition to including this new probability, simulations of climate states can also be achieved directly by taking a subset of the data. For example, the 54 years of data used in the previous simulation can be divided into terciles by NAO and NINO state, giving high (H), neutral (N), and low (L) years and 9 possible climate states (H NAO H NINO, H NAO N NINO, H NAO L NINO, etc.). Each climate state was then simulated to isolate the effect of the index on hurricane birth, movement, duration, and death. Distributions of model data were compared, yielding results nearly identical to those of our original resampling technique (section 1 of Project Achievements). This adds confirmation to the results found by that technique.
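The assignment of years to the 9 joint climate states can be sketched as follows. The function names and the treatment of tercile boundaries are assumptions for illustration.

```python
def tercile_label(value, values):
    """Return 'L', 'N', or 'H' according to the tercile of value within values."""
    ranked = sorted(values)
    n = len(ranked)
    if value < ranked[n // 3]:
        return "L"
    if value >= ranked[n - n // 3]:
        return "H"
    return "N"

def climate_states(nao, nino):
    """Map each year to a (NAO state, NINO state) pair, e.g. ('H', 'L')."""
    return {
        year: (tercile_label(nao[year], nao.values()),
               tercile_label(nino[year], nino.values()))
        for year in nao
    }
```

The historical tracks from all years sharing a given pair would then form the subset used to simulate that climate state.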
PLANS FOR CONTINUED WORK:
We are currently working on a manuscript that will cover all the results and conclusions of the Markov and track segment model. It is titled "A Markov and track segment model for simulating hurricane risk with Atlantic Ocean applications".
We would like to explore the effect of other parameters on our track segment model. One of interest is a long index of Western Sahel rainfall developed by Landsea. This index is of particular relevance to “Cape Verde” storms, those that develop from easterly waves moving off the west coast of Africa.
PATENTS AND INVENTIONS:
There are no patents or inventions associated with this project. 