
SMU Senior Design Project: Southwest Airlines


On Time Performance

Spring 2010

EMIS 4395-001

Table of Contents


SMU Senior Design Project: Southwest Airlines

Management Summary

Background and Description of the Problem

Technical Description of the Model

Analysis and Managerial Interpretation

Recommendations




Management Summary


Southwest Airlines asked us to analyze its on-time performance on a line-by-line basis. Previously, Southwest had analyzed its data only on a leg-by-leg basis, strictly by departing airport, turn time, and block time. Turn time includes all activities on the ground between arriving at a gate and departing from it, and block time includes all activities between the departure gate and the arrival gate, including time in the air. We analyzed the data provided in order to identify on-time performance issues and to recommend possible improvements.

For this analysis, we collected historical data from Southwest for block time and turn time for the month of January in 2009. After sorting and filtering the data through Excel spreadsheets, we used the AWK programming language to analyze the data. Through AWK, we were able to determine probabilities and statistics of on-time performance at airports around the country on a line by line basis.

Our findings point to a few particular problems for Southwest. For example, five of Southwest's top ten departure airports by volume have late-departure probabilities of over 48%. We also computed statistics on a line-by-line basis, showing which problems recur across many aircraft lines. The probabilities and statistics we obtained include the percentage of recoveries made within aircraft lines and the probability of a late departure at each airport, among other statistics.

Background and Description of the Problem


Our problem, as defined by the Southwest Airlines Optimization Team, was to create a flight schedule for sale that has a specific fleet type associated with it (737-300, 737-500, 737-700) but not a specific aircraft (such as N201WN). Aircraft assignments are made by the dispatchers about a week before operation. The flight schedule indicates a whole day's worth of flying (a 'line') to be assigned to one aircraft. The forecast operational characteristics of a line of flying depend mainly on the time assigned to the flights (block time) and the time on the ground between flights (turn time). We have historical data on the block times for flights (by market, time of day, etc.) and on the turn times between flights (by passengers on and off, market, etc.).
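To make the two quantities concrete, the following Python sketch computes block and turn times for one line of flying. It is only an illustration under stated assumptions: the function name `block_and_turn_minutes`, the helper `at`, and the schedule itself are hypothetical and do not come from the Southwest data.

```python
from datetime import datetime

def block_and_turn_minutes(line):
    """Return (block_times, turn_times) in minutes for one line of flying.

    line: list of (gate_departure, gate_arrival) datetime pairs in flying
    order, all expressed in the same time zone (an assumption).
    """
    # Block time: gate departure to gate arrival of the same leg
    block = [(arr - dep).total_seconds() / 60 for dep, arr in line]
    # Turn time: gate arrival of one leg to gate departure of the next leg
    turn = [(next_dep - arr).total_seconds() / 60
            for (_, arr), (next_dep, _) in zip(line, line[1:])]
    return block, turn

def at(hhmm):
    # Helper for the hypothetical schedule below: a time on 5 January 2009
    return datetime.strptime("2009-01-05 " + hhmm, "%Y-%m-%d %H:%M")

# Hypothetical three-leg line flown by one aircraft
line = [(at("06:00"), at("07:30")),
        (at("07:55"), at("09:05")),
        (at("09:35"), at("10:50"))]
block, turn = block_and_turn_minutes(line)
print(block)  # [90.0, 70.0, 75.0]
print(turn)   # [25.0, 30.0]
```

A line with n legs has n block times and n - 1 turn times, which is why a delay on an early leg can only be absorbed by the turns and blocks that follow it.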

Essentially, the purpose of our study was to analyze data on a line by line basis to discover ways to improve on-time performance. We first had to decide what types of methods we should use to approach this analysis. After sorting and filtering the data through Excel spreadsheets, we used the programming language AWK to find probabilities and statistics regarding departing airports, certain aircraft lines, and individual leg performance within these lines.

Our goals were to answer the following questions:


  • What is the probability of a flight departing on time from a certain airport?

  • What are the top ten worst performing departing airports?

  • What are the top ten best performing departing airports?

  • What are the worst aircraft lines in terms of on-time performance?

  • What are the best aircraft lines in terms of on-time performance?

  • How many legs were late?

  • How many lines were late?

  • If a leg within a line becomes late, what is the probability that it is able to recover?

Analysis of the Situation

We first met with Southwest’s Optimization Team, who described the problem to us. Unlike other airlines, which use a hub-and-spoke model, Southwest uses a leg-by-leg (point-to-point) method. This allows Southwest to keep idle time to a minimum and to operate with a small number of gates, each of which sees an average of ten to twelve flights per day.

Next, we obtained the historical flight data for January 2009 from our contact within the Optimization Team. We then looked through the headers of the files to create our data dictionary. The data dictionary helped us to understand all of the data available and allowed us to determine which probabilities to find.

We approached the problem by sorting and filtering the large volume of data in Excel. Through this process, we organized the data into lines of flying by tail number and flight date, allowing us to view the problem holistically. We then wrote code in the AWK programming language to compute probabilities and statistics from the data. The advantage of AWK was that we could use one large data set to find many probabilities, which we found to be an efficient method of analysis and a helpful way to see the big picture. The first set of statistics classified each pair of consecutive legs within a line into one of four cases: late lines (the preceding flight and the current flight are both late), late legs (the preceding flight is on time but the current one is late), on-time flights (the preceding flight and the current flight are both on time), and late leg recoveries (the preceding flight was late but the current one is on time).

Upon investigation, we found that certain airports had unusually high or low probabilities of late departure. Of Southwest’s top ten departure airports (by volume), five had late-departure probabilities of over 48%, with some as high as 68%. We created a histogram showing that airports with lateness probabilities of 0-36% and 47-68% were outliers relative to the rest, and that the average probability was 42%.

The final step of our analysis was to put all of the results into graphical form for the PowerPoint presentation. This allowed us to look for correlations in the data and to translate our results into useful recommendations.


Technical Description of the Model


Using the AWK programming language, we wrote three programs to aid our analysis of the historical data. Our objective was to determine the probabilities of on-time performance by airport and by aircraft line. The data came from Southwest Airlines and its website. We calculated our expected values from the historical data provided.

The first program found the probabilities of late lines, late legs, on time flights, and late leg recoveries. The variables we used are defined as:



numlines = total number of aircraft lines

olddate = the flight date of the previous leg

oldtail = the tail number of the previous leg

oldarrivaldelay = the previous leg’s arrival delay

currentdate = the flight date of the current leg

currenttail = the tail number of the current leg

arrivaldelay = the current leg’s arrival delay

Source code for Prob1.awk:

# Prob1.awk: classify each pair of consecutive legs within a line.
# Input: tab-separated leg records, sorted so that the legs of one line
# (same flight date and tail number) are adjacent and in flying order.

BEGIN {
    FS = "\t"
    getline                 # read the first record (assumed to be the header row)
    numlines = 0
    olddate = $6
    oldtail = $7
    oldarrivaldelay = $14
}

{
    currentdate = $6
    currenttail = $7
    arrivaldelay = $14

    # Two consecutive records with the same date and tail number are
    # consecutive legs of the same line. A delay of 0 counts as on time.
    if (currentdate == olddate && currenttail == oldtail) {
        if (arrivaldelay > 0 && oldarrivaldelay > 0) ++lateline
        else if (arrivaldelay > 0 && oldarrivaldelay == 0) ++lateleg
        else if (arrivaldelay == 0 && oldarrivaldelay == 0) ++ontime
        else if (arrivaldelay == 0 && oldarrivaldelay > 0) ++caughtup
    }

    olddate = currentdate
    oldtail = currenttail
    oldarrivaldelay = arrivaldelay
}

END {
    total = lateleg + ontime + lateline + caughtup
    print "lateline = " (lateline + 0)
    print "lateleg = " (lateleg + 0)
    print "ontime = " (ontime + 0)
    print "caughtup = " (caughtup + 0)
    print "total = " total

    # Each count is divided by the total number of pairs, so these are
    # joint probabilities of (previous leg, current leg), not conditional ones.
    if (total > 0) {
        print "probability of lateline = P(prev late and late) = " lateline / total
        print "probability of lateleg = P(prev ontime and late) = " lateleg / total
        print "probability of ontime = P(prev ontime and ontime) = " ontime / total
        print "probability of caughtup = P(prev late and ontime) = " caughtup / total
    }
}
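Because Prob1.awk divides each of the four counts by the total number of consecutive leg pairs, its results are joint probabilities. A conditional probability, such as the chance that a leg is on time given that the previous leg was late, follows by dividing by the number of pairs whose first leg was late. The Python sketch below illustrates the difference on made-up data; the function name `pair_stats` and the `(date, tail, arrival_delay)` tuple layout are assumptions for illustration, not part of the original programs.

```python
from collections import Counter

def pair_stats(legs):
    """Classify consecutive legs of the same line (same date and tail number).

    legs: iterable of (date, tail, arrival_delay) tuples, assumed to be
    sorted so that the legs of one line are adjacent and in flying order.
    A delay above zero counts as late, anything else as on time (assumed).
    """
    counts = Counter()
    prev = None
    for date, tail, delay in legs:
        if prev is not None and prev[0] == date and prev[1] == tail:
            # Key is (previous leg late?, current leg late?)
            counts[(prev[2] > 0, delay > 0)] += 1
        prev = (date, tail, delay)
    total = sum(counts.values())
    joint = {k: v / total for k, v in counts.items()} if total else {}
    after_late = counts[(True, True)] + counts[(True, False)]
    # P(on time | previous leg late): the recovery probability
    p_recover = counts[(True, False)] / after_late if after_late else None
    return counts, joint, p_recover

# Hypothetical legs for two aircraft on one day
legs = [
    ("2009-01-05", "N201WN", 0),
    ("2009-01-05", "N201WN", 12),
    ("2009-01-05", "N201WN", 7),
    ("2009-01-05", "N201WN", 0),
    ("2009-01-05", "N202WN", 0),
    ("2009-01-05", "N202WN", 0),
]
counts, joint, p_recover = pair_stats(legs)
print(joint[(True, False)])  # joint share of late-then-on-time pairs: 0.25
print(p_recover)             # conditional recovery probability: 0.5
```

In this toy data one pair in four is a recovery, yet half of the legs that follow a late leg recover, so the two figures answer different questions and should be labeled accordingly.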

Our second program ranked our worst performing airports by the probability that a departing flight would be late. Our variables are defined as:



airport = the name of the departing airport

depdelay = the departure delay for a specific flight

ontime = the number of on time flights for the departing airport

delays = the number of delays at the departing airport

Source Code for Prob2.awk:

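# Prob2.awk: probability of a late departure for each departing airport.

BEGIN {
    FS = "\t"
    getline                 # read the first record (assumed to be the header row)
}

{
    airport = $9
    depdelay = $12
    ++linesread
    if (depdelay > 0) ++delays[airport]
    if (depdelay == 0) ++ontime[airport]
}

END {
    print linesread, "lines read"
    # Output per airport: name, late departures, departures counted,
    # and probability of a late departure. Only airports with at least
    # one late departure appear in the output.
    for (i in delays)
        print i, delays[i], ontime[i] + delays[i], delays[i] / (ontime[i] + delays[i])
}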
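Prob2.awk prints the airports in no particular order, so the ranking into best and worst airports has to be done afterwards. A minimal Python sketch of the counting and ranking together is shown here; the function name `late_departure_ranking` and the sample flights are hypothetical. Unlike the AWK program, this version also lists airports that had no late departures.

```python
from collections import defaultdict

def late_departure_ranking(flights):
    """Return (airport, late, total, probability) rows, worst airport first.

    flights: iterable of (airport, departure_delay) pairs; a delay above
    zero counts as late, anything else as on time (assumed convention).
    """
    late = defaultdict(int)
    total = defaultdict(int)
    for airport, delay in flights:
        total[airport] += 1
        if delay > 0:
            late[airport] += 1
    rows = [(a, late[a], total[a], late[a] / total[a]) for a in total]
    # Highest probability first; ties broken alphabetically by airport
    rows.sort(key=lambda r: (-r[3], r[0]))
    return rows

# Hypothetical departures, not taken from the Southwest data
flights = [("OAK", 0), ("OAK", 15), ("LAX", 5), ("LAX", 9), ("LAX", 0), ("HOU", 0)]
ranking = late_departure_ranking(flights)
worst = ranking[:10]        # ten worst performing departing airports
best = ranking[::-1][:10]   # ten best performing departing airports
for row in ranking:
    print(*row)
```

With only a handful of departures an airport's probability is unreliable, so in practice a minimum number of departures per airport would be a sensible filter before ranking.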

The final program allowed us to sort and view the list of aircraft lines organized by tail number and flight date. It also showed the probability of recoveries by aircraft line, as well as the probability that a specific line will be late. The variables are defined as:



numlines = number of lines flown in the month of January

olddate = the flight date of the previous leg

oldtail = the tail number of the previous leg

oldarrivaldelay = the arrival delay of the previous leg

legsinline = the number of legs in each specific aircraft line

currentdate = the flight date of the current leg

currenttail = the tail number of the current leg

origin = the departure airport of the current leg

dest = the arrival airport of the current leg

arrivaldelay = the arrival delay of the current leg

dayofweek = the day of the week corresponding to the flight date

Source Code for linelist.awk:

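# linelist.awk: summarize each aircraft line (one tail number on one date).
# Input: tab-separated leg records, sorted so that the legs of one line
# are adjacent and in flying order.

# Store the statistics of the line that just ended under its full name.
function flushline() {
    if (linename == "") return          # no line has been accumulated yet
    ++linecount[linename]
    if (numlatelegs > 0) ++linelate[linename]
    legslate[linename] += numlatelegs
    recoveries[linename] += numrecoveries
    linelength[linename] = legsinline
    ++dayofline[linename, linedayofweek]
}

BEGIN {
    FS = "\t"
    getline                             # read the first record (assumed to be the header row)
    numlines = 0
    olddate = "Initial"
    oldtail = "Init"
    oldarrivaldelay = 0
    legsinline = 0
}

{
    currentdate = $6
    currenttail = $7
    origin = $9
    dest = $10
    arrivaldelay = $14
    dayofweek = $5

    if (currentdate == olddate && currenttail == oldtail) {
        # Same aircraft on the same day: extend the current line by one leg.
        linename = linename "-" dest
        ++legsinline
        if (arrivaldelay > 0) ++numlatelegs
        # recovery = on time after a late arrival
        if (arrivaldelay == 0 && oldarrivaldelay > 0) ++numrecoveries
    } else {
        # New aircraft or new day: store the finished line, then start a new one.
        flushline()
        ++numlines
        linename = origin "-" dest
        linedayofweek = dayofweek
        legsinline = 1
        numlatelegs = (arrivaldelay > 0) ? 1 : 0    # the first leg counts too
        numrecoveries = 0
    }

    olddate = currentdate
    oldtail = currenttail
    oldarrivaldelay = arrivaldelay
}

END {
    flushline()                         # the last line in the file has no successor
    print "Line", "#observations", "#lateLines", "legsInLine", "#lateLegs", "#recoveries", "probrecoveries", "problinelate"
    for (i in linecount) {
        precoveries[i] = (legslate[i] > 0) ? recoveries[i] / legslate[i] : 0
        print i, linecount[i], linelate[i] + 0, linelength[i], legslate[i], recoveries[i], precoveries[i], linelate[i] / linecount[i]
    }
}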
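The grouping step in linelist.awk, collecting consecutive legs with the same flight date and tail number into one line, can also be sketched in Python with `itertools.groupby`. This is an illustration only: the function name `line_summary` and the dictionary field names are hypothetical, the sample legs are made up, and a recovery is taken to mean an on-time leg directly after a late one, as the comment in the AWK listing states.

```python
from collections import defaultdict
from itertools import groupby

def line_summary(legs):
    """Summarize lines of flying by route name.

    legs: list of dicts with keys date, tail, origin, dest, arrival_delay
    (hypothetical field names), sorted so each line's legs are adjacent.
    """
    stats = defaultdict(lambda: {"observations": 0, "late_lines": 0,
                                 "late_legs": 0, "recoveries": 0})
    for _, group in groupby(legs, key=lambda leg: (leg["date"], leg["tail"])):
        line = list(group)
        # Route name: first origin followed by every destination
        name = "-".join([line[0]["origin"]] + [leg["dest"] for leg in line])
        late = [leg["arrival_delay"] > 0 for leg in line]
        # Recovery: an on-time leg directly after a late leg
        recovered = sum(1 for prev, cur in zip(late, late[1:]) if prev and not cur)
        s = stats[name]
        s["observations"] += 1
        s["late_lines"] += int(any(late))
        s["late_legs"] += sum(late)
        s["recoveries"] += recovered
    return dict(stats)

def leg(date, origin, dest, delay, tail="N201WN"):
    # Helper that builds one hypothetical leg record
    return {"date": date, "tail": tail, "origin": origin,
            "dest": dest, "arrival_delay": delay}

# The same three-leg line flown on two days, with made-up delays
legs = [
    leg("2009-01-05", "OAK", "FLL", 10),
    leg("2009-01-05", "FLL", "AUS", 0),
    leg("2009-01-05", "AUS", "OAK", 5),
    leg("2009-01-06", "OAK", "FLL", 0),
    leg("2009-01-06", "FLL", "AUS", 0),
    leg("2009-01-06", "AUS", "OAK", 0),
]
summary = line_summary(legs)
print(summary)
```

Grouping on the `(date, tail)` key relies on the input being sorted, which mirrors the Excel sorting step performed before the AWK programs were run.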

Analysis and Managerial Interpretation


In this section we will provide an analysis of the overall project objectives and the output of our model.




        Start   End
Leg     OAK     FLL
Leg     LAX     BUF

Historically, Southwest has analyzed on-time performance by monitoring turn-time statistics on a leg-by-leg basis. Examples of legs are shown above. For this assignment, Southwest asked us to look at its data from a broader perspective and analyze on-time performance by line. Examples of lines are displayed below.




        City 1  City 2  City 3  City 4  City 5  City 6
Line    OAK     FLL     AUS     OAK
Line    LAX     BUF     BWI     SAT     ELP
Line    SJC     RDU     BNA     ONT     SJC     LAS
Line    HOU     OAK     SEA     MCI     MCO


The data we analyzed was from January 2009. First we sorted the data into two segments, number of lines and number of legs. We then used AWK to determine the probability of lateness by line and leg.

Lateness by leg and line were valuable statistics for the following reasons:



  • It can be used by Southwest to compare its performance with that of competitors in the industry

  • It can also be used as a benchmark to define an acceptable level of ‘lateness’ for all of its airports

We also looked at a line’s ability to recover in order to check whether a late leg had a ripple effect on the rest of the line. Below is a chart of this data. The analysis revealed that a late leg does have a ripple effect: only 10% of lines with a late leg recovered from the previous leg’s lateness.

Next, we analyzed the frequency of lateness by airport. This data helped us identify which airports were performing well above expectations (a low number of late departures) and which were underperforming (a high number of late departures).

Here is a snapshot of the probability of late departure by airports.

Based on this data we created a box plot and histogram to identify the outlying airports which performed lower or higher than the average range.

The charts below show the box plot and histogram.

The histogram was valuable for the following reasons:



  • It showed that 17 airports were below the average lateness frequency. Although a low lateness frequency is beneficial from a customer standpoint, it would be interesting to inspect the costs at these airports; their expenses may be higher in order to maintain this low lateness frequency.

  • It showed that the average frequency of lateness was 42% across all airports that experienced lateness. This gives us a benchmark for comparing the airports with one another. In our opinion, 36-47% may be acceptable because it represents the middle range of the data, but this range may be high from a customer satisfaction standpoint. It would be interesting to review any survey data for these airports to see how customers feel.

  • It showed that 33 airports were above the average lateness frequency, in the 47-68% range. We recommend spending a great deal of time understanding the root causes of lateness at these airports. We captured a snapshot of the ten worst airports in terms of lateness frequency, and five of them are among Southwest’s top airports by number of departures. A chart of these airports is listed below. (Percentages highlighted in red represent airports in the top airport list.)
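The report does not state which rule separated outlying airports from the middle range. One common convention, the one a box plot's whiskers usually follow, flags values more than 1.5 times the interquartile range beyond the quartiles. The Python sketch below applies that rule to made-up probabilities; the rule, the function name `iqr_outliers`, and the airport labels A1 to A9 are all assumptions and do not reproduce the study's figures.

```python
from statistics import quantiles

def iqr_outliers(late_prob):
    """Split airports into low and high outliers using 1.5 x IQR fences.

    late_prob: dict mapping an airport label to its probability of a late
    departure. The 1.5 x IQR rule is an assumed box-plot convention.
    """
    q1, _, q3 = quantiles(late_prob.values(), n=4)
    spread = q3 - q1
    low_fence = q1 - 1.5 * spread
    high_fence = q3 + 1.5 * spread
    low = sorted(a for a, p in late_prob.items() if p < low_fence)
    high = sorted(a for a, p in late_prob.items() if p > high_fence)
    return low, high

# Hypothetical late-departure probabilities for nine placeholder airports
late_prob = {"A1": 0.10, "A2": 0.38, "A3": 0.40, "A4": 0.41, "A5": 0.42,
             "A6": 0.43, "A7": 0.44, "A8": 0.46, "A9": 0.68}
low, high = iqr_outliers(late_prob)
print("unusually low lateness:", low)
print("unusually high lateness:", high)
```

Airports in the low group are candidates for the cost review suggested above, and airports in the high group are candidates for root-cause analysis.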


Recommendations


Below are our recommendations:

  • Line analysis and airport analysis: we recommend further analysis of Southwest’s best and worst airports. Understanding this data may improve turn time and overall line performance.

  • We also feel that further research on lines that experienced a recovery leg may yield valuable insight.

