Part 1: Understanding Pricing
This is the second Infinity Insight article about blending the wonderful worlds of travel hacking and data visualization, and the first in a series dedicated to the analysis of "fare classes" on United Airlines flights. It's not an exaggeration to say that this post has been a year in the making!
What Are Fare Classes?
Everybody knows that there are different cabins on an airplane: economy, business, and sometimes even first class. But did you know that airlines sell seats within each cabin under different fare codes? Depending on the airline, there may be anywhere from 2 to 20 codes per cabin. These codes mean slightly different things at different airlines, but they are essentially used to differentiate your rights as a passenger and the reward miles you earn from taking the flight. One fare class may entitle you to a full refund if you cancel your ticket and earn you 1,000 elite miles, while another may be non-refundable and earn you only 500 elite miles on the exact same flight. There's a pretty good article that explains what the various fare classes mean, at least at United Airlines. And, of course, since the different classes confer different benefits, tickets bought under different classes won't cost you the same amount of money.
Where Did the Data in This Article Come From?
Airlines pretty much take a black box approach to assigning prices to tickets in each fare class: you might know that a "discount" fare should typically cost more than a "deep discount" fare, but you never really know exactly what to expect. This article will attempt to shine some light on the pricing patterns between the classes using data mined over the course of a year from one particular airline: United. This airline is fairly unusual in that it lets you specify your desired fare class right in its search engine. Anyone is free to open the Advanced Search interface on united.com and specify exactly which fare class they want to buy a ticket in. So that's what I did, on a huge scale: leveraging a series of advanced macros to search for every available fare class for 21 flights, every day from when each flight was first offered for sale until the day it actually departed. This GIF shows the little bot hard at work.
Of course, once I had the HTML files downloaded, it was child's play for QlikView to scrape all the relevant pricing goodness from them. The visualizations in this article were created with Flourish, a tool on which I admit I've developed a huge crush lately.
A few words about the data before we dive into the analysis:
1. All flights in the data set are US domestic, non-stop flights.
2. I picked one morning, one afternoon, and one evening flight for each day of the week. This originally resulted in 21 flights, but two of these were cancelled during the year that the script was running. Since the data I had from these two flights was incomplete, I've removed them from all analyses, leaving 19 flights.
3. All searches were performed assuming a single passenger.
4. I only looked at revenue-based fare classes, not award travel or upgrade fares (these are a whole separate can of worms).
5. For purposes of the analysis, I am intentionally ignoring the differences in perks between the fare classes. In other words, when I say below that Fare 1 is cheaper than Fare 2, that doesn't necessarily mean it's the smartest purchase for every traveler's particular needs.
What's the Cheapest Fare?
OK, let's jump right into it! One big insight that came out of this project is that there's now a conclusive answer to how fare classes should be ordered, from cheapest to most expensive:
These flows are somewhat generalized, but only somewhat. There are occasional exceptions to the flow, but not many. For instance, fare V is cheaper than fare W about 1% of the time. For purposes of visualization clarity, and to make the story told by this data accessible to those who may not be used to advanced data visualizations, I've elected to hide these minor blips entirely.
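For readers who want to check an exception rate like this on their own data, here is a minimal sketch in Python. It assumes each observation is a dictionary mapping fare class letters to the price quoted in one search; that layout, the function name `share_cheaper`, and the sample prices are all hypothetical and are not taken from the data set behind this article.

```python
from typing import Dict, List, Optional


def share_cheaper(
    observations: List[Dict[str, float]], fare_a: str, fare_b: str
) -> Optional[float]:
    """Return the share of observations in which fare_a was strictly cheaper
    than fare_b, counting only observations where both fares were priced."""
    comparable = [o for o in observations if fare_a in o and fare_b in o]
    if not comparable:
        return None  # nothing to compare
    hits = sum(1 for o in comparable if o[fare_a] < o[fare_b])
    return hits / len(comparable)


# Hypothetical prices for illustration only (not real United fares).
sample = [
    {"W": 200.0, "V": 230.0},
    {"W": 210.0, "V": 205.0},  # the unusual case: V undercuts W
    {"W": 190.0, "V": 240.0},
    {"W": 220.0},              # V not offered, so this row is skipped
]

print(share_cheaper(sample, "V", "W"))
```

Skipping observations where one of the two fares was not offered keeps sold-out classes from distorting the share.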
Visualizing the "Flow"
Let's say you are not a travel hacker, and you buy a ticket on united.com the way the vast majority of people do: by simply searching for the default fares that United shows you for the flights available on your dates to your destination. By default, United will automatically pick the cheapest fare class per cabin (economy/business) that has enough availability for your requested number of passengers at the time of your search. In economy, for example, this is the price that the search engine will display under the "standard" bucket in the results.
Here's how the flow of those default fares (basic economy, lowest non-basic economy, and business class) moves through the various fare classes. As you can see, the flow virtually always follows the above generalized ordering of fares from cheapest to most expensive, which is what we would expect.
To display the vast number of data points cleanly, I've broken the 336 days of data out into 5 buckets; use the buttons above the chart to select the chart you want to see. Use the drop-down selector within each chart to switch between flights. Please note that, due to the nature of this visualization, it is best viewed on a larger screen. You can also use the "A Flourish data visualization" link underneath the chart to open a larger view.
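The default-fare rule described above (cheapest fare class in a cabin that still has enough seats for the party) can be sketched in a few lines of Python. This is only an illustration of the rule as stated here, not United's actual logic; the `FareOffer` structure, the `default_fare` function, and the sample prices and seat counts are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FareOffer:
    fare_class: str       # single-letter fare code, e.g. "W"
    cabin: str            # "economy" or "business"
    price: float          # quoted price for one passenger
    seats_available: int  # seats still on sale in this fare class


def default_fare(
    offers: List[FareOffer], cabin: str, passengers: int = 1
) -> Optional[FareOffer]:
    """Pick the cheapest offer in the cabin with enough seats for the party."""
    eligible = [
        o for o in offers
        if o.cabin == cabin and o.seats_available >= passengers
    ]
    # min() returns the default (None) when nothing is eligible.
    return min(eligible, key=lambda o: o.price, default=None)


# Hypothetical offers for illustration only.
offers = [
    FareOffer("W", "economy", 150.0, 0),  # cheapest, but sold out
    FareOffer("V", "economy", 180.0, 4),
    FareOffer("Y", "economy", 600.0, 9),
    FareOffer("J", "business", 1200.0, 2),
]

print(default_fare(offers, "economy"))      # the V offer
print(default_fare(offers, "business", 3))  # None: only 2 seats left in J
```

Because availability is checked before price, a sold-out cheap class pushes the default up to the next class in the ordering, which is the movement the flow charts trace over time.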
Days before departure (chart selector): 261 - 336, 196 - 260, 131 - 195, 66 - 130, 0 - 65
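If you want to group your own observations the same way, the five ranges listed above translate directly into a small lookup. The sketch below is a hypothetical helper (the name `bucket_label` is mine) that maps a days-before-departure value to the label of its bucket.

```python
from typing import List, Tuple

# The five "days before departure" ranges, inclusive on both ends.
BUCKETS: List[Tuple[int, int]] = [
    (261, 336),
    (196, 260),
    (131, 195),
    (66, 130),
    (0, 65),
]


def bucket_label(days_before_departure: int) -> str:
    """Return the label of the bucket containing the given day count."""
    for low, high in BUCKETS:
        if low <= days_before_departure <= high:
            return f"{low} - {high}"
    raise ValueError(
        f"{days_before_departure} is outside the 0 to 336 day window"
    )


print(bucket_label(0))    # 0 - 65
print(bucket_label(200))  # 196 - 260
print(bucket_label(336))  # 261 - 336
```

Note that the buckets are not all the same width: the earliest one spans 76 days, the one closest to departure spans 66, and the three in between span 65 each.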
What Kind of Discount Can You Expect?
We can see above that the most expensive economy fare class is Y and the most expensive business fare class is J. These are known as "full fare" classes. Every other class offers a discount compared with these full prices. Here's a visualization of the range of likely discounts you can expect from each fare class:
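One way to express such a discount is as a fraction of the full fare in the same cabin: (full fare price - fare price) / full fare price. The Python sketch below assumes that definition, which may differ from the exact calculation behind the chart, and uses hypothetical function names and prices throughout.

```python
from typing import Dict, List, Tuple

# Full-fare class per cabin, as named in the article.
FULL_FARE = {"economy": "Y", "business": "J"}


def discount_vs_full_fare(
    prices: Dict[str, float], fare_class: str, cabin: str
) -> float:
    """Discount of fare_class relative to the cabin's full fare (0.25 = 25% off)."""
    full_price = prices[FULL_FARE[cabin]]
    return (full_price - prices[fare_class]) / full_price


def discount_range(
    observations: List[Dict[str, float]], fare_class: str, cabin: str
) -> Tuple[float, float]:
    """Smallest and largest discount seen across a list of price observations."""
    discounts = [
        discount_vs_full_fare(o, fare_class, cabin) for o in observations
    ]
    return min(discounts), max(discounts)


# Hypothetical prices for illustration only.
observations = [
    {"Y": 600.0, "W": 180.0},  # W is 70% below full fare
    {"Y": 500.0, "W": 250.0},  # W is 50% below full fare
]

low, high = discount_range(observations, "W", "economy")
print(f"W discount range: {low:.0%} to {high:.0%}")
```

Each discount is measured against the full fare quoted in the same search, so a change in the Y or J price over time does not masquerade as a change in the discount.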
I hope you enjoyed this first post. The next one in the series, based on this same data set, will be all about when in that huge 336-day window you should buy tickets to stand the best chance of getting a good deal. If that's something that interests you, give the blog a follow. And I always welcome feedback in the comments below. Until next time!