How to solve a common visualization challenge with the help of an unlikely ally: the humble radar chart.
Let's say you have approximately a year's worth of data. You are not interested in displaying trends within this data, but are instead interested in viewing each day as an discrete data point. This scenario happens all the time but the challenge is fairly obvious: what graphical representation can easily support such a large number of data points? The traditional solution for graphing discrete (i.e. non-trend) data points is a bar chart. But 365 bars? Unless you own an IMAX movie theater, that is an impossible number of bars to view on a single screen without scrolling. And, of course, the problem with scrolling is that you cannot view all of your data at the same time, which causes you to lose context. A symbol chart (a type of line chart) is going similarly restricted, even if you make it "continuous" (more on that below).
A Cool Example
One of my hobbies is "travel hacking." Despite the ominous word "hacking," this basically just means flying for free using miles and points. Today I have decided to combine business and pleasure by using an example from the world of travel hacking to illustrate the challenge above, as well as the solution. Singapore Airlines is one of the world's premier airlines. They are known for their outstanding service across the board, but what really makes them stand out is their "Suites" first class product, which brings new meaning to the word "luxury":
It goes without saying that this product is far out of the price range of what most people can afford. However, Singapore often makes these first class seats available for Singapore's "KrisFlyer" miles (which you can transfer to Singapore from various sources), instead of cash. On a particular flight, a seat may be available at the "Standard" award level or at the "Saver" award level. In the example that we will use, which is the New York to Frankfurt route, a Standard award costs 130,000 miles, and a Saver award costs 76,000 miles. In addition, some award seats are "Waitlist" availability, which means that you might get the seat if the airline decides to open the waitlist up for that particular flight, or you might not. Award availability is released 355 days in advance. If I am a fairly flexible traveler, I might be open to traveling whenever I can find good availability. But what is the best way to visualize 355 days of availability to help me easily find a travel date? Well, here is what a traditional bar chart solution would look like using Standard award availability:
In this graph, green represents open availability, yellow represents Waitlist availability, and red represents no award availability. The problem with this bar is obvious though: screen real estate. You can only see a small percentage of the total 355 days, and lose the valuable context that seeing a full year of data gives you. A symbol chart would have the same problem:
QlikView has the ability to make a numeric axis (such as date values) "continuous," but doing so causes the discrete data points to blend into a single line and become hard to distinguish and pinpoint:
Radar Charts to the Rescue
Radar charts are, as these things go, a fairly obscure form of visualization. They are certainly known to business intelligence professionals, but odds are that most other folks have never seen one. Even BI pros often overlook this type of visualization because, frankly, it is not that useful most of the time. Where the humble radar shines though, is in visualizing multiples of approximately a year. In business, this can be appropriate to show seasonality of demand, for instance. In the current example, it is a fantastically elegant way of showing 355 data points cleanly and without creating a sense of chaos and clutter. Here is how the same Standard award availability looks plotted in a radar chart:
Notice that the overall effect does not overwhelm the eyes but neatly presents all 355 days in such a way that any particular day can be easily found at a glance, while the broader context is available without the need for any scrolling whatsoever. To paint a full picture, here is the same graph for Saver level award availability:
Kind of a bleak picture, but there you have it: the only dates in the next year that have non-Waitlist Saver award availability (as of the time this article was written) are New Years Eve 2018 and January 10th.
I've personally been guilty of overlooking radar charts in the past, but they might be my new favorite visualization for this type of data. At the very least, I am going to include them in the rotation. I would urge you to remember that this is a tool that you have in your toolboxes, as well! Of course, another visualization we could have considered would have been a calendar itself, with the squares colored green, yellow, or red. However, we would have had to use a bit more screen real estate, and would not have been able to see patterns in availability across multiple months as easily. With a radar chart, the same day of each month is always on the same concentric circle; this would help us see, for instance, if there's usually more availability during a particular part of each month. Plus a calendar like that is not an object that exists in QlikView without the use of extensions (which I view as cheating).
A Note on Measure Values
In this specific example, we did not care about quantifying the value at each data point. Because we knew that all saver awards cost 76,000 miles and all standard awards cost 130,000 miles for this particular route, the specific mileage required on a particular day was of little interest to us. However, the technique we just discussed can easily be adapted, in one of two ways, to situations where we do can about the measure value. In the first approach, the data does not need to be trinary (only three possible values) but should, ideally, be limited to a reasonable number of values. If you are dealing with a measure that has a wide range of possible values (e.g. sales per day), you would bucket the values into a more manageable number. For example, round your sales into buckets of $5,000 (or whatever number makes sense). You would then assign a unique color to each bucket. To those that balk at the idea of losing precision, I would counter that data visualization is about seeing patterns effectively; if you truly need precision, a graphical representation may not be appropriate for you at all. You also have the option to bring precision back via mouse-over popup text, as is done in the application that you can download below. The second approach, which is also perfectly valid, is to create a heat map of colors and assign each day's measure value to a color in the heat map range.
A Note on Color Usage: Know Your Audience
You may have noticed that, in the example above, I committed a cardinal data visualization sin: I used red and green of approximately equal saturations. For the uninitiated, the reason this is a problem is because approximately 10% of the world is red-green colorblind. Obviously, if any significant portion of your intended audience shares this condition, you should pick colors that are more easily distinguishable by these folks. You could also make your symbols different for green and red points, such as by using a square for green and a circle for red. This is not easily achieved in QlikView though, so I cut some corners in this example.
A Note on the Data Used
OK, I fully expect that at this point a portion of my readers are doing back flips because the data used in this example is supposed to be impossible to get. I probably owe you guys an explanation of how I got it. Singapore Airlines is famous for only posting its Suites availability on its own website, and you can only get to the results by performing a search for a specific route and day combination. To get the results for another day, you have to rinse and repeat. Availability is released 355 days in advance, but to actually query 355 dates by hand is arduously time consuming. To get around this, I used OLE automation via a nifty tool called iMacros. I created a loop to search for Suites availability (a single seat) from New York to Frankfurt for each of the 355 days, and then save the resulting page as an offline HTML file. The script took about 2.5 unattended hours to run. Then I simply parsed the resulting 355 HTML files using QlikView's native ETL capabilities and Bob's your uncle. There are currently 13 routes on which Singapore offers Suites service so, in theory, I could have used this same technique to generate a database of all Suites availability in about 32.5 hours. I don't really have any plans to refresh the application used in this example on an ongoing basis, at least until I decide to try Singapore's remarkable product out for myself!
As always, I'm more than happy to make the QlikView app used to create the above charts available for download below. I would love to hear from you in the comments section—in this particular post, I'm also quite curious if the majority of my readers are BI people or travel enthusiasts!
Update: by popular demand, I've decided to make the entire end-to-end process (including the OLE automation piece) available for public download. Happy Qliking!
Updated again on December 16, 2017 to fix a small bug related to CSV generation. Check back once in a while, as additional updates/improvements may be posted. For instance, there will be a code fix around January 20, 2018, when "Standard" awards are renamed to "Advantage."