Driving to and from work every day is a sad reality for most of the nation’s workforce. According to USA Today, the average commute in the US is 25.5 minutes each way, or a total of 51 minutes each day. To look at the Sacramento region’s commutes, the Sacramento Business Journal used Google Maps to analyze average commute times for 20 communities in the region.
In their article, this information was presented as a series of tables in a slideshow. Thinking this could be communicated much more effectively as a data visualization, I did some exploratory data analysis and visualization to come up with a visual representation that would quickly and easily show the Sacramento communities with the best and worst commutes.
The first step of the process was to enter the data into a Google spreadsheet. Once there, it was easy to try out some quick and dirty charts from within that application. Early on, a grouped bar chart seemed like the natural choice, but I wanted to explore some other options as well.
Using a Sankey Diagram to Show Commute Times
I really like the aesthetics of Sankey/Alluvial Flow Diagrams. I had my doubts that it would be an effective way to show the commutes, but the online tool RAW makes it so easy to create them that it was a no-brainer to explore that option just for kicks.
The problem with this visualization is that the commute time between any two communities is represented by the thickness of the line connecting them, which, as you can see in the example to the right, is very difficult to make sense of.
What this diagram does well, however, is show the relative total of how good or bad it is to commute to or from each city in the morning or afternoon. This is represented by the height of the bars: on the left for the origins of the commute, and on the right for the destinations. So it’s easy to see, for example, that Davis, with the tallest bar on the right, is the worst place in the area to commute to in the afternoon, while Citrus Heights, with the shortest bar on the right, is the best place to drive to in the morning. Likewise, on the left side, it’s easy to see at a glance that Placerville is the worst place to commute from, while West Sacramento is the best.
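To make the bar heights concrete: in this kind of diagram, each bar is as tall as the sum of the flows attached to it. Here is a minimal sketch in plain JavaScript, assuming the commute times sit in a from/to matrix. The city labels and minutes are made-up placeholders rather than the Business Journal's figures, and the `extreme` helper is a hypothetical name of my own.

```javascript
// Placeholder commute matrix in minutes (NOT the article's data):
// minutes[i][j] is the drive from cities[i] to cities[j].
const cities = ["A", "B", "C"];
const minutes = [
  [0, 20, 35],
  [25, 0, 15],
  [40, 10, 0],
];

// Row sums: total time leaving each city (bar heights on the left).
const originTotals = minutes.map((row) => row.reduce((sum, m) => sum + m, 0));

// Column sums: total time arriving at each city (bar heights on the right).
const destinationTotals = cities.map((_, j) =>
  minutes.reduce((sum, row) => sum + row[j], 0)
);

// Hypothetical helper: name of the city whose total is the largest
// (pick = Math.max) or the smallest (pick = Math.min).
function extreme(totals, pick) {
  return cities[totals.indexOf(pick(...totals))];
}

console.log("Worst to commute from:", extreme(originTotals, Math.max));
console.log("Best to commute to:", extreme(destinationTotals, Math.min));
```

Comparing these totals is exactly what the eye does when it compares bar heights, which is why the best and worst cities jump out even though the individual links are hard to read.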
Using a Heat Map to Show Commute Times
The second alternative representation I wanted to try was using a heat map. Another great online tool for quick data viz exploration is Plot.ly. One of the many chart types they offer is a heat map and it’s almost as simple as just pasting in the data from your spreadsheet to generate one. The heat map uses a simple two-dimensional grid with the “commute from” cities on the y-axis and the “commute to” cities on the x-axis and the commute times represented by color. Here red represents the worst commute times and blue the best. Again, it’s very easy to quickly see which cities are best and worst for commutes in the region.
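For a feel of what happens under the hood, here is a rough sketch of how a commute time could be turned into a color on a blue-to-red scale. It is a simple linear blend I wrote for illustration, not Plot.ly's actual colorscale, and the `heatColor` function, the labels, and the minutes are all made-up placeholders.

```javascript
// Map a value to a color between blue (best, at min) and red (worst, at max).
// Simple linear blend for illustration; not Plot.ly's real colorscale.
function heatColor(value, min, max) {
  const t = max === min ? 0 : (value - min) / (max - min);
  const clamped = Math.min(1, Math.max(0, t));
  const red = Math.round(255 * clamped);
  const blue = Math.round(255 * (1 - clamped));
  return `rgb(${red}, 0, ${blue})`;
}

// Placeholder grid (NOT the article's data): rows are "commute from"
// cities (y-axis), columns are "commute to" cities (x-axis).
const fromCities = ["A", "B"];
const toCities = ["C", "D"];
const grid = [
  [10, 25],
  [40, 15],
];

const allValues = grid.flat();
const lowest = Math.min(...allValues);
const highest = Math.max(...allValues);

// One cell per from/to pair, each with its color.
const cells = grid.flatMap((row, i) =>
  row.map((value, j) => ({
    from: fromCities[i],
    to: toCities[j],
    minutes: value,
    color: heatColor(value, lowest, highest),
  }))
);

console.log(cells);
```

The shortest commute in the grid comes out pure blue and the longest pure red, with everything else shading through purple in between.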
Using a Grouped Bar Chart to Show Commute Times
Though arguably the most aesthetically boring chart type out there, the bar chart is often the most effective for making comparisons, as our brains are very good at judging values accurately when they are plotted along a common scale, as they are in a bar chart. So, while it was interesting to explore some other options, the original idea of a grouped bar chart seemed the best option for representing the data.
I’m also climbing the arduous learning curve of D3.js, and a bar chart with basic interactivity is one of the easier charts to code in D3. So, I coded up a basic interactive grouped bar chart in D3 to represent the commute data.
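The heart of a grouped bar chart is the layout arithmetic: a slot per group, bars side by side within each slot, and every height measured against one common scale. The sketch below does that by hand in plain JavaScript. It is not the code behind my chart, where D3's scales would normally handle this; the `layoutGroupedBars` function, the morning/afternoon series, the pixel sizes, and the minutes are all assumptions made up for illustration.

```javascript
// Placeholder data (NOT the article's figures): one group per city,
// one bar per series, values in minutes.
const groups = [
  { city: "A", morning: 20, afternoon: 30 },
  { city: "B", morning: 40, afternoon: 10 },
];
const series = ["morning", "afternoon"];

// Assumed chart size in pixels; groupGap is the blank space per group slot.
const chart = { width: 400, height: 200, groupGap: 40 };

// Hypothetical helper: compute a rectangle for every bar.
function layoutGroupedBars(groups, series, chart) {
  // One common scale: the largest value fills the full chart height.
  const maxValue = Math.max(...groups.flatMap((g) => series.map((s) => g[s])));
  const slotWidth = chart.width / groups.length;
  const innerWidth = slotWidth - chart.groupGap;
  const barWidth = innerWidth / series.length;

  return groups.flatMap((g, gi) =>
    series.map((s, si) => {
      const height = (g[s] / maxValue) * chart.height;
      return {
        city: g.city,
        series: s,
        // Center the group in its slot, then offset each bar within it.
        x: gi * slotWidth + chart.groupGap / 2 + si * barWidth,
        // SVG y grows downward, so bars hang from (chart height - bar height).
        y: chart.height - height,
        width: barWidth,
        height,
      };
    })
  );
}

const bars = layoutGroupedBars(groups, series, chart);
console.log(bars);
```

Each object in `bars` could be drawn as an SVG `rect`. Because every height comes from the same scale, any two bars can be compared directly, whether they sit in the same group or not.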
For reference, here’s a map of the region with the represented cities.