Show the Data: Understanding Data-Ink Ratio
"Above all else show the data."
- Edward R. Tufte
One of the best ways to start improving your data viz skills is to learn the theory of data graphics. If you haven't heard the name Edward R. Tufte and you're seeking a career in data and visual analytics, you'll definitely want to pick up his books. His first and most widely recognized book, The Visual Display of Quantitative Information, pioneered data viz theory and introduced a fundamental concept:
The Data-Ink Ratio
Here's how Tufte describes it:
A large share of ink on a graphic should present data-information, the ink changing as the data change. Data-ink is the non-erasable core of a graphic, the non-redundant ink arranged in response to variation in the numbers represented. Then,
Data-ink Ratio = data-ink / total ink used to print the graphic
= proportion of a graphic's ink devoted to the non-redundant display of data-information
= 1.0 - proportion of a graphic that can be erased without loss of data-information
“Data-Ink.” The Visual Display of Quantitative Information, by Edward R. Tufte, Graphics Press, 2018, p. 93.
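As a rough worked example of the formula above, the sketch below assigns each chart element a hypothetical amount of ink and computes the ratio before and after erasing some elements. The ink amounts, the element names, and the `data_ink_ratio` helper are all assumptions made up for illustration. They are not measured from any real graphic, and the sketch simplifies by treating all bar ink as non-redundant data-ink.

```python
# Hypothetical ink amounts per chart element, in arbitrary units.
# These numbers are illustrative assumptions, not measurements.
chart_ink = {
    "bars": 60.0,        # data-ink: changes when the data change
    "grid lines": 20.0,  # non-data-ink
    "frame": 10.0,       # non-data-ink
    "tick marks": 10.0,  # non-data-ink
}
DATA_ELEMENTS = {"bars"}


def data_ink_ratio(ink_by_element, data_elements):
    """Return data-ink divided by the total ink used for the graphic."""
    total = sum(ink_by_element.values())
    if total == 0:
        raise ValueError("the graphic uses no ink")
    data = sum(
        ink for name, ink in ink_by_element.items() if name in data_elements
    )
    return data / total


before = data_ink_ratio(chart_ink, DATA_ELEMENTS)

# Erase the grid lines and the frame, but keep the tick marks for context.
erased = {"grid lines", "frame"}
trimmed = {name: ink for name, ink in chart_ink.items() if name not in erased}
after = data_ink_ratio(trimmed, DATA_ELEMENTS)

print(f"before: {before:.2f}, after: {after:.2f}")
# prints "before: 0.60, after: 0.86"
```

The bars themselves are untouched. The ratio rises only because the denominator shrinks, which matches the last line of Tufte's definition: one minus the share of the graphic that can be erased without losing information.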
So what does this mean, and how can we use it in a viz? Well, let's look back at the photo from above.
There's a lot going on here: plenty of visual noise, and plenty of what Tufte calls "non-data-ink". Non-data-ink is anything in a viz that doesn't represent the data itself, such as tick marks, graph frames, and grid lines. To improve the design of this viz and reduce time-to-insight, we will make use of the principles that follow.
Maximize the data-ink ratio, within reason
Maximizing the data-ink ratio means that every single line or label (ink) on a graph should have a reason, and that reason should be to present new information. This provides the foundation for the remaining principles, meaning that everything that follows is meant to maximize the data-ink ratio while ensuring that the audience can effectively interpret the viz.
Erase non-data-ink, within reason
There are two erasing principles for increasing the data-ink ratio, the first being to erase non-data-ink. To put this principle into action, look for ways to label data points directly and remove grid lines and axes, while making sure that your viz provides enough context for the audience to understand what they're looking at. You also want to remove anything in the visual that repeats information, which leads us to the second erasing principle.
Erase redundant data-ink, within reason
Duplicate information is referred to as redundant data-ink. An example would be a bar labeled with its value on a chart that also shows an axis from which the audience could read the same information. Unless the redundancy serves a specific purpose, it can be removed to reduce clutter and put only what is truly important in front of your audience.
Revise and Edit
This principle is part of the iterative process of building visuals in general. Usually when creating a viz, you focus on the insights you're able to see and then try to highlight those insights with good design. When applying these principles to the viz from earlier, I come up with this:
Now there's still some redundant data-ink here (the height of the left and right sides of the bars and the color filling the bars are technically separate elements of the bar that tell us the same piece of information), but you want to find a balance between these principles and what you know your audience will understand. There are also many different ways to go about maximizing data-ink. You may choose to show each of the value labels with the $ sign in front of them to indicate that they're currency, while I chose to include this context in the title of the graph so as not to repeat the symbol. I removed all axes and labeled the data points directly. I also kept a de-emphasized dotted line to show that the bars start at 0. If you compare the first photo and the revision, the revision looks much cleaner, and you can see which sub-categories have had the most and least sales much more quickly, even with the axes removed. This is because we focused on the critical elements of the graph that present information to the audience.
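The revision choices described above (direct labels, currency stated once in the title, a de-emphasized zero baseline, no axes) can be sketched as a plain-text chart. This is a minimal illustration only. The sub-category names and sales figures are made-up placeholders, and `render_bars` is a hypothetical helper, not the tool used to build the chart in this post.

```python
# Made-up sales figures for illustration; not the data from the chart above.
sales = {"Binders": 203, "Phones": 330, "Labels": 12, "Chairs": 328}


def render_bars(values, title, width=30):
    """Render a horizontal text bar chart with directly labeled bars."""
    if not values:
        return title
    largest = max(values.values())
    if largest <= 0:
        raise ValueError("expected at least one positive value")
    name_width = max(len(name) for name in values)
    lines = [title]
    # Sort descending so the most and least sales are easy to spot.
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    for name, value in ordered:
        bar = "#" * round(width * value / largest)
        # ":" stands in for the de-emphasized dotted baseline at zero.
        # The value sits right next to its bar, so no axis is needed.
        lines.append(f"{name:>{name_width}} :{bar} {value:,}")
    return "\n".join(lines)


# The currency is stated once in the title instead of on every label.
chart = render_bars(sales, "Sales by sub-category (USD)")
print(chart)
```

Putting the unit in the title rather than on each label is the same trade-off as the $ sign decision above: the context appears once, and every remaining character on a bar's line is either the bar or its value.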
Final Thoughts on the Data-ink Ratio
Understanding the five principles of the data-ink ratio will help you identify visual clutter and make your viz stand out from the pack. The principles listed once more are:
Above all else show the data
Maximize the data-ink ratio
Erase non-data-ink
Erase redundant data-ink
Revise and edit
While this concept elevated my viz design, it shouldn't be a make-or-break rule. Understanding the concept is step one, but adding your own creative touch while keeping it in mind is what makes your designs unique. More importantly, your audience is going to be the driving force behind the creative elements you're able to implement while still presenting visuals that they'll be willing to get on board with.