What’s In a Name?

May 19, 2012

Names are powerful. Simply changing the name of something can have drastic effects. A name is something concrete that we can latch onto in our thoughts. Without a name, we fumble around. Without a name, a concept blurs. We can sharpen that concept by adding more and more descriptors.  Yet, this becomes cumbersome as the descriptors mount. A name embeds all of these descriptors and brings clarity to our thoughts. Consider this exchange:

Your friend: I just saw the weirdest car!

You: Weird how?

Your friend: It was real low down.

You: So?

Your friend: It was a super low, classic car. But, get this, it could hop up and down too!

You: Ah, man. That’s just a lowrider!

You can’t picture a weird car as you’ve seen plenty of weird cars, so the concept is initially very fuzzy. As you get more and more details, the image starts to take form. Once you name it, you have a solid image of it. But you have to be careful. The word “lowrider” not only describes that car, but the driver. It’s a whole culture, and maybe that car and driver are not part of that culture. Using “lowrider” means that you are on one level or another buying into all the assumptions implicit in that term.

This is exploited all the time. To name one easy example, take the “pro-life” moniker. Instead of calling themselves anti-abortionists, the pro-life group has linked themselves to all of human life by embedding the word “life” in their name. They force you to think of their cause as defending all of the sanctity of life, which is a very just, noble cause.

Advertising loves to do this. By linking a product name with good traits, whenever you think of that product you associate it with those traits. Will beer really get you into group sex? Of course it won’t! But if you are exposed to the commercial enough, you’ll link the two on some level. Next time you think of beer, you may think about how it will help you get laid.

Fittingly, if you are aware of this effect you can counteract it and use it for your own ends. In reading “Universal Principles of Design,” I realized just how much advertising uses these design principles. Nearly all of the principles have citations in psychology literature, so by learning these principles you can influence the thoughts of others and be on guard for those trying to do the same to you.

Recognition over Recall

The whole concept of brand recognition is founded on this.

The picture above helps explain why companies seek to expose you countless times to their logo and name. I always thought it was silly that companies would print their name on a pen. A pen? Really, would just putting your name on a pen make me buy a product from you? This principle suggests that it would. Especially when you learn about the exposure effect, which is that subjects exposed to a neutral stimulus will grow to like the stimulus. So by simply exposing us to a corporate logo over and over again, we grow to like that logo. And for good measure, throw in some classical conditioning, where a stimulus is associated with an unconscious physical or emotional response. The aforementioned commercial conditions us to like the beer by associating it with fantastic sex. These three principles make it so that when we walk down the beer aisle, we recognize the brand, we unconsciously like it, and then we buy it. These principles go a long way toward explaining why our lives are so saturated with ads.

I would encourage you to check out the book. It’s really gorgeous and well done. There are a hundred concepts, and I only knew maybe fifteen of them by name. I knew about half of them on some vague level, but now that I have a name for them I can exploit them and be on guard.

I’d like to finish this with one last example on the power of names and their baggage. In my lab group, we come up with models that we name. We usually name them on the day they are finished. So when we finished a model on the fifth of November, it was christened Guy Fawkes. Clearly, we should have thought a bit harder as our British sponsors are less than amused!

Aside

Visibility

An excellent summary of Three Mile Island.

Sorry, but as a chemical engineer, I wanted to share the above spread. In one of my classes, someone presented on Three Mile Island. He did a terrible job, and I had been confused ever since. This nicely summarizes the events of that day and explains how visibility could have prevented the near disaster. The book also includes factor of safety as a principle and cites the Challenger disaster as an example of what not to do. Clearly, a little bit of good design can stave off serious problems.

Having been thoroughly impressed by Edward Tufte’s first book, The Visual Display of Quantitative Information, I decided to pick up another, Visual Explanations. Tufte envisions his books fitting together like so: The Visual Display of Quantitative Information is about pictures of numbers, Envisioning Information is about pictures of nouns, and Visual Explanations is about pictures of verbs. I was not drawn in by Visual Explanations (VE) as I was by The Visual Display of Quantitative Information (VQ). VE felt disjointed, and the lessons learned are less applicable for graduate students. However, there are still two very important lessons I gleaned: be subtle and avoid legends.

In a nutshell, VE states that images should be honest and scientific. Images should lend themselves to easy comparison through similar composition and repetition. Lastly, images alone can make an argument or tell a story through juxtaposition and symbolism.

I want to highlight a couple of these lessons, starting with one that I initially thought was wrong.

Subtlety is better than garishness.

Tufte states, “Make all visual distinctions as subtle as possible, but still clear and effective.” I would think you would want to eschew subtlety to make sure the point comes across. However, this can be far too overwhelming, as shown above.

I’ve seen this lack of subtlety in lab presentations where the presenter shows a chart with ten or more curves. Every curve is a different color and has different markers for the points. Either a different color or a different set of markers would suffice, but using both is excessive. Adding insult to injury, the default Excel colors are garish and unsubtle. To fix my own graphics, I’ve taken to using shades of black and gray to differentiate between each curve.

That same chart had a legend. Imagine trying to look at the legend, then the curves, then the legend, then the curves, then the . . . I simply gave up. You might think I give up too easily or that I am nitpicking, so I’ll let you experience the difference between a legend and direct labeling for yourself.

Which image is quickest to understand? Which do you finish reading?

In light of this striking difference, I always label my curves directly. To do so in Excel, do not use a text box. Use a data label. You can overwrite the label and move it anywhere in the graph, so it is just like a text box. However, unlike a text box the label will stay in the same relative place on the graph and rescale along with the graph.

Gerrymandering.

The most important message for me is the same message Tufte stressed in his previous book. Graphics should be honest and scientific. To make a graphic honest, there must be a sense of scale and orientation for the viewer. If time-averaging or area-averaging has been applied, it should be done carefully as it can easily obscure important trends. The graphic to the right illustrates how area-averaging may have doomed the residents of Broad Street. To make a graphic scientific entails quantification, comparison, and investigation of cause-and-effect. Quantification means applying a scale and going further to assign numbers to seemingly qualitative data. Comparison means plotting similar graphs on the same scale so that when they are put side-by-side, a line falling an inch in one means the same thing in the other. Investigation of cause-and-effect is the most difficult, but it means always plotting the suspected cause on the X-axis and the effect on the Y-axis, rather than plotting both against time.

Tufte does a fantastic job illustrating his message through the Challenger explosion. This is definitely a favorite case-study of his as it shows up in at least three of his books. Here he shows the actual set of slides the engineers sent to NASA the night before the launch, trying to persuade them not to launch. Tufte dissects the slides to show why they failed to persuade. The slides omitted many critical data points, failed to quantify the extent of damage to the O-rings, and separated important comparisons with many slides in between. At the end of the dissection, he shows the graphic below.

The slides only discussed the two labeled data points. This shows the full trend and makes a case for cold being dangerous.

By creating a damage-index, Tufte quantifies the data that was previously only qualitative. By plotting this data against temperature, Tufte makes a case for cold temperatures causing damage. I think that if this graphic had been shown to NASA, the launch would have been postponed.

While not all design decisions involve life or death, design can make or break the viewer’s comprehension of your argument. Taking the time to make your graphs easy to read will lead to better questions from the audience. Better questions will push your research along more quickly. Or maybe good design will help a grant-reviewer understand your argument and see why it is significant. In any case, this is not a small matter. While content is definitely king, it must be presented in an intelligible manner for it to take over the kingdom of the viewer’s mind.

For a while now, I’ve considered myself a good writer. I keep a daily journal, this blog, and used to write short stories and scripts for fun. Many of my favorite classes in undergrad were paper based. In the field of engineering, my love for writing is a rarity. I adopted the technical writing style taught in my class, and thought it was good. Well, having read Edward Tufte’s “The Visual Display of Quantitative Information,” I now know that class was only a start.

Who’s Tufte?

Edward Tufte is professor emeritus of statistics, political science, and computer science at Yale University. He’s been very active and outspoken about his ideas for communicating statistics and other technical data since 1975. I mentioned Tufte before when I discussed his hatred for PowerPoint. There his major gripe was the low resolution, so you can rightly expect him to prize high resolution graphics.

The Book in a Nutshell

The book is split into two parts. Part 1, Graphical Practice, gives examples of great graphics that communicate an extraordinary amount of information concisely and not-so-great graphics that lie to the reader or render the data unintelligible. This sets up the second part of the book, Theory of Data Graphics. Here, Tufte goes to the fundamentals. He disparages chartjunk, praises data-ink maximization, and explains data density. (You might not know these terms now, but once you learn them you’ll find yourself using them. I know I have started to think in terms of them.)

Graphical Practice

 
The above image shows Napoleon’s advance into and then retreat from Russia. It shows six variables: army size, location (x and y), direction of troop movement, and temperature on certain dates. Despite showing six different variables, it is easily intelligible. Looking at this, you can vividly see the toll of the frigid weather and the treacherous river crossings. This is a great graphic because it allows you to draw comparisons, shows lots of data in a small area, and tells a story.

To talk about graphical integrity, Tufte introduces the concept of the lie-factor. The lie-factor is equal to the size of the effect shown in the graphic divided by the size of the effect in the data. So in the image above, it is the relative change in the length of the line divided by the relative change in the number the line should represent. In this example, the decision to use perspective corrupts the display and results in a lie-factor of 14.8. The lines simply grow far too fast.

While this makes for a more dramatic graphic, it misleads the reader. As a reader, you simply look at how the lines have grown and conclude that the fuel economy is not only improving but improving at an ever faster rate, since each line grows by more than the one before it. Tufte replotted the data more truthfully below.

It is much easier to see the trend now, and you realize that the fuel economy improvement is slowing down. Clearly, the first graphic failed in that it misrepresented the data, made drawing comparisons difficult, and used a large space to show a small amount of data.
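To see where a lie-factor like 14.8 comes from, here is a minimal Python sketch of the calculation. The line lengths and mileage figures are the ones I recall from Tufte's discussion of this graphic (a 0.6 inch line for 18 mpg in 1978 and a 5.3 inch line for 27.5 mpg in 1985), so treat them as assumptions rather than my own measurements. The function names are mine.

```python
def relative_change(first, last):
    """Fractional change from the first value to the last value."""
    return (last - first) / first


def lie_factor(graphic_first, graphic_last, data_first, data_last):
    """Size of the effect shown in the graphic divided by the size of the effect in the data."""
    shown = relative_change(graphic_first, graphic_last)
    actual = relative_change(data_first, data_last)
    return shown / actual


# Assumed values, recalled from Tufte's example (not measured by me):
# line lengths in inches, fuel economy standards in miles per gallon.
factor = lie_factor(graphic_first=0.6, graphic_last=5.3,
                    data_first=18.0, data_last=27.5)
print(f"lie factor = {factor:.1f}")  # prints: lie factor = 14.8
```

A value near 1 means the graphic shows the effect at its true size. Here the lines grow by roughly 780 percent while the numbers grow by only about 53 percent.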

Theory of Data Graphics

Having given an abundance of examples, Tufte moves onto the logical question of what makes a good graphic. I’ll quote his principles and then explain them.

“Above all else show the data. Maximize the data-ink ratio. Erase non-data ink. Erase redundant data-ink. Revise and edit.” The first and last sentences are obvious. To understand the middle three, you need to know that data-ink is literally ink used to display data as opposed to such things as the axes, the labels, the title, and the border. For example, data-ink would be the dots and lines connecting them in the redone fuel economy graphic above. Looking more closely at it, you see that there is no border on the plot. In fact, there is no y-axis. This minimalist design is what Tufte means by erasing non-data ink (the border) and redundant data-ink (the y-axis).
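The data-ink ratio itself is just the data-ink divided by the total ink used to print the graphic. As a rough sketch, here is how erasing non-data ink moves that ratio. The ink amounts are hypothetical numbers I made up for illustration (think of them as square millimeters of ink), and the function name is my own.

```python
def data_ink_ratio(data_ink, non_data_ink):
    """Data-ink divided by the total ink in the graphic.

    non_data_ink maps each non-data element (border, grid, ...) to its ink amount.
    """
    total_ink = data_ink + sum(non_data_ink.values())
    return data_ink / total_ink


# Hypothetical ink amounts for a cluttered chart.
cluttered = {"border": 20.0, "grid": 40.0, "labels": 10.0}
before = data_ink_ratio(data_ink=30.0, non_data_ink=cluttered)

# Erase the border and the grid, keep the labels.
cleaned = {"labels": 10.0}
after = data_ink_ratio(data_ink=30.0, non_data_ink=cleaned)

print(f"before: {before:.2f}, after: {after:.2f}")  # prints: before: 0.30, after: 0.75
```

The data did not change at all. Only the share of ink devoted to it did.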

“Forgo chartjunk, including moiré vibration, the grid, and the duck.” Chartjunk is any decoration that does not add to the graphic. It includes the obnoxious patterns in bar charts, the obfuscating grid, and excessive use of color. Particularly troublesome are the patterns, as they often give rise to moiré vibrations. You can simply use different shades of gray instead of the hatching. If you use too much chartjunk and let decoration overwhelm the data, you end up with a duck. A duck is Tufte’s odd term for a graphic that tells you nothing because the decoration has so corrupted the data.

“Maximize data density and the size of the data matrix, within reason. Graphics can be shrunk way down. Use small multiples.” The first principle arises from the earlier idea of maximizing data-ink and erasing non-data ink. Data density is defined as the number of entries in the data matrix divided by the area of the data graphic. This feeds right into the second principle of shrinking graphics down. Using these two principles, you should communicate a large amount of data in a very small space. The last principle seems like the odd man out. Small multiples are like the pages of a flip book. By laying out these pages next to each other, you can see how a process evolves over time. Since these are small, the data density is high. Furthermore, they invite comparison and are inherently multivariate. Thus, they always carry a high data density and should be used if possible.
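The data density definition is simple enough to check with a few lines of Python. This is only a sketch with hypothetical numbers: a data matrix of 12 rows by 5 columns, drawn first at 8 by 6 inches and then shrunk to 4 by 3 inches. The function name is my own.

```python
def data_density(rows, columns, width, height):
    """Entries in the data matrix divided by the area of the data graphic."""
    entries = rows * columns
    area = width * height
    return entries / area


# Hypothetical chart: 12 monthly values for 5 series, dimensions in inches.
full_size = data_density(rows=12, columns=5, width=8.0, height=6.0)
shrunk = data_density(rows=12, columns=5, width=4.0, height=3.0)

print(f"full size: {full_size:.2f} entries per square inch")  # prints 1.25
print(f"shrunk:    {shrunk:.2f} entries per square inch")     # prints 5.00
```

Halving both dimensions quadruples the data density, which is why shrinking a graphic (as long as it stays readable) pays off so quickly.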

“If the nature of the data suggests the shape of the graphic, follow that suggestion. Otherwise, move toward horizontal graphics about 50 percent wider than tall.” This is pure aesthetics and pulls from the whole Golden Ratio idea. This is easily the weakest section of the whole book, but even here Tufte measured many graphics and found that almost all are wider rather than taller. As for the 50%, that is a very rough rule of thumb.

Taken together, these principles will make your graphics tell a dramatic story concisely.

Why Are There So Many Bad Graphics?

As I read through the book, I was surprised to see Tufte find fault with The New York Times, Time, the Journal of the American Statistical Association, and other authoritative publications. These are respectable names, and you would expect them to be getting it right. Tufte thinks that graphic designers lack quantitative reasoning skills. Thus, they cannot do justice to the data. They don’t know what decoration is acceptable and what distorts the data. That is exactly what the New York Times did in the above fuel economy road graphic.

I think Tufte is really onto something here. At first, I thought that maybe the text was outdated here, but then I saw a TED talk by the current Data Artist in Residence at the New York Times, Jer Thorp. He creates graphics that are ducks, that fail at being data-dense, and that probably have a high lie-factor.

The point is, with an abundance of graphics that misrepresent the data whether intentionally or unintentionally, you should always be on your guard. Don’t just judge things on how they look, but read the numbers and if necessary replot the data.

I’ll have another post up in the coming weeks about how we engineers can make use of these ideas as well as more presentation advice from Tufte.