Risk is something many people have a hard time thinking about clearly. Why is that? In his book Risk: The Science and Politics of Fear, subtitled “why we fear the things we shouldn’t–and put ourselves in greater danger”, Dan Gardner surveyed many of the theories that have been used to explain this phenomenon. They range from simple innumeracy, through the influence of the media, to the psychology of the short-cut “heuristics” (rules of thumb) that we all use to make decisions quickly but that can also lead us astray.
In Reckoning With Risk, Gerd Gigerenzer argues that the traditional formulation of probability is particularly unhelpful, making calculations even harder than they should be. Studies have shown that even doctors struggle to handle probabilities correctly when explaining risks associated with illnesses and treatments. Gigerenzer instead proposed expressing risk in terms of “natural frequencies” (e.g. thinking in terms of 8 patients out of 1,000 rather than a 0.8% probability) and tests with general practitioners suggest that this kind of re-framing can be very effective.
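To make the re-framing concrete, here is a minimal Python sketch that restates a probability as a natural frequency. The function name natural_frequency and the default reference population of 1,000 are my own choices for illustration, not anything taken from Gigerenzer’s book.

```python
def natural_frequency(probability, population=1000):
    """Express a probability as 'k out of N' for a reference population of N people.

    Hypothetical helper for illustration; the population size is an assumption.
    """
    if not 0 <= probability <= 1:
        raise ValueError("probability must be between 0 and 1")
    # Round to whole people, since a natural frequency counts individuals.
    cases = round(probability * population)
    return f"{cases} out of {population:,}"


print(natural_frequency(0.008))  # the 0.8% example from the text
print(natural_frequency(0.02))   # a 2% risk
```

For very small risks the reference population needs to be larger, otherwise the count rounds down to zero.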
The latest book on the subject that I have been reading is The Illusion of Certainty: Health Benefits and Risks by Erik Rifkin and Edward Bouwer. Rifkin and Bouwer are particularly critical of the common practice of reporting medical risks in terms of relative rather than absolute frequencies. When news breaks that a new treatment reduces the risk of dying from condition X by 33%, should you be excited? That depends. This could mean that the (absolute) risk of dying from X is currently 15% and the treatment brings this down to 10%. That would be big news. However, if the death rate from X is currently 3 in 10,000 and the treatment brings this down to 2 in 10,000, then the reduction in (relative) risk is still 33% but the news is far less exciting: the absolute risk was already very low, and the treatment saves only 1 additional person in every 10,000.
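The arithmetic behind the two scenarios can be checked with a few lines of Python. This is only a sketch: the function name risk_reduction and the scenario labels are mine, and the figures are the illustrative ones used above rather than data from any study.

```python
def risk_reduction(baseline, treated):
    """Return (absolute, relative) risk reduction for two risks given as fractions."""
    absolute = baseline - treated
    relative = absolute / baseline
    return absolute, relative


# The two illustrative scenarios from the text (labels are hypothetical).
scenarios = [
    ("common condition", 0.15, 0.10),
    ("rare condition", 3 / 10_000, 2 / 10_000),
]

for label, baseline, treated in scenarios:
    absolute, relative = risk_reduction(baseline, treated)
    print(
        f"{label}: relative reduction {relative:.0%}, "
        f"absolute reduction {absolute * 10_000:.0f} in 10,000"
    )
```

The relative reduction is the same in both cases, while the absolute reduction differs by a factor of 500.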
In an effort to facilitate the perception of risk, Rifkin and Bouwer devised an interesting graphical device. They note that it is particularly difficult to conceive and compare small risks, say a few cases in 1,000. In thinking about this problem, they came up with the idea of picturing a theatre with 1,000 seats and representing the cases as occupied seats in that theatre. They call the result a “Risk Characterization Theatre” (RCT). Here is an example to illustrate a 2% risk, or 20 cases in 1,000.
Now data visualization purists would be horrified by this picture. In The Visual Display of Quantitative Information, Edward Tufte argues that the “data-ink ratio” (the share of a graphic’s ink that is devoted to displaying data) should be kept as high as possible, but the RCT uses a lot of ink just to display a single number! Still, I do think that the RCT can be an effective tool and perhaps this can be justified by thinking of it as a way of visualizing numbers rather than data (but maybe that’s a long bow).
Attractive though the theatre layout may be, there is probably no real need for the detail of the aisles, seating sections and labels, so here is a simpler version (again illustrating 20 in 1,000).
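For readers who want to experiment, a simplified layout of this kind can be approximated in plain text. The sketch below is not how the original images were produced: the function name risk_grid, the 50-seat rows, the characters used and the random placement of occupied seats are all assumptions of mine.

```python
import random


def risk_grid(cases, seats=1000, per_row=50, seed=0):
    """Return a text grid with `cases` occupied seats ('#') among `seats` ('.').

    Occupied seats are scattered at random (an assumption); the seed keeps
    the picture reproducible between runs.
    """
    if not 0 <= cases <= seats:
        raise ValueError("cases must be between 0 and seats")
    rng = random.Random(seed)
    occupied = set(rng.sample(range(seats), cases))
    cells = ["#" if i in occupied else "." for i in range(seats)]
    # Break the flat list of seats into rows of `per_row` seats each.
    rows = ["".join(cells[i:i + per_row]) for i in range(0, seats, per_row)]
    return "\n".join(rows)


print(risk_grid(20))  # 20 cases in 1,000
```

Calling risk_grid(20, seats=10_000, per_row=100) gives a larger grid for much smaller risks.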
To illustrate the use of RCTs, I’ll use one of the case studies from Rifkin and Bouwer’s book: smoking. One of the most significant studies of the health effects of smoking tracked the mortality of almost 35,000 British doctors (a mix of smokers and non-smokers). The study commenced in 1951. The first results, published in 1954, indicated a significantly higher incidence of lung cancer among smokers. The study ultimately continued until 2001 and the final results were published in the 2004 paper Mortality in relation to smoking: 50 years’ observations on male British doctors.
The data clearly showed that, on average, smokers died earlier than non-smokers. The chart below would be the traditional way of visualizing this effect*.
While it may be clear from this chart that being a smoker is riskier than being a non-smoker, thinking in terms of percentage survival rates may not be intuitive for everyone. Here is how the same data would be illustrated using RCTs. Appropriately, the black squares indicate a death (and for those who prefer the original layout, there is also a theatre version).
This is a rather striking chart. Particularly looking at the theatres for doctors up to 70 and 80 years old, the higher death rate of smokers is stark. However, the charts also highlight the inefficiency of the RCT. This graphic in fact only shows 8 of the 12 data points on the original charts.
So, the Risk Characterization Theatre is an interesting idea that may be a useful tool for helping to make numbers more concrete, but it is unlikely to be added to the arsenal of the serious data analyst.
As a final twist on the RCT, I have also designed a “Risk Characterization Stadium” which could be used to visualize even lower risks. Here is an illustration of 20 cases in 10,000 (0.2%).
* Note that the figures here differ slightly from those in Rifkin and Bouwer’s book. I have used data for doctors born between 1900 and 1930, whereas they refer to the 1900-1909 data but would in fact appear to have used the 1910-1919 data.