
Hans Rosling: data visualisation guru

It is no secret that I am very interested in data visualisation, and yet I have never mentioned the work of Hans Rosling here on the blog. It is an omission I should finally correct, not least to acknowledge those readers who regularly email me links to Rosling’s videos.

Rosling is a doctor with a particular interest in global health and welfare trends. In an effort to broaden awareness of these trends, he founded the non-profit organisation Gapminder, which is described as:

a modern “museum” on the Internet – promoting sustainable global development and achievement of the United Nations Millennium Development Goals

Gapminder provides a rich repository of statistics from a wide range of sources and it was at Gapminder that Rosling’s famous animated bubble charting tool Trendalyzer was developed. I first saw Trendalyzer in action a number of years ago in a presentation Rosling gave at a TED conference. Rosling continued to update his presentation and there are now seven TED videos available. But the video that Mule readers most often send me is the one below, taken from the BBC documentary “The Joy of Stats”.

If the four minutes of video here have whetted your appetite, the entire hour-long documentary is available on the Gapminder website. You can also take a closer look at Trendalyzer in action at Gapminder World.


Everyone knows hang-gliding is risky. How could throwing yourself off a mountain not be? But then again, driving across town is risky too. In both cases, the risks are in fact very low and assessing and comparing small risks is tricky.

Ronald A. Howard, the pioneer of the field of decision analysis (not the Happy Days star turned director) put it this way:

A problem we continually face in describing risks is how to discuss small probabilities. It appears that many people consider probabilities less than 1 in 100 to be “too small to worry about.” Yet many of life’s serious risks, and medical risks in particular, often fall into this range.

R. A. Howard (1989)

Howard’s solution was to come up with a better scale than percentages for measuring small risks. Shopping for coffee you would not ask for 0.00025 tons (unless you were naturally irritating), you would ask for 250 grams. In the same way, talking about a 1/125,000 or 0.000008 risk of death associated with a hang-gliding flight is rather awkward. With that in mind, Howard coined the term “microprobability” (μp) to refer to an event with a chance of 1 in 1 million, and he calls a 1 in 1 million chance of death a “micromort” (μmt). We can now describe the risk of hang-gliding as 8 micromorts and you would have to drive around 3,000km in a car before accumulating a risk of 8μmt, which helps compare these two remote risks.

Before going too far with micromorts, it is worth getting a sense of just how small the probabilities involved really are. Howard observes that the chance of flipping a coin 20 times and getting 20 heads in a row is around 1μp and the chance of being dealt a royal flush in poker is about 1.5μp. In a post about visualising risk I wrote about “risk characterisation theatres” or, for more remote risks, a “risk characterisation stadium”. The lonely little spot in this stadium of 10,000 seats represents a risk of 100μp.
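These conversions are easy to check for yourself. Here is a quick sketch in Python, using the 1-in-125,000 hang-gliding figure and the coin-flip and poker probabilities from above:

```python
# Rough check of the micromort arithmetic: divide a small probability
# by one micro(probability) to express it in μp / μmt.

MICRO = 1e-6  # one microprobability (a micromort, for risks of death)

hang_gliding = 1 / 125_000          # risk of death per hang-gliding flight
print(round(hang_gliding / MICRO))  # → 8 micromorts

twenty_heads = 0.5 ** 20            # 20 heads in a row
royal_flush = 4 / 2_598_960         # 4 royal flushes among all 5-card hands
print(round(twenty_heads / MICRO, 2))  # → 0.95 μp, i.e. around 1μp
print(round(royal_flush / MICRO, 2))   # → 1.54 μp, i.e. around 1.5μp
```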

One enthusiastic user of the micromort for comparing remote risks is Professor David Spiegelhalter, a British statistician who holds the professorship of the “Public Understanding of Risk” at the University of Cambridge. He recently gave a public lecture on quantifying uncertainty at the London School of Economics*. The chart below provides a micromort comparison adapted from some of the mortality statistics appearing in Spiegelhalter’s lecture. They are UK figures and some would certainly vary from country to country.

Risk Ranking

Based on these figures, a car trip across town comes in at a mere 0.003μmt (or perhaps 3 “nanomorts”) and so is much less risky, if less fun, than a hang-gliding flight.

It is worth noting that assessing the risk of different modes of travel can be controversial. It is important to be very clear whether comparisons are being made based on risk per annum, risk per unit distance or risk per trip. These different approaches will result in very different figures. For example, for most people plane trips are relatively infrequent (which will make annual risks look better), but the distances travelled are much greater (so the per unit distance risk will look much better than the per trip risk).
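To make the point concrete, here is a small sketch with entirely hypothetical numbers (four flights a year of 10,000km each, at an assumed 10μmt per trip), showing how the same risk looks on the three different bases:

```python
# Illustrative only: the figures below are invented to show how per-trip,
# per-km and per-annum comparisons of the same travel risk diverge.

trips_per_year = 4       # hypothetical flights per year
km_per_trip = 10_000     # hypothetical distance per flight
risk_per_trip = 10e-6    # hypothetical 10 micromorts per trip

risk_per_km = risk_per_trip / km_per_trip
risk_per_year = risk_per_trip * trips_per_year

print(f"per trip:  {risk_per_trip / 1e-6:.1f} micromorts")
print(f"per km:    {risk_per_km / 1e-6:.4f} micromorts")
print(f"per annum: {risk_per_year / 1e-6:.1f} micromorts")
```

The per-km figure looks tiny and the per-annum figure four times worse than the per-trip figure, even though all three describe exactly the same travel.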

Here are two final statistics to round out the context for the micromort unit of measurement: the average risk of premature death (i.e. dying of non-natural causes) in a single day for someone living in a developed nation is about 1μmt and the risk for a British soldier serving in Afghanistan for one day is about 33μmt.

*Thanks to Stephen from the SURF group for bringing this lecture to my attention.

Natural frequencies

In my last post, I made a passing reference to Gerd Gigerenzer’s idea of using “natural frequencies” instead of probabilities to make assessing risks a little easier. My brief description of the idea did not really do justice to it, so here I will briefly outline an example from Gigerenzer’s book Reckoning With Risk.

The scenario posed is that you are conducting breast cancer screens using mammograms and you are presented with the following information and question about asymptomatic women between 40 and 50 who participate in the screening:

The probability that one of these women has breast cancer is 0.8%. If a woman has breast cancer, the probability is 90% that she will have a positive mammogram. If a woman does not have breast cancer, the probability is 7% that she will still have a positive mammogram. Imagine a woman who has a positive mammogram. What is the probability that she actually has breast cancer?

For those familiar with probability, this is a classic example of a problem that calls for the application of Bayes’ Theorem. However, for many people—not least doctors—it is not an easy question.
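For the record, here is what the Bayes’ Theorem calculation looks like, plugging in the figures from the question above:

```python
# Bayes' Theorem applied to the mammogram question:
# P(cancer | positive) = P(positive | cancer) P(cancer) / P(positive)

p_cancer = 0.008       # prior: 0.8% of the women screened have breast cancer
p_pos_cancer = 0.90    # probability of a positive mammogram given cancer
p_pos_healthy = 0.07   # probability of a (false) positive given no cancer

p_positive = p_cancer * p_pos_cancer + (1 - p_cancer) * p_pos_healthy
p_cancer_given_pos = p_cancer * p_pos_cancer / p_positive

print(f"{p_cancer_given_pos:.0%}")  # → 9%
```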

Gigerenzer posed exactly this problem to 24 German physicians with an average of 14 years professional experience, including radiologists, gynaecologists and dermatologists. By far the most common answer was that there was a 90% chance she had breast cancer, and the majority put the odds at 50% or more.

In fact, the correct answer is only 9% (rounding to the nearest %). Only two of the doctors came up with the correct answer, although two others were very close. Overall, a “success” rate of less than 20% is quite striking, particularly given that one would expect doctors to be dealing with these sorts of risk assessments on a regular basis.

Gigerenzer’s hypothesis was that an alternative formulation would make the problem more accessible. So, he posed essentially the same question to a different set of 24 physicians (from a similar range of specialties with similar experience) in the following way:

Eight out of every 1,000 women have breast cancer. Of these 8 women with breast cancer, 7 will have a positive mammogram. Of the remaining 992 women who don’t have breast cancer, some 70 will still have a positive mammogram. Imagine a sample of women who have positive mammograms in screening. How many of these women actually have breast cancer?

Gigerenzer refers to this type of formulation as using “natural frequencies” rather than probabilities. Astute observers will note that there are some rounding differences between this question and the original one (e.g. 70 out of 992 false positives is actually a rate of 7.06% not 7%), but the differences are small.

Now a bit of work has already been done here to help you on the way to the right answer. It’s not too hard to see that there will be 77 positive mammograms (7 true positives plus 70 false positives) and of these only 7 actually have breast cancer. So, the chances of someone in this sample of positive screens actually having cancer is 7/77 = 9% (rounding to the nearest %).

Needless to say, far more of the doctors who were given this formulation got the right answer. There were still some errors, but this time only 5 of the 24 picked a number over 50% (what were they thinking?).

The lesson is that probability is a powerful but confusing tool and it pays to think carefully about how to frame statements about risk if you want people to draw accurate conclusions.

Visualizing smoking risk

Risk is something many people have a hard time thinking about clearly. Why is that? In his book Risk: The Science and Politics of Fear, subtitled “why we fear the things we shouldn’t–and put ourselves in greater danger”, Dan Gardner surveyed many of the theories that have been used to explain this phenomenon. They range from simple innumeracy to the influence of the media to the psychology of the short-cut “heuristics” (rules of thumb) that we all use to make decisions quickly but that can also lead us astray.

In Reckoning With Risk, Gerd Gigerenzer argues that the traditional formulation of probability is particularly unhelpful, making calculations even harder than they should be. Studies have shown that even doctors struggle to handle probabilities correctly when explaining risks associated with illnesses and treatments. Gigerenzer instead proposed expressing risk in terms of “natural frequencies” (e.g. thinking in terms of 8 patients out of 1,000 rather than a 0.8% probability) and tests with general practitioners suggest that this kind of re-framing can be very effective.

The latest book on the subject that I have been reading is The Illusion of Certainty: Health Benefits and Risks by Erik Rifkin and Edward Bouwer. Rifkin and Bouwer are particularly critical of the common practice of reporting medical risks in terms of relative rather than absolute frequencies. When news breaks that a new treatment reduces the risk of dying from condition X by 33%, should you be excited? That depends. This could mean that (absolute) risk of dying from X is currently 15% and the treatment brings this down to 10%. That would be big news. However, if the death rate from X is currently 3 in 10,000 and the treatment brings this down to 2 in 10,000 then the reduction in (relative) risk is still 33% but the news is far less exciting because the absolute risk of 3 in 10,000 is so much lower.
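The two scenarios are easy to compare side by side. This sketch computes both the relative and absolute risk reductions for the figures above, and shows how the identical relative figure hides a 500-fold difference in absolute terms:

```python
# Relative vs absolute risk reduction for the two scenarios above:
# 15% -> 10% and 3 in 10,000 -> 2 in 10,000.

def risk_reduction(before, after):
    absolute = before - after          # absolute risk reduction
    relative = absolute / before       # relative risk reduction
    return absolute, relative

for before, after in [(0.15, 0.10), (3 / 10_000, 2 / 10_000)]:
    absolute, relative = risk_reduction(before, after)
    print(f"relative: {relative:.0%}, absolute: {absolute:.4%}")
```

Both cases print a 33% relative reduction, but the absolute reductions are 5% and 0.01% respectively.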

In an effort to facilitate the perception of risk, Rifkin and Bouwer devised an interesting graphical device. They note that it is particularly difficult to conceive and compare small risks, say a few cases in 1,000. In thinking about this problem, they came up with the idea of picturing a theatre with 1,000 seats and representing the cases as occupied seats in that theatre. They call the result a “Risk Characterization Theatre” (RCT). Here is an example to illustrate a 2% risk, or 20 cases in 1,000.

Risk Characterization Theatre

Now data visualization purists would be horrified by this picture. In The Visual Display of Quantitative Information, Edward Tufte argues that a chart’s “data-ink ratio” — the share of its ink actually devoted to displaying data — should be kept as high as possible, but the RCT uses a lot of ink just to display a single number! Still, I do think that the RCT can be an effective tool and perhaps this can be justified by thinking of it as a way of visualizing numbers rather than data (but maybe that’s a long bow).

Attractive though the theatre layout may be, there is probably no real need for the detail of the aisles, seating sections and labels, so here is a simpler version (again illustrating 20 in 1,000).
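For anyone who wants to experiment, a bare-bones version of this simplified theatre takes only a few lines. This sketch (my own, not Rifkin and Bouwer’s) renders 1,000 seats as a 25×40 text grid with the occupied seats scattered at random:

```python
# A text-only "risk characterization theatre": 1,000 seats in a grid,
# with `cases` of them filled at random positions.

import random

def risk_theatre(cases, seats=1000, per_row=40, seed=1):
    filled = set(random.Random(seed).sample(range(seats), cases))
    rows = []
    for start in range(0, seats, per_row):
        rows.append("".join(
            "#" if seat in filled else "." for seat in range(start, start + per_row)
        ))
    return "\n".join(rows)

print(risk_theatre(20))  # 20 occupied seats among 1,000, i.e. a 2% risk
```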

Simple Risk Characterization Theatre

To illustrate the use of RCTs, I’ll use one of the case studies from Rifkin and Bouwer’s book: smoking. One of the most significant studies of the health effects of smoking tracked the mortality of almost 35,000 British doctors (a mix of smokers and non-smokers). The study commenced in 1951 and the first results were published in 1954 and indicated a significantly higher incidence of lung cancer among smokers. The study ultimately continued until 2001 and the final results were published in the 2004 paper Mortality in relation to smoking: 50 years’ observations on male British doctors.

The data clearly showed that, on average, smokers died earlier than non-smokers. The chart below would be the traditional way of visualizing this effect*.

Smoking Survival Rates: survival of doctors born between 1900 and 1930

While it may be clear from this chart that being a smoker is riskier than being a non-smoker, thinking in terms of percentage survival rates may not be intuitive for everyone. Here is how the same data would be illustrated using RCTs. Appropriately, the black squares indicate a death (and for those who prefer the original layout, there is also a theatre version).

Smoking RCTs: mortality of doctors born between 1900 and 1930

This is a rather striking chart. Particularly looking at the theatres for doctors up to 70 and 80 years old, the higher death rate of smokers is stark. However, the charts also highlight the inefficiency of the RCT. This graphic in fact only shows 8 of the 12 data points on the original charts.

So, the Risk Characterization Theatre is an interesting idea that may be a useful tool for helping to make numbers more concrete, but it is unlikely to be added to the arsenal of the serious data analyst.

As a final twist on the RCT, I have also designed a “Risk Characterization Stadium” which could be used to visualize even lower risks. Here is an illustration of 20 cases in 10,000 (0.2%).

Risk Characterization Stadium

* Note that the figures here differ slightly from those in Rifkin and Bouwer’s book. I have used data for doctors born between 1900 and 1930, whereas they refer to the 1900-1909 data but would in fact appear to have used the 1910-1919 data.

Fertility Declines Don’t Reverse with Development

In this follow-up guest post on The Stubborn Mule, Mark Lauer takes a closer look at the relationship between national development and fertility rates.

STOP PRESS: Switzerland’s population would be decimated in just two generations if it weren’t for advances in their development.

At least, that’s what the modelling in a recent Nature paper projects.  The paper, widely reported in The New York Times, The Washington Post and The Economist, amongst others, was the subject of my recent Stubborn Mule guest post.  In that post, I shared an animated chart and some statistical arguments that cast doubt on the paper’s conclusion.  In this post, I’ll take a firmer stance: the conclusion is plain wrong.  But to understand why, we’ll have to delve a little deeper into their model.  Still, I’ll try to keep things as non-technical as possible.

First, let’s recap the evidence presented in the paper.  It comprised three parts: a snapshot chart (republished in most of the reportage), a trajectory chart, and the results of an econometric model.  As argued in my earlier post, the snapshot is misleading for several reasons, not least the distorted scales.  And the trajectory chart suffers from a serious statistical bias, also explained in my earlier post.  I’ll reproduce here my chart showing the same information without the bias.


That leaves the econometric model.  From reading the paper, where details of the model are sketchy, I had wrongly inferred that the model suffered the same statistical bias as the trajectory chart.  I have since looked at the supplementary information for the paper, and at the SAS code used to run the model.  From these, it is clear that a fixed HDI threshold of 0.86 is used to define when a country’s fertility should begin to increase.  So there’s no statistical bias.  However, I discovered far more serious problems.


Is There a Baby Bounce?

In this first ever guest post on The Stubborn Mule, Mark Lauer takes a careful look at the relationship between national development and fertility rates.

Recently The Economist and the Washington Post reported a research paper in Nature on the relationship between development and fertility across a large number of countries.  The main conclusion of the paper is that, once countries get beyond a certain level of development, their fertility rates cease to fall and begin to rise again dramatically.  In this post I’ll show an animated view of the data that casts serious doubt on this conclusion, and explain where I believe the researchers went wrong.

But first, let’s review the data.  The World Bank publishes the World Development Indicators Online, which includes time series by country of the Total Fertility Rate (TFR).  This statistic is an estimate of the number of children each woman would be expected to have if she bore them according to current national age-specific fertility rates throughout her lifetime.  In 2005, Australia’s TFR was 1.77, while Niger’s was 7.67 and the Ukraine’s only 1.2.

The Human Development Index (HDI) is defined by the UN as a measure of development, and combines life expectancy, literacy, school enrolments and GDP.  Using these statistics, again from the World Bank database, the paper’s authors construct annual time series of HDI by country from 1975 until 2005.  For example, in 2005, Australia’s HDI was 0.966, the highest amongst all 143 countries in the data set.  Ukraine’s HDI was 0.786, while poor old Niger’s was just 0.3.

A figure from the paper was reproduced by The Economist; it shows two snapshots of the relationship between HDI and TFR, one from 1975 and one from 2005.  Both show the well-known fact that as development increases, fertility generally falls.  However, the 2005 picture appears to show that countries with an HDI above a certain threshold become more fertile again as they develop further.  A fitted curve on the chart suggests that TFR rises from 1.5 to 2.0 as HDI goes from 0.92 to 0.98.

Of course, this is only a snapshot.  If there really is a consistent positive influence of advanced development on fertility, then we ought to see it in the trajectories through time for individual countries. So to explore this, I’ve used a Mathematica notebook to generate an animated bubble chart.  The full source code is on GitHub, including a PDF version for anyone without Mathematica but still curious.  After downloading the data directly from Nature’s website, the program plots one bubble per country, with area proportional to the country’s current population.

Unlike the figure in The Economist, here it is difficult to see any turn upwards in fertility rates at high development levels.  In fact, the entire shape of the figure looks different.  This is because the figure in The Economist uses axes that over-emphasise changes in the lower right corner.  It uses a logarithmic scale for TFR and a reflected logarithmic scale for HDI (actually the negative of the logarithm of 1.0 minus the HDI).  These rather strange choices aren’t mentioned in the paper, so you’ll have to look closely at their tick labels to notice this.
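You can see the stretching effect of the reflected logarithmic scale with a quick calculation: equal steps in HDI occupy more and more axis space as HDI approaches 1.0.

```python
# The reflected log transform used for the HDI axis: -log(1 - HDI).
# Three equal 0.06-wide HDI intervals get progressively wider on the axis.

import math

def reflected_log(hdi):
    return -math.log(1.0 - hdi)

for lo, hi in [(0.80, 0.86), (0.86, 0.92), (0.92, 0.98)]:
    width = reflected_log(hi) - reflected_log(lo)
    print(f"HDI {lo}-{hi}: axis width {width:.2f}")
```

The rightmost interval (0.92 to 0.98) ends up nearly four times as wide as the leftmost one, which is exactly the over-emphasis of the lower right corner described above.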

To help focus on the critical region, I’ve also zoomed in on the bottom right hand corner in the following version of the bubble chart.

One interesting feature of these charts is that one large Asian country, namely Russia, and a collection of smaller European countries, dart leftwards during the period 1989 to 1997.  The smaller countries are all eastern European ones, like Romania, Bulgaria and the Ukraine (within Mathematica you can hover over the bubbles to find this out, and even pause, forward or rewind the animation).  In the former Soviet Union and its satellites, the transition from communism to capitalism brought a crushing combination of higher mortality and lower fertility.  In Russia, this continues today.  One side effect of this is to create a cluster of low fertility countries near the threshold HDI of 0.86 in the 2005 snapshot.  This enhances the impression in the snapshot that fertility switches direction beyond this development level.

But the paper’s conclusion isn’t just based on these snapshots.  The authors fit a sophisticated econometric model to the time series of all 37 countries that reached an HDI of 0.85, a model that is even supposed to account for time fixed-effects (changes in TFR due only to the passage of time).  They find that the threshold at which fertility reverses is 0.86, and that beyond this

an HDI increase of 0.05 results in an increase of the TFR by 0.204.

This means that countries which develop from an HDI of 0.92 to 0.98 should see an increase in TFR of about 0.25.  This is only about half as steep as the curve in their snapshot figure, but is still a significant rate of increase.

However, even this rate is rather surprising.  Amongst all 37 countries, only two exhibit such a steep rise in fertility relative to development between the year they first reach an HDI of 0.86 and 2005, and one of these only barely.  The latter country is the United States, which manages to raise TFR by 0.211 per 0.05 increase in HDI.  The other is the Czech Republic, which only reaches an HDI of 0.86 in 2001, and so only covers four years.  Here is a plot of the trajectories of all countries that reached an HDI of 0.86, beginning in the first year they did this.  Most of them actually show decreases in TFR.


So how do the authors of the paper manage a statistically significant result (at the 0.1% level) that is so widely different from the data?  The answer could well lie in their choice of the reference year, the year in which they consider each country to have passed the threshold.  Rather than using a fixed threshold as I’ve done above, they express TFR

relative to the lowest TFR that was observed while a country’s HDI was within the window of 0.85–0.9.  The reference year is the first year in which this lowest TFR is observed.

In other words, their definition of when a country reaches the threshold depends on its path of TFR values.  In particular, they choose the year when TFR is at its lowest.

Does this choice statistically bias the subsequent trajectories of TFR upwards?  I leave this question as a simple statistical exercise for the reader, but I will mention that the window of 0.85–0.9 is wider than it looks.  Amongst countries that reached an HDI of 0.9, the average time taken to pass through that window is almost 15 years, while the entire data set only covers 30 years.

Finally I’d like to thank Sean for offering this space for my meandering thoughts.  I hope you enjoy the charts.  And remember, don’t believe everything you see in The Economist.


To show that the statistical bias identified above is substantial, I’ve programmed a quick simulation to measure it.  The simulation makes some assumptions about distributions, and estimates parameters from the original data.  As such it gives only a rough indication of the size of the bias – there are many alternative possibilities, which would lead to larger or smaller biases, especially within a more complex econometric estimation.

In the simulation, each of the advanced countries begins exactly where it was in the year that it first reached an HDI of 0.85.  Thereafter, a trajectory is randomly generated for each country, with zero mean for changes in fertility.  That is, in the simulation, fertility does not increase on average at all¹.  As in the paper, a threshold is found for each country based on the year with lowest TFR within the HDI window.  All shifts in TFR thereafter are used to measure the impact of HDI on TFR (which is actually non-existent).
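The selection effect at the heart of this simulation can be demonstrated in miniature. The sketch below is not the simulation described above (which estimates its parameters from the original data), just a stripped-down illustration: each country’s TFR follows a driftless random walk, yet measuring changes from the year of lowest TFR guarantees an apparent upward trend.

```python
# Minimal illustration of the reference-year bias: TFR follows a random
# walk with zero drift, but anchoring at the year of minimum TFR makes
# all subsequent changes non-negative on average.

import random

def simulate(years=20, vol=0.05, n_countries=5000, seed=42):
    rng = random.Random(seed)
    total_shift, count = 0.0, 0
    for _ in range(n_countries):
        tfr = [1.8]                              # arbitrary starting TFR
        for _ in range(years):
            tfr.append(tfr[-1] + rng.gauss(0.0, vol))  # zero-drift walk
        ref = min(range(len(tfr)), key=tfr.__getitem__)  # year of lowest TFR
        for t in range(ref + 1, len(tfr)):
            total_shift += tfr[t] - tfr[ref]
            count += 1
    return total_shift / count  # average post-reference change in TFR

print(simulate())  # positive, despite fertility having zero drift
```

Since the reference year is the global minimum of each path, every subsequent value sits at or above it, so the measured “increase” is positive by construction.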

Here is a sample of the trajectories so generated, along with the fitted response from the paper.


The resulting simulations find, on average, that a 0.06 increase in HDI leads to an increase of about 0.075 in TFR, despite the fact that there is no connection whatsoever.  The range of results is quite broad, with an increase of 0.12 in TFR also being a likely outcome.  This is half of the value found in the paper; in other words, simulations of a simplified case where HDI does not influence TFR at all, can easily generate half of the paper’s result.

Of course, if the result is not due to statistical bias, then the authors can easily prove this.  They need only rerun their analysis using a fixed HDI threshold, rather than one that depends on the path of TFR.  Until they do, their conclusion will remain dubious.

¹ For the technically minded, the HDI follows a random walk with drift and volatility matching those of advanced countries, and the TFR follows an uncorrelated random walk with volatility matching the advanced countries, but with zero drift.  The full source code and results have been uploaded to the GitHub repository.


More details can be found in the follow-up post to this one, Fertility Declines Don’t Reverse with Development.

Swine Flu on Swivel

I have now uploaded the swine flu data to a Swivel data set. I will update this data set periodically and so the rankings in the chart below should stay reasonably up to date.
Cases per Million Population by Country
Data sources: Guardian Data Blog, CIA World Fact Book.

UPDATE: A number of people have told me that in a number of places, including Victoria and much of the US, testing for swine flu has ceased. This means that the “lab confirmed” swine flu count will become increasingly meaningless over time, so I have decided to stop updating this data.

Couch Potatoes

A colleague has lent me a copy of Oliver James’ book “Affluenza” and, while I am not far through it yet, it is scathing in its damnation of the effects of capitalism on individuals in society. At a time when capitalism is rapidly losing its shine on a global scale, with the financial sector collapsing around us, this individual perspective is an interesting small scale counterpoint to the large scale picture we are seeing on the news each day.

The thesis of the book is that an “affluenza virus” has spread throughout English-speaking countries. This virus leads us to be obsessively focused on shallow material pursuits. At the same time, it leaves us anxious and prone to low self-esteem, addictions and depression as there is always someone with a faster car or a bigger cigar (to quote The Beautiful South).
