Knowing Simpson’s Paradox Can Help You Spot Data Misrepresentation

Posted on 31 December 2021

In late November of 2021, a rather perplexing graph began to make the rounds on social media. Claiming to be based on data from the UK’s Office for National Statistics (ONS), the graph appeared to show that, for individuals aged 10 to 59 in England, more COVID-vaccinated people had died (from any cause) than unvaccinated people during the previous 6 months. This seems to contradict all of the scientific evidence concerning vaccine effectiveness and safety, and so you might be forgiven for assuming that the data was simply made up. However, there was actually nothing wrong with the data itself, which did indeed come from the ONS. Yet that very same ONS data also showed that, for a given age, a vaccinated person was significantly less likely to die from any cause than an unvaccinated person. This apparent contradiction is an example of a statistical phenomenon called Simpson’s paradox.

What Is Simpson’s Paradox?

Simpson’s paradox occurs when a statistical relationship (such as between vaccination and all-cause mortality) disappears or even reverses when the data is subdivided into smaller categories. To understand how this can happen, let’s take another real-world example: a study that looked at the success rates of open surgery (treatment A) vs closed surgery (treatment B) for treating kidney stones. Here’s a table summarising the success rates:

Success rates (with actual patient numbers shown in brackets) for open surgery (treatment A) vs closed surgery (treatment B) for small kidney stones, large kidney stones, and pooled success rates.

You can see that in the case of both small kidney stones and large kidney stones, treatment A (open surgery) was more likely to be successful. Yet when all patients are considered together irrespective of kidney stone size, open surgery was successful in 78% of cases, while closed surgery was successful in 83% of cases – the relationship is reversed. How is this possible? If you pay attention to the patient numbers (shown in brackets), you may be able to guess. When doctors received a patient with small kidney stones, they were much more likely to choose the less invasive but less effective treatment for the easier-to-treat condition. This meant that open surgery was used mainly on large kidney stones, which are inherently harder to treat, making open surgery appear to be less effective overall. This is why, when comparing two groups, scientists need to ensure that those groups are as identical as possible apart from the intervention being studied.
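For readers who want to check the arithmetic, here is a minimal Python sketch of the reversal. The patient counts are an assumption: they are the figures usually quoted for this kidney stone study, not numbers stated in the text of this post, and the names `counts`, `success_rate` and `pooled_rate` are made up for the illustration.

```python
# (successes, patients) per treatment and stone size.
# ASSUMPTION: these are the counts usually quoted for this study; they are
# not given in the text of the post.
counts = {
    "A (open surgery)": {"small": (81, 87), "large": (192, 263)},
    "B (closed surgery)": {"small": (234, 270), "large": (55, 80)},
}


def success_rate(successes, patients):
    """Fraction of patients whose treatment succeeded."""
    return successes / patients


def pooled_rate(groups):
    """Success rate over all stone sizes, ignoring which group a patient was in."""
    total_successes = sum(successes for successes, _ in groups.values())
    total_patients = sum(patients for _, patients in groups.values())
    return success_rate(total_successes, total_patients)


for treatment, groups in counts.items():
    for size, (successes, patients) in groups.items():
        rate = success_rate(successes, patients)
        print(f"{treatment}, {size} stones: {rate:.0%} ({successes}/{patients})")
    print(f"{treatment}, pooled: {pooled_rate(groups):.0%}")
```

With these counts, treatment A comes out ahead within each stone size, yet the pooled rates are the 78% and 83% quoted above. The pooled rate is a weighted average of the two per-size rates, and under these counts most of treatment A's patients had large stones while most of treatment B's had small ones.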

Back To Vaccination

So, what happened in the case of the vaccine data? The problem lies with the size of the age band. Ages 10 to 59 make up a very wide range to include in a single group, and this introduces a huge confounding factor in the form of age itself. As you will probably be aware if you follow this site, age is the single biggest risk factor for death from any cause. A 59-year-old is far more likely to die in a given period of time than a 10-year-old, simply by virtue of their age. A 59-year-old is also far more likely to be vaccinated against COVID-19 than a 10-year-old – indeed, at the time this data was being circulated, over half of all unvaccinated 10-59 year-olds were under the age of 25, while over half of the vaccinated 10-59 year-olds were over the age of 40. It’s therefore not that surprising that the vaccinated members of this group were more likely to die – not because vaccines are unsafe, but because these people are older on average.
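The same mechanism can be sketched in a few lines of Python. Every number below is hypothetical and chosen only to make the effect easy to see; none of it is ONS data, and the two bands "younger" and "older" are simply stand-ins for the skew in ages described above.

```python
# Entirely HYPOTHETICAL (deaths, people) figures, for illustration only.
# Most unvaccinated people are in the younger band, most vaccinated in the older.
bands = {
    "younger": {"unvaccinated": (80, 800_000), "vaccinated": (10, 200_000)},
    "older": {"unvaccinated": (400, 200_000), "vaccinated": (800, 800_000)},
}


def rate_per_100k(deaths, people):
    """Deaths per 100,000 people."""
    return 100_000 * deaths / people


def pooled_rate_per_100k(status):
    """Death rate for one vaccination status with both age bands lumped together."""
    deaths = sum(bands[band][status][0] for band in bands)
    people = sum(bands[band][status][1] for band in bands)
    return rate_per_100k(deaths, people)


for band, groups in bands.items():
    for status, (deaths, people) in groups.items():
        print(f"{band}, {status}: {rate_per_100k(deaths, people):.1f} per 100k")

for status in ("unvaccinated", "vaccinated"):
    print(f"all ages, {status}: {pooled_rate_per_100k(status):.1f} per 100k")
```

In this made-up population the vaccinated death rate is half the unvaccinated rate within each band, but the pooled vaccinated rate is higher, purely because the vaccinated group is mostly older people.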

The other age groups presented in the ONS data (60-69, 70-79 and 80+) were all much narrower, and all showed a significant reduction in mortality for the vaccinated. Sure enough, if the data for individuals aged 10-59 is subdivided into smaller age groups, or if the data is adjusted for age, we see that vaccination does generally reduce risk of death below the age of 60 as well.
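One common way to "adjust for age" is direct standardisation: compute the death rate within each age band, then average those rates using the same age mix for both groups. The Python sketch below illustrates the idea with hypothetical numbers; it is not ONS data and not necessarily the exact method the ONS uses, and `standard_population` is an assumed reference age mix.

```python
# Entirely HYPOTHETICAL (deaths, people) figures, for illustration only.
bands = {
    "younger": {"unvaccinated": (80, 800_000), "vaccinated": (10, 200_000)},
    "older": {"unvaccinated": (400, 200_000), "vaccinated": (800, 800_000)},
}

# ASSUMPTION: a reference population with equal numbers in each age band.
standard_population = {"younger": 1_000_000, "older": 1_000_000}


def age_standardised_rate(status, standard_population):
    """Deaths per 100k if this group had the standard population's age mix."""
    total = sum(standard_population.values())
    rate = 0.0
    for band, weight in standard_population.items():
        deaths, people = bands[band][status]
        # Weight each band's own death rate by that band's share of the standard.
        rate += (weight / total) * 100_000 * deaths / people
    return rate


for status in ("unvaccinated", "vaccinated"):
    rate = age_standardised_rate(status, standard_population)
    print(f"{status}: {rate:.1f} per 100k (age-standardised)")
```

Because both groups are weighted by the same age mix, the difference in their ages no longer drives the comparison, and in this made-up example the vaccinated rate comes out lower.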

The moral of the story is that it’s very easy to misrepresent perfectly legitimate data in order to argue the opposite of what the data is actually telling us. Luckily, you don’t have to be a statistician in order to train yourself to spot this kind of data manipulation. If you want to do just that (and learn about other kinds of dodgy statistics), then check out the More or Less podcast, in which this story was covered.


Simpson’s Paradox: How to make vaccinated death figures misleading:


Copyright © Gowing Life Limited, 2022 • All rights reserved • Registered in England & Wales No. 11774353 • Registered office: Ivy Business Centre, Crown Street, Manchester, M35 9BG.