Posted on 19 February 2021
If you are reading this article, then congratulations – you have been entered for a free COVID-19 test. Here is your result: you do not have COVID-19. This test is over 99% accurate. It also fails to diagnose 100% of COVID-19 cases. It is, all things considered, a pretty rubbish test.
Accuracy, by itself, is not always a particularly useful measurement when it comes to medical tests. Accuracy is simply the rate at which a test correctly identifies whether or not an individual has the condition being tested for. However, as the above example shows, a test that always returns a negative result can still easily be right most of the time if the test is applied indiscriminately, but remains completely useless for the purpose of actually detecting disease.
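To make this concrete, here is a minimal Python sketch of a 'test' that always says negative. The population of 1000 people containing 5 cases is an assumption chosen purely for illustration, not a real-world figure.

```python
def always_negative_test(person_has_disease: bool) -> bool:
    """A 'test' that ignores the person entirely and always reports negative."""
    return False

# Hypothetical population: 1000 people, 5 of whom have the disease.
population = [True] * 5 + [False] * 995

results = [always_negative_test(person) for person in population]

# Accuracy: the share of results that match the truth.
correct = sum(result == truth for result, truth in zip(results, population))
accuracy = correct / len(population)

# Detection rate: the share of real cases that the test flagged as positive.
cases = sum(population)
detected = sum(result and truth for result, truth in zip(results, population))
detection_rate = detected / cases

print(f"Accuracy: {accuracy:.1%}")              # Accuracy: 99.5%
print(f"Cases detected: {detection_rate:.1%}")  # Cases detected: 0.0%
```

The rarer the disease in this made-up population, the closer the accuracy creeps to 100%, while the number of cases detected stays at zero.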
To better describe the qualities of a test, scientists break things down into two distinct measurements: sensitivity and specificity.
Sensitivity is the ability of a test to correctly detect cases of the disease in question. Suppose that a test is 99% sensitive, and we test 1000 people of whom 100 actually have the disease. Amongst those diseased individuals, there will be on average 99 positive results and 1 false negative result – we will miss 1 case. The higher the sensitivity, the less chance of a false negative, and the more confident a person can be that they are disease-free when they test negative. But how will the test perform in those 900 people who didn’t have the disease to begin with? To know this, we need to know the specificity of the test.
Specificity is the ability of a test to correctly identify when an individual does not have the disease. If a test is 99% specific, and we test 1000 people of whom 900 do not have the disease, then amongst these individuals, there will be on average 9 false positives. The higher the specificity, the less chance of a false positive, and the more confident a person can be that they do truly have the disease when they test positive.
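The arithmetic in the two examples above can be sketched in a few lines of Python. The function name `expected_results` is made up for this illustration, and exact fractions are used rather than decimals so that the averages come out as whole numbers.

```python
from fractions import Fraction

def expected_results(n_diseased, n_healthy, sensitivity, specificity):
    """Return the average number of each kind of result we would expect."""
    true_positives = n_diseased * sensitivity
    false_negatives = n_diseased - true_positives
    true_negatives = n_healthy * specificity
    false_positives = n_healthy - true_negatives
    return {
        "true positives": true_positives,
        "false negatives": false_negatives,
        "true negatives": true_negatives,
        "false positives": false_positives,
    }

# 1000 people tested: 100 with the disease, 900 without.
counts = expected_results(
    n_diseased=100,
    n_healthy=900,
    sensitivity=Fraction(99, 100),
    specificity=Fraction(99, 100),
)

for name, value in counts.items():
    print(f"{name}: {value}")
# true positives: 99
# false negatives: 1
# true negatives: 891
# false positives: 9
```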
What counts as ‘good’ specificity or sensitivity depends on how common the disease is and how the test is being used. Suppose we were to use the test described above to screen the population for disease X, a disease that occurs in only one in a million individuals. For this purpose, our 99% sensitivity seems pretty good – we’ll only miss 1 in 100 cases. However, we have to screen about 1 million people in order to find a single case. Even with a specificity of 99%, screening those million people will still produce about 10 000 false positives, meaning that only around 1 in 10 000 people with positive results actually have disease X – our test’s specificity is much too low.
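As a rough check of the disease X numbers, the sketch below works through the screening of one million people in Python. The variable names are illustrative, and the figures are averages: in practice the single real case is either caught or missed.

```python
from fractions import Fraction

sensitivity = Fraction(99, 100)
specificity = Fraction(99, 100)

n_screened = 1_000_000
n_diseased = 1                       # disease X: one in a million
n_healthy = n_screened - n_diseased

true_positives = n_diseased * sensitivity         # 0.99 on average
false_positives = n_healthy * (1 - specificity)   # just under 10 000

# Share of all positive results that belong to someone with disease X.
share_true = true_positives / (true_positives + false_positives)

print(f"Expected false positives: {float(false_positives):,.2f}")
# Expected false positives: 9,999.99
print(f"About 1 in {round(1 / share_true):,} positive results is a real case")
# About 1 in 10,102 positive results is a real case
```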
Luckily, there is a way of making our test more useful. Instead of screening everyone for disease X, we can test only the people who have disease X’s very well-defined symptoms and risk factors. Suddenly, the proportion of the people being tested who actually have disease X is much higher – perhaps 9 in 10 people we test have disease X. Now our 99% specificity is more than adequate – if we tested 1000 people, we’d pick up 891 out of 900 cases for only one false positive. Conversely, our 99% sensitivity has become slightly less impressive. Since the people we’re testing are very likely to have the disease, a given negative test has more chance of being wrong than when we were screening the general population: of the 108 negative results we would expect, 9 would be missed cases.
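The same sums for the targeted-testing scenario can be sketched as follows, again assuming 1000 people tested with 9 in 10 of them having disease X. The variable names are illustrative only.

```python
from fractions import Fraction

sensitivity = Fraction(99, 100)
specificity = Fraction(99, 100)

# 1000 people tested, 9 in 10 of whom have disease X.
n_diseased, n_healthy = 900, 100

true_positives = n_diseased * sensitivity       # 891
false_negatives = n_diseased - true_positives   # 9
true_negatives = n_healthy * specificity        # 99
false_positives = n_healthy - true_negatives    # 1

# How often is each kind of result right?
positives_correct = true_positives / (true_positives + false_positives)
negatives_correct = true_negatives / (true_negatives + false_negatives)

print(f"Positive results that are correct: {float(positives_correct):.1%}")
# Positive results that are correct: 99.9%
print(f"Negative results that are correct: {float(negatives_correct):.1%}")
# Negative results that are correct: 91.7%
```

So in this group a positive result is almost always right, while roughly 1 in 12 negative results is wrong.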
If you think this is beginning to sound confusing, you would be correct. Thankfully, we have some easier-to-understand metrics for expressing the qualities of a test. These are known as positive and negative predictive values (PPV and NPV).
When we receive the results of a medical test, what most of us would like to know is “what are the chances that my result is actually correct?” Positive and negative predictive values combine all of the factors that we discussed above (sensitivity, specificity, and the frequency of disease within the tested population) in order to provide an answer to this question. For example, if a test has a PPV of 90%, then a positive test has a 90% chance of being correct – at least, under the specific circumstances in which the test is being used.
It is important to be aware that these values are calculated using the frequency of the disease in the tested population. A COVID-19 PCR test could have a 90% PPV in the UK, but that same test might have a much lower PPV in another country where COVID-19 was less prevalent. Likewise, if the test were given more widely to asymptomatic individuals, the PPV would drop.
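To see how strongly the predictive values depend on how common the disease is, here is a small Python sketch. The function `predictive_values` and the 99% sensitivity and specificity figures are assumptions carried over from the earlier examples, not the measured performance of any real COVID-19 test.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test used where `prevalence` of those tested are ill.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)   # chance a positive result is correct
    npv = true_neg / (true_neg + false_neg)   # chance a negative result is correct
    return ppv, npv

# Same hypothetical test, used in populations where the disease is
# progressively less common.
for prevalence in (0.5, 0.1, 0.01, 0.001):
    ppv, npv = predictive_values(0.99, 0.99, prevalence)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}, NPV {npv:.3%}")
```

With these assumed figures, the PPV falls to about 50% once only 1 in 100 of the people tested has the disease, even though the test itself has not changed.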
Sensitive tests are good at detecting cases of a disease, while specific tests are good at ‘ignoring’ healthy people. The less common a disease is within the tested population, the more important specificity becomes compared with sensitivity. For this reason, a test that is appropriate for use in individuals with symptoms or risk factors for a disease may be wholly unsuitable for screening healthy individuals.
To understand how good a test really is, it may be more helpful and intuitive to look at positive and negative predictive values, which represent how likely a given test result is to be correct.
Copyright © Gowing Life Limited, 2023 • All rights reserved • Registered in England & Wales No. 11774353 • Registered office: Ivy Business Centre, Crown Street, Manchester, M35 9BG.