I guess I’ve rediscovered a form of Bayes’ Theorem while thinking about the problem posed by the high numbers of false negatives and false positives when testing for the feared coronavirus. What I found is that it doesn’t really even matter whether our tests are super-accurate or not. The solution is to assume that all those who test negative really are negative, and then to give a second test to all those who tested positive the first time. Out of this group, a larger fraction will test positive. You can again forget about those who test negative. Then re-test the positives again, and, if you like, yet again. Each round you are testing fewer people, and by the end of the process you can be over 99% certain that those who still test positive really have been exposed.

Let me show you why.

Have no fear, what I’m gonna do is just spreadsheets. No fancy math, just percents. And it won’t really matter what the starting assumptions are! The results converge to almost perfect accuracy, if repeated!

To start my explanation, let’s assume that 3% of a population (say, of the US) has antibodies to CV19, which means that they have definitely been exposed. How they got exposed is not important for this discussion. Whether they felt anything from their exposure or not is not important either. Whether they got sick and died or recovered is not going to be covered here. I will also assume that this test has a 7% false positive rate and a 10% false negative rate, and that we give tests AT RANDOM to a hundred thousand people (not to people who we already think are sick!). I’m also assuming that once you have the antibodies, you keep them for the duration.

This table represents that situation:

If you do the simple arithmetic, using those assumptions, then of the 100,000 people we tested, 3%, or three thousand, actually do have those antibodies, but 97%, or ninety-seven thousand, do not (white boxes, first column with data in it).

Of the 3,000 folks who really do have the antibodies – first line of data – we have a false negative rate of 10%, so three hundred of these poor folks are given the **false** good tidings that they have never been exposed (that’s the upper orange box). The other 90% of them, or two thousand seven hundred, are told, **correctly**, that they **have been exposed** (that’s the upper green box).

Now of the 97,000 people who really do NOT have any antibodies (the second line of data), we have a false positive rate of 7%, so you multiply 0.07 times 97,000 to get six thousand, seven hundred ninety of them who would be told, **incorrectly,** that they DO have antibodies to Covid-19; they are in the lower orange box. (Remember, positive is bad here, and negative is good.) However, 90,210 would be told, correctly, that they did **not** have those antibodies. (That’s in the lower green box.)

Now let’s add up the folks who got the positive test results, which is the third data column. We had 2,700 who correctly tested positive and 6,790 who wrongly tested positive. That’s a total of 9,490 people with a positive CV19 antibody test, which means that of that group of people, only 28.5% were correctly so informed!! That’s between a third and a fourth! Unacceptable!

However, if we look at the last column, notice that almost every single person who was told that they were negative really was negative. (Dunno about you, but I think that 99.7% accuracy is pretty darned good!)
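If you would rather check the first round in code than in a spreadsheet, here is a minimal Python sketch. The function name `one_round` and its parameter names are made up for illustration; the inputs are just the assumptions stated above (100,000 people, 3% with antibodies, 7% false positives, 10% false negatives).

```python
def one_round(have_antibodies, no_antibodies, false_pos=0.07, false_neg=0.10):
    """Split one tested group into the four boxes of the table."""
    true_pos = have_antibodies * (1 - false_neg)   # correctly told "positive"
    missed = have_antibodies * false_neg           # wrongly told "negative"
    false_alarm = no_antibodies * false_pos        # wrongly told "positive"
    true_neg = no_antibodies * (1 - false_pos)     # correctly told "negative"
    return true_pos, missed, false_alarm, true_neg


population = 100_000
exposed = population * 3 // 100        # 3% of the population
unexposed = population - exposed
tp, fn, fp, tn = one_round(exposed, unexposed)

# Share of each kind of result that is actually right
positive_accuracy = tp / (tp + fp)
negative_accuracy = tn / (tn + fn)

print(f"tested positive: {tp + fp:,.0f}, of whom {positive_accuracy:.1%} really are positive")
print(f"tested negative: {tn + fn:,.0f}, of whom {negative_accuracy:.1%} really are negative")
```

Run as written, this should reproduce the 9,490 positives, the 28.5% figure, and the 99.7% figure from the paragraphs above.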

However, that 28.5% accuracy among the ‘positives’ (in the left-hand blue box) is really worrisome. What to do?

Simple! Test those folks again! Right away! Let’s do it, and then let’s look at the results:

Wowser! We took the 9490 people who tested positive and gave them another round of tests, using the exact same equipment and protocols and error rates as the first one. The spreadsheet is set up the same; the only thing I changed is the bottom two numbers in the first data column. I’m not going to go through all the steps, but feel free to check my arithmetic. Actually, check my logic. Excel doesn’t really make arithmetic errors, but if I set up the spreadsheet incorrectly, it will spit out incorrect results.

Notice that our error rate (in blue) is much lower in terms of those who tested positive. In fact, of those who test positive, 83.7% really ARE positive this time around, and of those who test negative, 95.9% really ARE negative.

But 84% isn’t accurate enough for me (it’s either a B or a C in most American schools). So what do we do? Test again: all of the nearly three thousand who tested positive the second time. Ignore the rest.

Let’s do it:

At this point, we have much higher confidence, 98.5% (in blue), that the people who tested ‘positive’ really are ‘positive’. Unfortunately, at this point, the negative results are correct only about 64% of the time: 243 people who really have the antibodies tested negative. So perhaps one should test that subgroup again.
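All three rounds can be run in one loop. The sketch below is one way to do it, not the actual spreadsheet: the function name `retest_positives` and the dictionary keys are invented for illustration, and it assumes every round has the same error rates and that errors are independent from round to round.

```python
def retest_positives(population, prevalence, false_pos, false_neg, rounds):
    """Test everyone once, then keep retesting only those who test positive.

    Returns one dict per round with head-counts and accuracy figures.
    """
    with_ab = population * prevalence
    without_ab = population - with_ab
    history = []
    for n in range(1, rounds + 1):
        true_pos = with_ab * (1 - false_neg)
        missed = with_ab * false_neg
        false_alarm = without_ab * false_pos
        true_neg = without_ab * (1 - false_pos)
        history.append({
            "round": n,
            "tested": with_ab + without_ab,
            "missed": missed,
            "pos_acc": true_pos / (true_pos + false_alarm),
            "neg_acc": true_neg / (true_neg + missed),
        })
        # Only this round's positives go on to the next round
        with_ab, without_ab = true_pos, false_alarm
    return history


results = retest_positives(100_000, 0.03, 0.07, 0.10, rounds=3)
for r in results:
    print(f"round {r['round']}: tested {r['tested']:,.0f}, "
          f"positives correct {r['pos_acc']:.1%}, "
          f"negatives correct {r['neg_acc']:.1%}, "
          f"people with antibodies told 'negative': {r['missed']:,.0f}")

total_tests = sum(r["tested"] for r in results)
print(f"total tests given: {total_tests:,.0f}")
```

Because each round tests far fewer people than the one before, the three rounds together come to only about 112,000 tests for 100,000 people.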

The beautiful thing about this method is that it doesn’t even require a terribly exact test! But it does require that you do it repeatedly, and quickly.

Let me assure you that the exact level of accuracy and the exact number of exposed people don’t matter: if you test and re-test, you can find those who have been exposed with almost 100% accuracy. With that information you can then discover what the best approaches are to solving this pandemic, what the morbidity and mortality rates are, and eventually how to stop it completely.
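One way to poke at that claim is to count how many rounds it takes to get past 99% for tests of different quality. In this Python sketch the function name `rounds_needed` is made up, and the two sloppier error-rate pairs are hypothetical numbers chosen for illustration, not figures from any real test kit. It assumes independent errors between rounds, and it only works when a person with antibodies is more likely to test positive than a person without them; otherwise a positive result tells you nothing and retesting cannot help.

```python
def rounds_needed(prevalence, false_pos, false_neg, target=0.99, max_rounds=50):
    """Count the rounds of retesting positives needed before the share of
    genuine positives among those still testing positive reaches `target`."""
    sensitivity = 1 - false_neg
    if sensitivity <= false_pos:
        raise ValueError("a positive result carries no information; retesting cannot help")
    # Work in fractions of the original population rather than head-counts
    with_ab, without_ab = prevalence, 1 - prevalence
    for n in range(1, max_rounds + 1):
        with_ab *= sensitivity      # people with antibodies still testing positive
        without_ab *= false_pos     # people without antibodies still testing positive
        if with_ab / (with_ab + without_ab) >= target:
            return n
    return None  # target not reached within max_rounds


# First pair: the error rates used in this post. The other two are hypothetical.
for fp_rate, fn_rate in [(0.07, 0.10), (0.15, 0.15), (0.20, 0.20)]:
    n = rounds_needed(0.03, fp_rate, fn_rate)
    print(f"false positive {fp_rate:.0%}, false negative {fn_rate:.0%}: {n} rounds")
```

For the 3% example, the third round reaches 98.5%, so the function reports that a fourth round is needed to cross 99%. A worse test still gets there under these assumptions; it just takes more rounds.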

Why we don’t have enough tests to do this quickly and accurately and repeatedly is a question that I will leave to my readers.

Addendum:

Note that I made some starting assumptions. Let us change them and see what happens. Let’s suppose that the correct percentage of people with COVID-19 antibodies is not 3%, but 8%. Or maybe only 1%. Let’s keep the same 7% false positive and 10% false negative rates. How would these results change? With a spreadsheet, that’s easy. First, let me start with 8% of people having antibodies and keep testing repeatedly. Here are the final results:

| Round | Positive accuracy rating | Negative accuracy rating |
|---|---|---|
| 1 | 52.8% | 99.1% |
| 2 | 93.5% | 89.3% |
| 3 | 99.5% | 39.3% |

So after 3 rounds, 99.5% of the people who still test positive really do have the antibodies.

Let’s start over with a population where only 1% has the antibodies, and the false positive rate is 7% and the false negative rate is 10%.

| Round | Positive accuracy rating | Negative accuracy rating |
|---|---|---|
| 1 | 11.5% | 99.9% |
| 2 | 62.6% | 98.6% |
| 3 | 95.6% | 84.7% |
| 4 | 99.6% | 30.0% |

This time, it took four rounds, but we still got to about 99.6% accuracy at picking out those who really had been exposed to this virus. Yes, towards the end the accuracy of our negative results drops, but I submit that doesn’t matter that much.
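The same kind of table falls out of Bayes’ rule applied directly to percentages, with no head-counts at all: the accuracy of one round’s positives simply becomes the starting percentage for the next round. Here is a minimal Python sketch of that idea. The function names are invented for illustration, the error rates default to the 7% and 10% used throughout, and errors are assumed to be independent between rounds.

```python
def update_on_positive(prior, false_pos=0.07, false_neg=0.10):
    """Bayes' rule: chance of really having antibodies, given a positive test."""
    hit = prior * (1 - false_neg)            # has antibodies AND tests positive
    false_alarm = (1 - prior) * false_pos    # no antibodies AND tests positive
    return hit / (hit + false_alarm)


def negative_accuracy(prior, false_pos=0.07, false_neg=0.10):
    """Chance of really lacking antibodies, given a negative test."""
    correct = (1 - prior) * (1 - false_pos)  # no antibodies AND tests negative
    missed = prior * false_neg               # has antibodies AND tests negative
    return correct / (correct + missed)


def accuracy_table(prevalence, rounds):
    """Return a list of (round, positive accuracy, negative accuracy) tuples."""
    rows = []
    prior = prevalence
    for n in range(1, rounds + 1):
        posterior = update_on_positive(prior)
        rows.append((n, posterior, negative_accuracy(prior)))
        prior = posterior   # this round's positives are the next group tested
    return rows


for prevalence, rounds in [(0.08, 3), (0.01, 4)]:
    print(f"Starting with {prevalence:.0%} of people having antibodies")
    for n, pos_acc, neg_acc in accuracy_table(prevalence, rounds):
        print(f"  round {n}: positive {pos_acc:.1%}, negative {neg_acc:.1%}")
```

The printed figures should land within a few tenths of a percentage point of the two tables above.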

So **Parson Tommy Bayes** was right.
