### Liars, Clinical Tests and a Bit of Math

Bryan Caplan of EconLog posted on our ability to distinguish truth tellers from liars. It's not as good as we might think. According to one study cited, "People correctly identify truths 70% of the time and correctly identify lies only 50% of the time."

If lies are the disease, then this corresponds to what statisticians refer to as a *sensitivity* of 50% and a *specificity* of 70%. Please note that I'm restating the more pessimistic analysis of Caplan's, where *truth* is considered the disease.

Both of these performance indicators are pretty bad by medical standards. You'd definitely hope that that expensive spiral CT was better at detecting a serious blood clot in your lung!

The concept of test sensitivity and specificity is confusing to a lot of people, especially doctors. Although it's a simplification, for most purposes these numbers reflect only the *test* under consideration. What clinicians (and, in today's example, CIA interrogators) are mainly interested in is *predictive ability*: the ability of a test to predict the probability that a person has or doesn't have a particular disease (or is telling a fib).

We use knowledge of a test's sensitivity and specificity to help us with these predictions using a horrible piece of mathematics called *Bayes' theorem*.

To use the example here, assume a person lies 50% of the time. This is called the *pre-test* probability (or prevalence) of lying. This is about what we'd expect in a politician. With the above sensitivity and specificity, Bayes' theorem would predict the following:

If a person "seems" to be lying, there's a 63% *post-test* probability that he is.

If a person "seems" to be truthful, there's a 58% post-test probability that he is.

So given a 50% probability of lying in the first place, after actually observing the person, if we suspect a lie, our post-test conviction has risen only from 50% to *63%*. Not very good, in my opinion.

If we suspect truthfulness, our post-test conviction of truthfulness only rises from 50% to *58%*. It would appear that we're worse at detecting truth.
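For the arithmetically inclined, here's a small Python sketch of the Bayes' theorem calculation behind those two numbers, using the study's figures (sensitivity 50%, specificity 70%) and the 50% pre-test probability of lying (the function name and structure are my own, just for illustration):

```python
def post_test_probabilities(sensitivity, specificity, prevalence):
    """Return P(lying | seems dishonest) and P(truthful | seems honest)."""
    # A "positive test" means the person seems to be lying.
    p_pos_given_lie = sensitivity        # true positive rate
    p_pos_given_truth = 1 - specificity  # false positive rate
    p_pos = prevalence * p_pos_given_lie + (1 - prevalence) * p_pos_given_truth
    p_lie_given_pos = prevalence * p_pos_given_lie / p_pos

    # A "negative test" means the person seems to be truthful.
    p_neg = 1 - p_pos
    p_truth_given_neg = (1 - prevalence) * specificity / p_neg
    return p_lie_given_pos, p_truth_given_neg

lie_if_pos, truth_if_neg = post_test_probabilities(0.5, 0.7, 0.5)
print(lie_if_pos)    # 0.625      -> the 63% figure
print(truth_if_neg)  # 0.58333... -> the 58% figure
```

As you can see, even this small example takes a dozen lines of conditional-probability bookkeeping.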

One problem with this kind of analysis is that it is not very intuitive. Juggling sensitivities and specificities requires mental gymnastics that few of us are capable of. Many different combinations of sensitivity and specificity (and of the probability of having the disease in the first place) can yield similar or very different predictive abilities. What is needed is a framework for thinking about tests that allows us to *combine* sensitivity and specificity into one number, which can then be thought of *independently* from the disease prevalence.

Fortunately, such a framework was published (at least in the medical literature) in 1999 here and here. These articles are not for the faint of heart! On the other hand, the concepts the author discusses actually make this whole thing easier. He's incorporated sensitivity and specificity, two somewhat nebulous concepts, into two indicators: a positive and a negative *likelihood ratio* (PLR and NLR), which even an internist can understand.

Likelihood ratios are calculated from the sensitivity and specificity using some simple formulas. You can find the formulas here. You can also find an online calculator that does this for you here. I only want to discuss this conceptually and demonstrate how LR's are used.
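The formulas are indeed simple: PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. Here's a quick Python sketch applying them to the lie-detection numbers above:

```python
def likelihood_ratios(sensitivity, specificity):
    """Standard positive and negative likelihood ratios for a test."""
    plr = sensitivity / (1 - specificity)  # how much a positive result raises the odds
    nlr = (1 - sensitivity) / specificity  # how much a negative result lowers the odds
    return plr, nlr

plr, nlr = likelihood_ratios(0.5, 0.7)
print(round(plr, 1))  # 1.7
print(round(nlr, 1))  # 0.7
```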

When using LR's, you think in terms of *odds* instead of probability. This is the way things are done in Las Vegas. Instead of saying that there's a 50% chance a coin will come up heads, we say the odds are 1 to 1 (1/1, or 1.0). Instead of saying that the probability of a (non-crooked) die coming up 5 is 1/6, or 17%, we say the odds are 1 to 5 (1/5, or 0.20).
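The conversion between probability and odds is a one-liner in each direction. A small Python sketch of the coin and die examples:

```python
def prob_to_odds(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1 - p)

def odds_to_prob(o):
    """Convert odds back to a probability: o / (1 + o)."""
    return o / (1 + o)

print(prob_to_odds(0.5))             # 1.0 -> "1 to 1", the fair coin
print(round(prob_to_odds(1/6), 2))   # 0.2 -> "1 to 5", the die coming up 5
```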

The reason for using odds instead of probability is that the relationship between pre-test odds and post-test odds becomes VERY straightforward mathematically. No horrible Bayes' theorem! The relationships are:

If the test is positive:

(pre-test odds of condition) × (PLR) = (post-test odds of condition)

If the test is negative:

(pre-test odds of condition) × (NLR) = (post-test odds of condition)

This makes things easy to understand. In our example above, the PLR is 1.7. Use the online calculator to calculate this (and use proportions instead of percentages). This means that the odds of someone telling a lie increase by a factor of 1.7 if you think he looks dishonest.

No complicated Bayesian analysis. No fiddling with sensitivity or specificity. In fact, you don't even have to know the pre-test odds of disease. You just know that, on the basis of the above-cited data, the post-test odds of lying are 1.7 times the pre-test odds if you suspect lying. This gives you a better mental idea of how good the test is.

Likewise, the NLR is 0.7. So if you suspect that the person is telling the truth (a negative test), then the odds that he's lying go *down* to 0.7 times the pre-test odds.
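To tie it all together, here's a short Python sketch of the whole LR update — convert probability to odds, multiply by the likelihood ratio, convert back — which reproduces the Bayes' theorem numbers from earlier without ever touching sensitivity or specificity directly (the helper name is mine, just for illustration):

```python
def update_odds(pre_test_prob, lr):
    """Multiply pre-test odds by a likelihood ratio; return post-test probability."""
    pre_odds = pre_test_prob / (1 - pre_test_prob)  # probability -> odds
    post_odds = pre_odds * lr                       # the one-step LR update
    return post_odds / (1 + post_odds)              # odds -> probability

# 50% pre-test probability of lying; PLR = 0.5/0.3, NLR = 0.5/0.7
print(round(update_odds(0.5, 0.5 / 0.3), 3))  # 0.625 -> the 63% figure
print(round(update_odds(0.5, 0.5 / 0.7), 3))  # 0.417 -> i.e., 58% chance of truth
```

One multiplication replaces the entire Bayes' theorem calculation, which is the whole appeal of the framework.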

Have I hopelessly confused you all? I hope not. The reason I mention all this is because once you get the hang of it, thinking in terms of LR's is much easier than thinking in terms of sensitivity and specificity. The medical literature increasingly calculates and cites the LR's for tests. Hopefully now you'll understand why they are useful.

If you're interested in this topic, I'd strongly suggest reading the links I cited above.

Labels: Education, Statistics

## 3 Comments:

What a GREAT blog!

I enjoyed this post.

Keep it up!

Incidentally, it's *Bayes'*, not Baye's. The eponym is after the Reverend Thomas Bayes, who lived in the 1700s. He died before ever publishing the work for which his theorem was named. His friend, Richard Price, brought out the paper posthumously.

You might be interested in reading the original work, "An Essay towards solving a Problem in the Doctrine of Chances", here (pdf). It begins with an intro from Price.

Thanks for the correction, MD. I did not know that! I'm going to correct that in my post.

John
