# Predictability and the Base Rate

Whenever a statistician wants to predict the likelihood of some event based on the available evidence, there are two main sources of information that have to be taken into account: (1) the evidence itself, for which a reliability figure has to be calculated; and (2) the likelihood of the event calculated purely in terms of relative incidence. The second figure here is the base rate. Since it is just a number, obtained by the seemingly dull process of counting, it frequently gets overlooked when there is new information, particularly if that new information is obtained by “experts” using expensive equipment. In cases where the event is dramatic and scary, like a terrorist attack on an airplane, failure to take account of the base rate can result in wasting massive amounts of effort and money trying to prevent something that is very unlikely.

For example, suppose that you undergo a medical test for a relatively rare cancer. The cancer has an incidence of 1 percent among the general population. (That is the base rate.) Extensive trials have shown that the reliability of the test is 79 percent. More precisely, although the test does not fail to detect the cancer when it is present, it gives a positive result in 21 percent of the cases where no cancer is present—what is known as a false positive. When you are tested, the test produces a positive diagnosis. The question is: What is the probability that you have the cancer?

If you’re like most people, you’ll assume that if the test has a reliability rate of nearly 80 percent, and you test positive, then the likelihood that you do indeed have the cancer is about 80 percent (i.e., the probability is approximately 0.8). Are you right?

The answer is no. You have focused on the test and its reliability and overlooked the base rate. Given the scenario just described, the likelihood that you have the cancer is a mere 4.6 percent (i.e., 0.046). That’s right—there is a less than 5 percent chance that you have the cancer. Still a worrying possibility, of course. But hardly the scary 80 percent you thought at first.
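The 4.6 percent figure follows from Bayes' rule. In natural frequencies: of 10,000 people, 100 have the cancer and all of them test positive, while 21 percent of the remaining 9,900 (2,079 people) also test positive. Only 100 of the 2,179 positive results are real. The short Python sketch below reproduces that arithmetic. The function and parameter names are illustrative and do not come from the original text; the sensitivity of 1.0 encodes the stated assumption that the test never misses a cancer that is present.

```python
def posterior_given_positive(base_rate, sensitivity, false_positive_rate):
    """Return P(condition | positive test) using Bayes' rule.

    base_rate           -- P(condition) in the general population
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition)
    """
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Figures from the example: 1% incidence, no missed cancers,
# and a 21% false-positive rate.
p = posterior_given_positive(
    base_rate=0.01, sensitivity=1.0, false_positive_rate=0.21
)
print(f"{p:.3f}")  # 0.046
```

Raising `base_rate` to 0.5 in the same call gives a posterior of roughly 0.83, which shows that the intuitive "about 80 percent" answer is approximately right only when the condition is as common as not.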

## Notes:

Keith Devlin explains why assessing the accuracy of tests and measurements must take into account the base rate of the phenomenon.

Folksonomies: predictability

Book: Brockman, John (2012-02-14), This Will Make You Smarter, HarperCollins. Retrieved on 2013-12-19.