Advanced · Behavioral Economics / Philosophy of Science

The Prosecutor's Fallacy

Confusing P(evidence | innocent) with P(innocent | evidence).

What it is

The prosecutor's fallacy is the error of treating two different conditional probabilities as if they were the same. Specifically: confusing the probability of observing certain evidence given innocence — P(evidence | innocent) — with the probability of innocence given that evidence — P(innocent | evidence). These are not the same number, and in many real cases they differ by orders of magnitude.

The name comes from its most visible domain — criminal trials — but the fallacy appears wherever people reason from diagnostic signals to underlying causes. It is one of the most consequential errors in applied probabilistic reasoning.

The classic example

In a 1999 murder trial in England, a paediatrician testifying as an expert witness told the jury that the probability of two children in the same family dying of sudden infant death syndrome (SIDS) was 1 in 73 million. The figure was widely taken to mean that the odds of the mother's innocence were 1 in 73 million. She was convicted. The Royal Statistical Society later criticized the statistic publicly, and her conviction was overturned on appeal in 2003.

The 1-in-73-million figure was P(two SIDS deaths | innocent family). But the relevant question is P(innocent | two child deaths). To answer that, you need to know how often two infant deaths in one family result from homicide, which is also extremely rare. When both hypotheses are rare, the posterior probability depends on the ratio between them, not on either number in isolation.
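The ratio argument can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the two explanations are treated as the only possibilities, each is assumed to produce the observed deaths with certainty, and the homicide rate is a made-up placeholder, not an estimate from the case. The function name `posterior_of_first` is hypothetical.

```python
def posterior_of_first(rate_first: float, rate_second: float) -> float:
    """P(first explanation | outcome) when two rival explanations are the
    only candidates and each would fully account for the outcome."""
    return rate_first / (rate_first + rate_second)


# Rate of two natural deaths per family: the figure quoted at trial.
rate_two_natural_deaths = 1 / 73_000_000
# Rate of two homicides per family: HYPOTHETICAL placeholder for illustration.
rate_two_homicides = 1 / 500_000_000

p_innocent = posterior_of_first(rate_two_natural_deaths, rate_two_homicides)
print(f"P(innocent | two deaths) = {p_innocent:.2f}")
```

With these placeholder inputs the result is about 0.87, nowhere near 1 in 73 million. The point is not the specific number but that a tiny P(evidence | innocent) can coexist with a large P(innocent | evidence) when the rival explanation is rarer still.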

The same fallacy appears regularly with DNA evidence. If a DNA profile occurs in 1 in 10,000 people and a suspect matches, the fallacy is to conclude that there is a 1-in-10,000 chance the suspect is innocent. But if the match came from searching a database of 10,000 people, you would expect about one match by coincidence even if everyone in it were innocent.
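The database arithmetic can be checked directly. The sketch below assumes that every person in the database is innocent and that matches are independent across people. Both are simplifications, and the function names are illustrative.

```python
def expected_false_matches(match_prob: float, database_size: int) -> float:
    """Average number of coincidental matches among innocent people."""
    return match_prob * database_size


def prob_at_least_one_false_match(match_prob: float, database_size: int) -> float:
    """Chance that at least one innocent person matches by coincidence,
    assuming independent comparisons."""
    return 1.0 - (1.0 - match_prob) ** database_size


profile_frequency = 1 / 10_000  # from the example above
database_size = 10_000          # from the example above

expected = expected_false_matches(profile_frequency, database_size)
at_least_one = prob_at_least_one_false_match(profile_frequency, database_size)
print(f"expected coincidental matches: {expected:.2f}")
print(f"P(at least one coincidental match): {at_least_one:.2f}")
```

Under these assumptions the chance of at least one coincidental match is roughly 63%, so finding a match in a search of this size is weak evidence on its own.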

Why it matters in markets and analysis

Financial analysis involves constant conditional reasoning. Analysts regularly observe signals and make inferences about their causes.

Consider an analyst who notes that a particular earnings pattern preceded every major corporate fraud in their dataset. P(this pattern | fraud) might be high. But the relevant question is P(fraud | this pattern) — which depends on how common the pattern is in legitimate companies. If thousands of healthy companies show the same pattern, the signal is nearly useless despite appearing predictive.

The same structure applies to technical analysis, credit analysis, and macro forecasting. A signal that fires before every recession is not useful if it also fires during non-recessions. The base rate of recessions relative to false positives determines the signal's actual value.
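A small sketch with invented counts shows how a signal can catch every fraud and still be nearly useless. All four counts are hypothetical and chosen only to make the arithmetic visible. They do not come from any real dataset.

```python
def p_signal_given_event(events_with_signal: int, events_total: int) -> float:
    """P(signal | event): the share of events that the signal preceded."""
    return events_with_signal / events_total


def p_event_given_signal(events_with_signal: int, non_events_with_signal: int) -> float:
    """P(event | signal): the share of signal firings that were real events."""
    return events_with_signal / (events_with_signal + non_events_with_signal)


# HYPOTHETICAL counts for illustration only.
frauds_total = 20
frauds_with_pattern = 20        # the pattern preceded every fraud
healthy_with_pattern = 2_000    # legitimate firms showing the same pattern

hit_rate = p_signal_given_event(frauds_with_pattern, frauds_total)
precision = p_event_given_signal(frauds_with_pattern, healthy_with_pattern)
print(f"P(pattern | fraud) = {hit_rate:.2f}")
print(f"P(fraud | pattern) = {precision:.4f}")
```

With these counts P(pattern | fraud) is 1.00 while P(fraud | pattern) is just under 1%. The second number needs the count of healthy companies showing the pattern, which a dataset built only from frauds never contains.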

Where it shows up

In law: Eyewitness identifications, DNA matches, and bite mark evidence all come with some P(evidence | innocent), and that number is often reported as if it were P(innocent | evidence). The two diverge sharply when the prior probability of guilt in the relevant population is far from 50%, and most of all when it is very low.

In medicine: A test that is 99% sensitive and 99% specific for a disease that affects 1 in 10,000 people will produce mostly false positives. Of 100 positive tests, roughly 99 will be in people who do not have the disease. P(positive test | disease) is 0.99; P(disease | positive test) is approximately 0.01.
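The medical figures can be reproduced with a direct application of Bayes' theorem. This is a minimal sketch using the sensitivity, specificity, and prevalence stated above. The function name is illustrative.

```python
def positive_predictive_value(sensitivity: float,
                              specificity: float,
                              prevalence: float) -> float:
    """P(disease | positive test) from the test's error rates and the base rate."""
    true_positive_mass = sensitivity * prevalence
    false_positive_mass = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


ppv = positive_predictive_value(sensitivity=0.99,
                                specificity=0.99,
                                prevalence=1 / 10_000)
print(f"P(disease | positive test) = {ppv:.4f}")
```

The result is just under 0.01, matching the roughly 1-in-100 figure. Raising the prevalence argument shows how quickly the same test becomes informative in a higher-risk population.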

In investing: Consider investment managers who show strong recent track records. P(this track record | pure luck) may be low, but if there are thousands of managers, some will produce impressive records by chance. The correct question is P(skill | track record), which requires estimating the prior probability of genuine skill in the population.
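The track-record point can be made concrete with a coin-flip model. The sketch assumes that no manager has any skill, that each has a 50% chance of beating the benchmark in any year, and that years and managers are independent. The manager count and streak length are hypothetical.

```python
def expected_lucky_streaks(n_managers: int, n_years: int, p_beat: float = 0.5) -> float:
    """Expected number of skill-free managers who beat the benchmark every year."""
    return n_managers * p_beat ** n_years


def prob_any_lucky_streak(n_managers: int, n_years: int, p_beat: float = 0.5) -> float:
    """Chance that at least one skill-free manager has a perfect streak."""
    p_streak = p_beat ** n_years
    return 1.0 - (1.0 - p_streak) ** n_managers


# HYPOTHETICAL population: 5,000 managers observed over 10 years.
expected = expected_lucky_streaks(5_000, 10)
any_streak = prob_any_lucky_streak(5_000, 10)
print(f"expected perfect 10-year records from luck alone: {expected:.1f}")
print(f"P(at least one such record): {any_streak:.3f}")
```

A single manager has about a 1-in-1,000 chance of a perfect decade by luck, yet under these assumptions about five such records are expected in the population. Observing one therefore says little about skill without a prior for how common skill is.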

The right way to think about it

Bayes' theorem provides the correction:

P(innocent | evidence) = P(evidence | innocent) × P(innocent) / P(evidence)

The term P(innocent) is the prior probability — the base rate before seeing the evidence. The term P(evidence) normalizes across all possible explanations for the evidence.

The intuition: to judge a hypothesis given evidence, you must compare how likely the evidence is under your hypothesis versus how likely it is under all alternative hypotheses, weighted by the prior probability of each hypothesis.

When the prior probability of guilt is low (because the suspect pool is large or the crime is rare), even strong evidence produces a surprisingly modest posterior probability. This is not a loophole — it is arithmetic.
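The effect of the prior can be seen by holding the evidence fixed and varying the size of the suspect pool. The sketch below assumes the guilty person always matches, reuses the 1-in-10,000 coincidental match rate from the DNA example, and sets the prior to one over the pool size. That flat prior is a simplifying assumption, not a rule of evidence.

```python
def posterior_guilt(prior_guilt: float,
                    p_evidence_given_guilty: float,
                    p_evidence_given_innocent: float) -> float:
    """Bayes' theorem for two hypotheses: guilty versus innocent."""
    guilty_mass = p_evidence_given_guilty * prior_guilt
    innocent_mass = p_evidence_given_innocent * (1.0 - prior_guilt)
    return guilty_mass / (guilty_mass + innocent_mass)


match_rate_if_innocent = 1 / 10_000

for pool_size in (10, 1_000, 10_001, 100_000, 10_000_000):
    prior = 1 / pool_size  # assumption: every pool member equally likely a priori
    posterior = posterior_guilt(prior, 1.0, match_rate_if_innocent)
    print(f"pool {pool_size:>10,}: P(guilty | match) = {posterior:.4f}")
```

With a pool of 10 the match is close to decisive. With a pool of 10,001 the posterior is exactly one half, and with millions of possible suspects the same match leaves guilt unlikely.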

One thing most people get wrong

The fallacy becomes most dangerous precisely when the base rate is very low. A test with a 1-in-a-million false positive rate sounds conclusive. But if the underlying condition affects 1 in a billion people, then even if the test never misses a true case, a positive result means the probability of actually having the condition is only about 0.1%. The accuracy of the test is swamped by the rarity of what it is testing for.

This is why rare-disease testing, rare-crime forensics, and rare-event financial prediction all require explicit base rate accounting. The more unlikely the hypothesis is beforehand, the stronger the evidence needs to be: not merely unlikely given innocence, but so much more likely given guilt than given any alternative explanation that it overcomes the low prior. Failing to account for this has sent innocent people to prison and has destroyed capital by triggering on false signals in markets.
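The one-in-a-million example above can be verified in odds form, where posterior odds equal prior odds times the likelihood ratio. The sketch assumes the test detects every true case, so the likelihood ratio is 1 divided by the false positive rate. The function name is illustrative.

```python
def posterior_from_likelihood_ratio(prior_prob: float, likelihood_ratio: float) -> float:
    """Update a prior probability using odds: posterior odds = prior odds * LR."""
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


base_rate = 1e-9            # condition affects 1 in a billion
false_positive_rate = 1e-6  # test errs on 1 in a million unaffected people
# Assumption: perfect sensitivity, so LR = 1 / false positive rate.
likelihood_ratio = 1.0 / false_positive_rate

posterior = posterior_from_likelihood_ratio(base_rate, likelihood_ratio)
print(f"P(condition | positive) = {posterior:.4%}")
```

The evidence multiplies the odds by a million, but the starting odds were one in a billion, so the result lands near one in a thousand. In odds form, the required strength of evidence is easy to read off: the likelihood ratio has to be comparable to the inverse of the prior odds before the posterior approaches even.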