Survivorship Bias

Studying only what survived, then drawing conclusions as if they applied to everything.

What it is

Survivorship bias is the logical error of drawing conclusions from a selected sample composed only of entities that survived some elimination process, while ignoring entities that did not. The survivors are not a representative sample of the original population — they have been filtered by the very outcome you are trying to study.

The bias is insidious because the survivors are the only ones visible. The failures have quietly exited the dataset. Any pattern observed among survivors may be an artifact of the filter that selected them rather than a cause of their success.

The classic example

During World War II, the statistician Abraham Wald was asked to advise on where to add armor to Allied aircraft. The military had collected data on bullet holes in planes that returned from missions and proposed adding armor to the areas most frequently hit.

Wald pointed out the error. The planes under observation were the ones that survived — which meant the bullet holes they carried were in locations that were not fatal. The planes that did not return were the missing data. Wald argued the armor should go on the areas that were not hit on returning planes — because if those areas had been hit, the planes would not have returned to be counted.
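
Wald's argument can be illustrated with a small simulation. The sketch below is not a reconstruction of his actual analysis: the aircraft sections, the per-hit loss probabilities, and the function names are all assumptions chosen for illustration. Hits land uniformly across sections, yet the damage recorded on returning planes is concentrated in the sections where a hit is rarely fatal.

```python
import random

# Hypothetical sections and the chance that a single hit there downs the plane.
# These probabilities are illustrative assumptions, not historical estimates.
FATAL_PROB = {"engine": 0.6, "cockpit": 0.5, "fuselage": 0.05, "wings": 0.05, "tail": 0.1}

def simulate_missions(n_planes=20000, hits_per_plane=3, seed=0):
    rng = random.Random(seed)
    sections = list(FATAL_PROB)
    all_hits = {s: 0 for s in sections}
    returned_hits = {s: 0 for s in sections}
    for _ in range(n_planes):
        # Hits land uniformly at random across sections.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        # The plane returns only if none of its hits turns out to be fatal.
        survived = all(rng.random() > FATAL_PROB[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                returned_hits[s] += 1  # only these hits are ever observed
    return all_hits, returned_hits

def shares(counts):
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

all_hits, returned_hits = simulate_missions()
all_share = shares(all_hits)
returned_share = shares(returned_hits)
for s in FATAL_PROB:
    print(f"{s:9s} all planes: {all_share[s]:.2f}   returned planes: {returned_share[s]:.2f}")
```

Under these assumptions, engine and cockpit hits are under-represented on returning planes even though they are as common as any other hit across the whole fleet. Those under-represented sections are the ones Wald's reasoning says to armor.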

This is survivorship bias in its clearest form: the sample excludes the most informative observations, precisely because those observations represent failure.

In finance

Mutual fund performance data is one of the most studied examples of survivorship bias in practice. When you look at the average historical return of funds that exist today, you are looking at a sample that has been filtered: funds that performed poorly were closed or merged, removing them from the dataset. The "average fund" in the database performed better than the average fund that actually existed during the period, because the bad ones are gone.

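Studies of US equity mutual funds typically find that correcting for survivorship bias lowers measured average returns by something on the order of one percentage point per year, with estimates varying by sample, period, and method. Set against the narrow margins by which funds beat or trail their benchmarks, that is large enough to substantially change conclusions about whether active management adds value.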
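
The mechanism can be sketched in a few lines. In the simulation below, every fund has zero true skill: annual excess returns are independent draws centered on zero, and a fund is closed once its cumulative excess return falls below a threshold. The return distribution, the closure rule, and all parameter values are assumptions made for illustration, not a calibrated model of the fund industry.

```python
import random
import statistics

def simulate_funds(n_funds=5000, n_years=10, mean=0.0, sd=0.06,
                   close_below=-0.15, seed=1):
    """Each fund draws i.i.d. annual excess returns. A fund is closed once its
    cumulative excess return drops below `close_below`. All parameters are
    illustrative assumptions."""
    rng = random.Random(seed)
    funds = []
    for _ in range(n_funds):
        returns, cumulative, alive = [], 0.0, True
        for _ in range(n_years):
            r = rng.gauss(mean, sd)
            returns.append(r)
            cumulative += r
            if cumulative < close_below:
                alive = False
                break  # a closed fund stops reporting returns
        funds.append({"returns": returns, "alive": alive})
    return funds

def mean_annual_return(group):
    # Pool every reported fund-year in the group.
    return statistics.mean(r for f in group for r in f["returns"])

funds = simulate_funds()
survivors = [f for f in funds if f["alive"]]

all_mean = mean_annual_return(funds)           # what a bias-free database would show
survivor_mean = mean_annual_return(survivors)  # what today's fund list shows
bias = survivor_mean - all_mean

print(f"funds surviving: {len(survivors) / len(funds):.0%}")
print(f"mean excess return, all funds: {all_mean:+.4f}")
print(f"mean excess return, survivors: {survivor_mean:+.4f}")
```

The survivors show a positive average excess return although no fund has any skill by construction. The gap comes entirely from dropping the closed funds.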

In venture capital, the visible portfolios are the winners: the ones written up in case studies, profiled in business media, and used to recruit founders. The 60-70% of investments that return less than the capital invested go largely unreported. Reading about successful VC portfolios teaches you about survival, not about the quality of initial decision-making.

In strategy and research

Strategy consulting faces the same problem. The "best practices" of successful companies are studied extensively. The companies that adopted the same practices and failed are rarely written up. The resulting advice may be correlated with success only in the sense that it did not kill the companies that tried it.

Jim Collins's "Good to Great" identified practices common to companies that outperformed their peers over a long period. But critics have pointed out that many of the identified practices were also common to companies that underperformed. The book selected its sample on the dependent variable (greatness) rather than sampling companies by their practices and then observing the outcomes. Without a control group of companies that shared the same practices but failed, no causal inference is possible.

The same problem affects almost every qualitative business case study. When Harvard Business School profiles a successful turnaround, the practices described may be shared by dozens of failed turnarounds that were never written up.
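
A short simulation shows why selecting on the dependent variable is uninformative. In the sketch below, a hypothetical practice is adopted by 70% of companies and has no effect on success by construction. The adoption rate, success rate, and variable names are all illustrative assumptions.

```python
import random

def simulate_companies(n=20000, p_practice=0.7, p_success=0.1, seed=2):
    rng = random.Random(seed)
    # Success is drawn independently of the practice, so the practice
    # has no causal effect by construction.
    return [(rng.random() < p_practice, rng.random() < p_success) for _ in range(n)]

companies = simulate_companies()

# What a study of winners sees: how common is the practice among successes?
winners = [practice for practice, success in companies if success]
share_winners_with_practice = sum(winners) / len(winners)

# What the study never looks at: how common is it among failures?
losers = [practice for practice, success in companies if not success]
share_losers_with_practice = sum(losers) / len(losers)

# The comparison that matters: success rate with and without the practice.
adopters = [success for practice, success in companies if practice]
non_adopters = [success for practice, success in companies if not practice]
success_rate_adopters = sum(adopters) / len(adopters)
success_rate_non_adopters = sum(non_adopters) / len(non_adopters)

print(f"winners using the practice: {share_winners_with_practice:.0%}")
print(f"losers using the practice:  {share_losers_with_practice:.0%}")
print(f"success rate, adopters:     {success_rate_adopters:.1%}")
print(f"success rate, non-adopters: {success_rate_non_adopters:.1%}")
```

Looking only at winners, the practice appears to be a shared trait of great companies. The failures share it at about the same rate, and adopters succeed no more often than non-adopters.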

One thing most people get wrong

Survivorship bias is not confined to average returns. When you study a survivor-selected sample, every metric you calculate on that sample is contaminated: returns, volatility, Sharpe ratios, investment styles, and factor exposures are all computed on a non-representative population. The bias affects the entire distribution, not just the mean. A financial strategy that looks low-volatility among survivors may have included extremely volatile and ultimately fatal positions in the full population. The covariance structure itself is distorted.
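
To make the point about the whole distribution concrete, the sketch below gives each simulated fund its own volatility and closes any fund whose cumulative return falls below a threshold. The volatility range, the closure rule, and the zero-mean returns are assumptions for illustration only. Because volatile funds are more likely to hit the threshold, the survivors look both more profitable and calmer than the population they came from.

```python
import random
import statistics

def simulate(n_funds=4000, n_years=10, close_below=-0.30, seed=3):
    rng = random.Random(seed)
    funds = []
    for _ in range(n_funds):
        sd = rng.uniform(0.05, 0.30)  # each fund's own volatility (assumed range)
        returns, cumulative, alive = [], 0.0, True
        for _ in range(n_years):
            r = rng.gauss(0.0, sd)    # zero true mean for every fund
            returns.append(r)
            cumulative += r
            if cumulative < close_below:
                alive = False
                break                 # the fund closes and leaves the data
        funds.append({"returns": returns, "alive": alive})
    return funds

def summarize(group):
    pooled = [r for f in group for r in f["returns"]]
    return {
        "mean": statistics.mean(pooled),
        "volatility": statistics.pstdev(pooled),
        "worst_year": min(pooled),
    }

funds = simulate()
all_stats = summarize(funds)
survivor_stats = summarize([f for f in funds if f["alive"]])

for key in all_stats:
    print(f"{key:10s} all funds: {all_stats[key]:+.3f}   survivors: {survivor_stats[key]:+.3f}")
```

Under these assumptions the survivor sample overstates the mean and understates the volatility, so a ratio such as mean over volatility is flattered on both sides.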

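The corrective is demanding: you must find and include the failures. In practice, this means using survivorship-bias-free data (for example, the CRSP Survivor-Bias-Free US Mutual Fund Database, or CRSP stock data with delisting returns included), using point-in-time databases such as Compustat Point-in-Time so that each sample reflects what was observable on that date, tracking funds from inception rather than backward from today, and being suspicious of any dataset assembled from currently observable entities.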
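
The difference between tracking from inception and looking backward from today can be shown on a toy dataset. The fund names, years, and returns below are invented for illustration, and the record layout is an assumption rather than the schema of any real database.

```python
from statistics import mean

# Hypothetical fund records: annual returns keyed by year. `closed` is the year
# the fund shut down (None if still operating). All values are invented.
FUNDS = [
    {"name": "Fund A", "closed": None, "returns": {2019: 0.08, 2020: 0.05, 2021: 0.07}},
    {"name": "Fund B", "closed": None, "returns": {2019: 0.02, 2020: 0.06, 2021: 0.04}},
    {"name": "Fund C", "closed": 2020, "returns": {2019: -0.10, 2020: -0.20}},
    {"name": "Fund D", "closed": 2021, "returns": {2019: 0.01, 2020: -0.05, 2021: -0.15}},
]

def average_return_by_year(funds):
    years = sorted({y for f in funds for y in f["returns"]})
    # A fund counts in a given year only if it reported a return that year.
    return {y: mean(f["returns"][y] for f in funds if y in f["returns"]) for y in years}

# Backward from today: only funds that still exist are visible.
survivors_only = average_return_by_year([f for f in FUNDS if f["closed"] is None])

# From inception: every fund that existed in a year counts in that year,
# including the final-year return of funds that later shut down.
full_history = average_return_by_year(FUNDS)

for year in full_history:
    print(f"{year}: survivors only {survivors_only[year]:+.3f}   full history {full_history[year]:+.3f}")
```

In this toy data the survivors-only average is higher than the full-history average in every year, because the closed funds carried the worst returns. The key design choice is that a closed fund stays in the averages for every year it existed, including its last.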