Survivorship Bias – Why what’s missing is important

Survivorship bias is the logical error of concentrating on people or things that make it past some selection process and overlooking those that do not. Probably the most famous example of this is the “missing bullet holes”…

The damaged portions of returning planes show locations where they can sustain damage and still return home

During World War II, bombers would return from battle with numerous bullet holes. The Allies noticed a common pattern in these holes and set out to strengthen the areas most commonly hit by enemy fire. The hope was that this would reduce the number of aircraft shot down.

The mathematician Abraham Wald reasoned that there was another way to look at the data. By accounting for survivorship bias in his calculations, he concluded that the way to reduce losses was to add armour to the areas that showed the least damage.

As the military only considered aircraft that had survived their missions, any bombers that had been shot down were ignored. This meant that the bullet holes in the returning aircraft represented areas where a bomber could take damage and still fly well enough to return safely.

By looking at what data was missing, it was possible to draw more accurate conclusions and increase the survivability of aircraft. Quite simply, the reason why certain data may be missing can be more meaningful than the data we have.
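
To make the reasoning concrete, here is a minimal simulation sketch. The aircraft sections, the loss probabilities and the function name simulate are all invented for illustration and are not historical figures. Enemy fire is assumed to land on every section equally often, yet the holes counted on returning planes cluster in the sections where a hit is rarely fatal.

```python
import random

# Hypothetical aircraft sections and the assumed chance that a single hit
# to each section brings the plane down. These numbers are invented for
# illustration; they are not historical data.
LOSS_PROBABILITY = {
    "engine": 0.60,
    "cockpit": 0.40,
    "fuselage": 0.05,
    "wings": 0.05,
}


def simulate(num_planes=10_000, hits_per_plane=3, seed=42):
    """Return (hits on all planes, hits seen on returning planes) per section."""
    rng = random.Random(seed)
    sections = list(LOSS_PROBABILITY)
    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)

    for _ in range(num_planes):
        # Enemy fire lands on every section with equal probability.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        for section in hits:
            all_hits[section] += 1
        # The plane returns only if it survives every individual hit.
        survived = all(rng.random() >= LOSS_PROBABILITY[s] for s in hits)
        if survived:
            for section in hits:
                survivor_hits[section] += 1

    return all_hits, survivor_hits


if __name__ == "__main__":
    all_hits, survivor_hits = simulate()
    for section in LOSS_PROBABILITY:
        print(f"{section:9s} all planes: {all_hits[section]:6d}   "
              f"returning planes: {survivor_hits[section]:6d}")
```

Under these assumptions the "all planes" column is roughly even across sections, while the "returning planes" column shows far fewer engine and cockpit hits. An analyst who only sees the second column would be tempted to armour the fuselage and wings, which is the opposite of what the full data suggests.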

In World War I

To continue the military theme, in WWI the same bias almost resulted in effective military helmets being re-designed. Following the introduction of the ‘Brodie’ helmet, there was a dramatic rise in field hospital admissions for severe head injuries. The rise was so sharp that Army command considered redesigning the helmet. A statistician investigating the issue remarked that soldiers who might previously have been killed by shrapnel hits to the head (and therefore never made it to the hospital) were now surviving the same hits, and thus made it to a field hospital. The design stayed.

Survivorship bias is so common that we encounter it almost every day.

“They don’t make ’em like they used to”

This is the commonly held belief that items made in the past were better built and lasted longer than those made today. What this fails to account for is that, after years of time and use, only those items that were built to last will have survived into the present day. All of the machinery, equipment and goods that failed over the intervening years are no longer around, as they have been disposed of.

“Modern architecture is ugly/awful/cheap”

New buildings are built every day, while older structures are demolished. Only those buildings deemed “worthy” of preservation survive for extended periods of time. The consequence is that the ugliest and weakest buildings of history have long been demolished, leaving the flawed impression that all buildings in the past were more beautiful or better constructed.

“Modern music is rubbish”

Music from one’s youth, or from the past in general, is often thought of as better than music now. This is because only the best music from any previous period is still played today, while all of today’s music, good or bad, is readily available.

“Work Hard To Succeed”

Actors, athletes, musicians, CEOs who failed their exams… The media often tells the story of the determined person who pursues their dreams and beats the odds. But what about all those people who are just as skilled and just as determined but never ‘make it’?

Because the vast majority of failures are not publicly visible, we are left with the false perception that anyone can achieve great things if they simply ‘work hard’.

The “meaning” is more important than the data

What the above illustrates is that the story behind the data can be more important than the data itself. The reason why certain data is missing is often more meaningful than the data that is present.

Survivorship bias is of particular importance when analysing online advertising and website performance. Ad agencies will often boast of their successes by showing high conversion rates or high click-through rates, but in isolation these metrics are meaningless. Unless the whole picture is analysed, including the “missing data” (the failed conversions, the missed clicks and so on), how do you know if you really got what you paid for? Would you have achieved the same results by simply putting more money into your existing campaigns, for example?
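
As a rough illustration of why a headline metric can mislead, the sketch below compares two campaigns. The Campaign class, the campaign names and every number in it are hypothetical and chosen only to make the point. It sets the boasted conversion rate alongside figures that also count the people who saw an ad and did nothing.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    """Hypothetical campaign totals; every figure used below is invented."""
    name: str
    spend: float        # total cost of the campaign
    impressions: int    # times the ad was shown
    clicks: int         # impressions that led to a click
    conversions: int    # clicks that led to a sale or sign-up

    @property
    def click_through_rate(self) -> float:
        return self.clicks / self.impressions

    @property
    def conversion_rate(self) -> float:
        # The headline figure: conversions among people who clicked.
        return self.conversions / self.clicks

    @property
    def conversions_per_impression(self) -> float:
        # Includes the "missing" data: everyone who saw the ad and did nothing.
        return self.conversions / self.impressions

    @property
    def cost_per_conversion(self) -> float:
        return self.spend / self.conversions


agency = Campaign("agency campaign", spend=5000.0,
                  impressions=200_000, clicks=1_000, conversions=100)
existing = Campaign("existing campaign", spend=5000.0,
                    impressions=100_000, clicks=4_000, conversions=160)

if __name__ == "__main__":
    for c in (agency, existing):
        print(f"{c.name}: conversion rate {c.conversion_rate:.1%}, "
              f"click-through rate {c.click_through_rate:.2%}, "
              f"conversions per impression {c.conversions_per_impression:.3%}, "
              f"cost per conversion {c.cost_per_conversion:.2f}")
```

In this made-up case the agency campaign can truthfully report the higher conversion rate, yet for the same spend it delivers fewer conversions at a higher cost each. The comparison only becomes visible once the impressions and clicks that went nowhere are counted.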
