In statistical computations, intuition can be very misleading
A Bad Penny Always Shows Up
To simulate what would happen with a bad (deformed, bent) penny that shows tails more often than heads, you could add more sides to the coin in line 5 of Listing 1:
my @sides = qw( H H H T T T T );
On average, three out of every seven tosses would then come up heads and four would come up tails; the script would then compute the corresponding p-value (subject to random variation):
$ Rounds: 1000 Tails: 565 p-value: 0.0351
The p-value is approximately 0.04, or about 4 percent (i.e., below the specified 5 percent threshold for the significance level). This seriously threatens the null hypothesis that the coin should land on both sides with the same likelihood.
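To see the bent coin in isolation, the following minimal sketch tosses the seven-sided list repeatedly and reports the share of tails. It is not Listing 1 and does not compute a p-value; the helper name count_tails and the sample size of 10,000 are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Bent coin: three "heads" sides and four "tails" sides, as in the text.
my @sides = qw( H H H T T T T );

# Toss the coin $rounds times and return the number of tails.
# (Hypothetical helper, not part of Listing 1.)
sub count_tails {
    my ($rounds) = @_;
    my $tails = 0;
    for ( 1 .. $rounds ) {
        # Pick one of the seven sides uniformly at random.
        my $side = $sides[ int rand @sides ];
        $tails++ if $side eq 'T';
    }
    return $tails;
}

my $rounds = 10_000;    # assumed sample size
my $tails  = count_tails($rounds);
printf "Rounds: %d Tails: %d Share: %.3f (expected %.3f)\n",
    $rounds, $tails, $tails / $rounds, 4 / 7;
```

Over many rounds, the share of tails should settle near 4/7 (about 0.571), which is the deviation from 0.5 that the significance test picks up.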
Careful with Your Diagnosis
Experiments that test new medications or treatment procedures for their efficacy define the null hypothesis as "The medication has no effect," set the significance level to around 5 percent, and then raise the alarm if the p-value drops below this threshold in the experiment, that is, if there are suddenly good reasons for assuming that the null hypothesis is incorrect. In this case, the miracle cure tested really has shown some positive treatment effects with a high degree of probability.
According to Alex Reinhart's recently published book on statistical blunders [3], however, it is common practice for studies to misinterpret the significance level after the fact, thus giving patients false hope or needlessly causing them to panic. These base rate fallacies [5] occur in the context of conditional probabilities and arise because a given result already has a certain a priori probability that needs to be considered in the computation.
The following thought experiment from Reinhart's book shows a gap between popular opinion and precise science that many people find amazing: A mammography returns the correct diagnosis for patients with breast cancer with a 90 percent probability. However, the test also comes up with a diagnosis of breast cancer for approximately 7 percent of healthy patients, so that, in the case of positive findings, further diagnostic procedures are necessary for clarification. The question is now: Is this test suitable for effectively screening the population? If the mammography detects breast cancer in a randomly selected woman, how great is the probability that she really needs treatment?
Most people will think about this for a while, then subtract the 7 percent false positive rate from 100 percent in their heads and end up with a result of around 93 percent. However, this assumption is totally wrong. What is the correct result? Maybe 70 percent? Or even 50 percent? The amazing truth is that when a mammography performed on a randomly selected woman reports breast cancer, the diagnosis is correct in only around 9 percent of the cases.
Amazing Statistics
If your thought experiment led you to believe that the test accuracy was higher than it actually is, you probably fell into the typical base rate fallacy trap and forgot to consider in your calculations that, on average, only 0.8 percent of women in a given population have breast cancer.
Of 1,000 women, 992 thus do not have breast cancer, and in 7 percent of these cases, mammography will return the wrong diagnosis; that is, about 70 of the healthy women tested will be given incorrect results. Of the eight women with breast cancer, the test correctly diagnoses about seven (90 percent of eight), which means that, of the 77 breast cancer findings after mammography, only seven are correct (i.e., approximately 9 percent). Given this low accuracy rate, it is inadvisable to perform across-the-board tests; instead, only certain high-risk groups should be tested, for whom an imprecise test is still far better than no test at all.
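The head count above can be checked without rounding to whole patients. The following is a minimal sketch that applies the same reasoning to the three rates given in the text; the function name positive_predictive_value is an assumption chosen for illustration and does not come from the article's listings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Probability that a positive finding is correct, given the base rate
# of the condition, the detection rate among the sick (sensitivity),
# and the false positive rate among the healthy.
# (Hypothetical helper name, used for illustration only.)
sub positive_predictive_value {
    my ( $base_rate, $sensitivity, $false_positive_rate ) = @_;
    my $true_pos  = $base_rate * $sensitivity;
    my $false_pos = ( 1 - $base_rate ) * $false_positive_rate;
    return $true_pos / ( $true_pos + $false_pos );
}

# Figures from the text: 0.8 percent base rate, 90 percent detection
# rate, 7 percent false positives among healthy patients.
my $ppv = positive_predictive_value( 0.008, 0.90, 0.07 );
printf "Share of correct positive findings: %.1f percent\n", 100 * $ppv;
```

Without rounding to whole patients, the result is about 9.4 percent, in line with the roughly 9 percent derived from the 7-of-77 count. Raising the first argument shows why testing high-risk groups pays off: with a purely hypothetical base rate of 10 percent, the same test would be right in roughly 59 percent of its positive findings.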