In statistical computations, intuition can be very misleading
Infected Script
Listing 2 simulates the experiment in Perl. Line 12 initializes an array of 1,000 women with a health status of 0 (i.e., "no findings"). The infect() function in line 50 then randomly sets eight of these entries to 1, simulating the 0.8 percent of women in the population who have breast cancer. To do so, line 55 uses the shuffle() function from the CPAN Algorithm::Numerical::Shuffle module, which is based on the Fisher-Yates method [6], to shuffle an array containing the index numbers of the patients array. The function then picks eight of the shuffled indices and sets the corresponding elements of the patients array to 1.
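The index-shuffling approach described above can be sketched in a few lines. This is a minimal illustration and not the code of Listing 2: the hand-written fisher_yates_shuffle() helper is a hypothetical stand-in for the CPAN module's shuffle(), and the signature of infect() shown here is an assumption.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the CPAN module's shuffle():
# returns a shuffled copy of its arguments (Fisher-Yates).
sub fisher_yates_shuffle {
    my @items = @_;
    for (my $i = $#items; $i > 0; $i--) {
        my $j = int(rand($i + 1));    # random position in 0 .. $i
        @items[$i, $j] = @items[$j, $i];
    }
    return @items;
}

# Set $count randomly chosen elements of the referenced array to 1.
# The parameter list is an assumption, not taken from Listing 2.
sub infect {
    my ($patients, $count) = @_;

    # Shuffle the index numbers, then use the first $count of them.
    my @indices = fisher_yates_shuffle(0 .. $#{$patients});
    $patients->[$_] = 1 for @indices[0 .. $count - 1];
    return;
}

my @patients = (0) x 1000;    # 0 means "no findings"
infect(\@patients, 8);        # 0.8 percent of 1,000 patients

my $ill = grep { $_ == 1 } @patients;
print "ill patients: $ill\n";
```

Shuffling the index numbers rather than drawing eight random positions directly guarantees that no patient is picked twice, so exactly eight entries end up set to 1.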
Listing 2: base-rate
The script repeats the while loop in line 16 until the "mammography" has returned 100,000 positive results. Each examination calls the examine() function in lines 34-47 and passes in the patient's actual health status, which is known in advance for test purposes. The diagnosis takes errors of the first kind (10 percent) and the second kind (7 percent) into account. For a positive finding, the function returns a true value to the main program; for a negative finding, it returns a false value. The results of running the script are in line with the earlier mathematical projections:
$ ./base-rate
Test score: 9.26%
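A loop of this kind might look like the following sketch, which is not the code of Listing 2. It assumes that the test misses 10 percent of actual cases and flags 7 percent of healthy patients, which is the reading of the two error rates that lands near the reported score. The true_positive_share() helper and both constants are hypothetical names introduced for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed error rates: 10 percent of actual cases are missed,
# 7 percent of healthy patients are flagged by mistake.
my $MISS_RATE        = 0.10;
my $FALSE_ALARM_RATE = 0.07;

# Return true for a positive finding and false for a negative one,
# given the patient's actual health status (1 = ill, 0 = healthy).
sub examine {
    my ($is_ill) = @_;
    return $is_ill ? rand() >= $MISS_RATE
                   : rand() <  $FALSE_ALARM_RATE;
}

# Examine randomly drawn patients until $wanted positive findings
# have come back; return the percentage of them that were truly ill.
sub true_positive_share {
    my ($patients, $wanted) = @_;
    my ($positives, $true_positives) = (0, 0);

    while ($positives < $wanted) {
        my $status = $patients->[ int(rand(scalar @{$patients})) ];
        next unless examine($status);
        $positives++;
        $true_positives++ if $status;
    }
    return 100 * $true_positives / $positives;
}

# 1,000 patients, eight of them ill. Their order does not matter
# here because patients are drawn at random.
my @patients = ((1) x 8, (0) x 992);
printf "Test score: %.2f%%\n", true_positive_share(\@patients, 100_000);
```

Under these assumptions, the expected share is 7.2 / (7.2 + 69.44), or about 9.4 percent: of 1,000 patients, an average of 7.2 ill ones and 69.44 healthy ones test positive. Individual runs scatter around that value by roughly a tenth of a percentage point.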
The Perl script confirms that a test with a significant false positive rate, applied to a group in which only a small fraction of people actually have the condition, produces positive results that are mostly false alarms. Rather than giving a positive test result more credit than it is worth, it might be advisable in this case to seek a second opinion.
Infos
[1] Ludo: https://en.wikipedia.org/wiki/Ludo_(board_game)
[2] p-Value: https://en.wikipedia.org/wiki/P-value
[3] Reinhart, Alex. Statistics Done Wrong: The Woefully Complete Guide. No Starch Press, 2015.
[4] Listings for this article: ftp://ftp.linux-magazine.com/pub/listings/magazine/178
[5] Base rate fallacy: https://en.wikipedia.org/wiki/Base_rate_fallacy
[6] Fisher-Yates shuffle: https://en.wikipedia.org/wiki/Fisher–Yates_shuffle