In statistical computations, intuition can be very misleading
Guess Again
Even hardened scientists can make mistakes when interpreting statistics. Mathematical experiments can give you the right ideas to prevent this from happening, and quick simulations in Perl nicely illustrate and support the learning process.
If you hand somebody a die in a game of Ludo [1], and they throw a one on each of their first three turns, they are likely to become suspicious and check the sides of the die. That's just relying on intuition – but when can you scientifically demonstrate that the dice are loaded (Figure 1)? After five throws that all come up as ones? After ten throws?
Each experiment with dice is a game of probabilities. What exactly happens is a product of chance. It is not so much the results of a single throw that are relevant, but the tendency. A player could throw a one, three times in succession from pure bad luck. Although the odds are pretty low, it still happens, and you would be ill advised to jump to conclusions about the dice based on such a small number of attempts.
The Value of p
For this experiment, a scientist would start by defining a socalled null hypothesis (e.g., "The die is fair" or "The medication shows no effect in patients"). On the basis of the test results, this hypothesis would be either confirmed or rejected later on. The mistake of rejecting a correct null hypothesis is known by statisticians as a "Type I error" or an "Error of the first kind." Experiments define up front the maximum acceptable probability of this event happening; this value is known as the significance level of the experiment.
Another statistical tool, the socalled "pvalue" [2], is a probability value between 0 and 1 that can be computed during the experiment and that states how likely it is to see the result you just found – or one that is even more extreme (Figure 2). The smaller the pvalue, the more significant the test; thus, the null hypothesis is highly likely to be incorrect.
For example, if you toss a coin 20 times and it comes up heads 14 times (10 is what you would expect), the pvalue of 0.115 is still well above the threshold value of 5 percent (0.05), which is typical for scientific experiments [3]. The scientist can thus accept the null hypothesis ("The coin is fair") with a clear conscience and demonstrate with a maximum error probability of 5 percent that the coin has dropped as expected. If the coin came up heads 15 times out of 20 tosses, instead of 14, the pvalue would drop to 0.041 – below the 5 percent threshold – and both the null hypothesis and the quality of the coin would be begin to look questionable.
To Err is Human
The Perl script in Listing 1 [4] throws a good coin with the sides H
(for heads) and T
(for tails) a total of 1,000 times and then adds the number of times that it came up heads. The p_value()
function starting in line 23 then computes the pvalue from the observed result. The script's output helps you decide whether the coin tosses are regular, or whether there is an anomaly:
$ ./cointoss Rounds: 1000 Tails: 507 pvalue: 0.182979
Listing 1
cointoss
01 #!/usr/local/bin/perl w 02 use strict; 03 use Math::BigFloat; 04 05 my @sides = qw( H T ); 06 my $rounds = 1000; 07 my $tails = 0; 08 09 for ( 1 .. $rounds ) { 10 my $side = $sides[ rand scalar @sides ]; 11 12 if( $side eq "T" ) { 13 $tails++; 14 } 15 } 16 17 printf "Rounds: $rounds\n"; 18 printf "Tails: $tails\n"; 19 printf "pvalue: %f\n", 20 p_value( $tails, $rounds/2, $rounds ); 21 22 ########################################### 23 sub p_value { 24 ########################################### 25 my( $tails, $expect, $rounds ) = @_; 26 27 my @vals = ( $tails < $expect ? 28 ( 1 .. $tails ) : 29 ( $tails .. $rounds ) ); 30 31 my $sum = Math::BigFloat>new( 0 ); 32 33 for my $val ( @vals ) { 34 my $nok = 35 Math::BigFloat>new( $rounds ); 36 $nok>bnok( $val ); 37 $sum>badd( $nok ); 38 } 39 40 my $total = Math::BigFloat>new( 2 ); 41 $total>bpow( $rounds ); 42 43 return 2 * 44 $sum>bdiv( $total )>numify(); 45 }
In this example, out of 1,000 tosses, the coin came up heads 507 times; the pvalue is thus 0.18 and well above the threshold value of 5 percent. This means that there's no good reason to reject the null hypothesis.
The script randomly chooses the H
or T
symbol in the @sides
array in each of the 1,000 rounds, thereby deciding whether heads or tails was thrown. In the latter case, the $tails
counter in line 13 is incremented by 1 for the tally output and pvalue calculation later on.
Looking for Extremes
So, how do you compute the value of p? If the coin comes up heads 7 times from 10 throws in the experiment, then 8, 9, or 10 times heads would be an even more extreme result. Because the coin is symmetrical, and 8, 9, or 10 times tails is "more extreme" as well, the value of p also includes these values. The probability of k tosses coming up heads in an experiment with a series of n binomial distributed tosses is computed from the binomial coefficient (n/k) divided by the total number of combinations (2n):
The p_value()
function in lines 2345 uses the CPAN Math::BigFloat module to compute the binomial coefficient and for the ensuing division. The downstream steps in longer experiments result in numeric values that exceed the floating point capacity of most computers by far, whereas Math::BigFloat computes with an arbitrary degree of accuracy even when faced with values featuring a couple of thousand digits.
The bnok()
method in line 36 computes the binomial coefficient (n/k), and line 37 accumulates a subtotal with badd()
. If the value is below the expected average value (e.g., 10 out of 20 tosses), the algorithm looks for more extreme values to the left (i.e., in increasing order from 1, 2, … , up to the value shown in the experiment).
However, if the experimental value is on the righthand side of the bell curve, it counts from this value up to the maximum value. In both cases, line 43 multiplies the result by 2
because of the symmetry of the experiment (heads and tails are interchangeable).
Buy this article as PDF
(incl. VAT)
Buy Linux Magazine
Direct Download
Read full article as PDF:
Price $2.95
News

Linus Torvalds Invites Attackers to Join the Kernel Community
He wants attackers to join the community instead of attacking the code.

Oracle Donating Java EE to the Eclipse Foundation
Oracle is recommending a new name for the Java Enterprise Edition platform.

Reddit Closing Doors to Open Source
The peanut gallery of the Internet is closing its open source repository.

Largest Open Source Summit to Date Around the Corner
LinuxCon is now Open Source Summit.

Gnome is Celebrating it’s 20th Birthday
More than 33 releases since the first release of Gnome.

SQL Server Comes to RHEL; OpenShift Comes to Azure
Microsoft and Red Hat expand their partnership.

LibreOffice 5.4 Released
Comes with improved support for Microsoft Office file formats.

Red Hat to Drop Support for Btrfs
The company is building their own storage solution called Stratis.

Reinvent your network with DevOps tools and techniques:
Opportunities and Risks: Containers for DevOps
Automated Builds Using CentOS 7 and Kickstart
Elasticsearch, Logstash, and Kibana – The ELK stack
Are you ready for DevOps? Try your luck with the beta version of Linux Professional Institute's new DevOps exam.

Kolab Now Integrates Collabora Online
Kolab Now subscribers now have access to Collabora Online.