Strange Coincidence

Programming Snapshot – Python Gambling

Author(s):

Can 10 heads in a row really occur in a coin toss? Or, can the lucky numbers in the lottery be 1, 2, 3, 4, 5, 6? We investigate the law of large numbers.

If you follow the stock market, you are likely to hear people boasting that their buy and sell strategies allegedly only generate profits and never losses. Anyone who has read A Random Walk down Wall Street [1] knows that wild stock market speculation only leads to player bankruptcy in the long term. But what about the few fund managers who have managed their deposits for 20 years in such a way that they actually score profits year after year?

According to A Random Walk down Wall Street, even in a coin toss competition with enough players, there would be a handful of "masterly" throwers who, to the sheer amazement of all other players, seemingly throw heads 20 times in a row without tails ever showing up. But how likely is that in reality?

Heads or Tails?

It is easy to calculate that the probability of the same side of the coin turning up twice in succession is 0.5x0.5 (i.e., 0.25), since the probability of a specific side showing once is 0.5 and the likelihood of two independent events coinciding works out to be the multiplication of the two individual probabilities. The odds on a hat trick (three successes in a row) are therefore 0.5^3=0.125, and a series of 20 is quite unlikely at 0.5^20=0.000000954. But still, if a program only tries often enough, it will happen sometime, and this is what I am testing for today. The short Python coin_throw() generator in Listing 1 [2] simulates coin tosses with a 50 percent probability of heads or tails coming up.

Listing 1

cointhrow

01 #!/usr/bin/env python3
02 import random
03
04 def coin_throw():
05   sides=('heads', 'tails')
06
07   while True:
08     yield sides[random.randint(0, 1)]
09
10 def experiment():
11   run     = 1
12   max_run = 0
13   prev    = ""
14   count   = 0
15
16   for side in coin_throw():
17     count += 1
18
19     if prev == side:
20       run += 1
21       if run > max_run:
22         max_run = run
23         print("max_run: %d %s (%d)" %
24               (max_run, side, count))
25     else:
26       prev = side
27       run  = 1
28
29 experiment()

Listing 1 is divided into the coin_throw() coin toss generator and the experiment evaluation in the experiment() function. In a for loop as embedded in line 16, it looks like coin_throw() would return a long list of result strings consisting of heads or tails, but in reality, the function is a dynamic generator that returns a new value whenever an iteration pump like the for loop asks if there is more coming.

In the Generator House

How does the generator, whose output is shown in Figure 1, work? The key is the yield command in line 8 of Listing 1. Python interrupts the execution of the current function for a yield, remembers its internal state, and returns the string (heads or tails) to the calling program in line 8. The randint(0,1) method returns the integer value   or 1 with 50 percent probability, and the script picks one of the two entries in the sides tuple.

Figure 1: Python generator coin toss.

The next time the coin_throw()function is called, the program flow returns to the previous location within the function to continue where yield left off. In this case, this is the endless loop with while True in line 7, whose condition is still true, whereupon another yield command again returns a random value. Thus the generator produces an endless sequence of values in coin_throw() and always performs a new coin toss if the for loop in the main program wants more.

The experiment() function's remaining code stores the number of identical coin toss outcomes in sequence so far in the variable run, the maximum longest sequence so far in max_run, the previous coin toss in prev (heads or tails), and the total number of tosses so far in count.

Fatal Double Down

The output from the script in Figure 2 reveals that after 21 million passes, a sequence of 23 tails in succession actually occurred. If a player had doubled the bet for the next game each time they lost, a strategy called "doubling down," they would have needed a total of 2^23 – 1=$8,388,607 of betting capital, assuming an initial bet of $1, not to go bankrupt if they had initially bet on heads, only to see 23 tails in a row with growing frustration.

Figure 2: Longest sequences of identical coin tosses.

Great Stuff

Python not only offers generators with the yield keyword; classes can also implement generators as iterators. For this the Pythonista defines two methods: __iter__() and __next__(). The quadruple underscores ("dunders") mark the official entry points for Python's standard library. If the Python interpreter sees a loop head like for n in Roulette(), it instantiates an object of the Roulette class, uses __iter__() to access its iterator, and then retrieves new values from it with calls to __next__() until the iterator throws an exception.

In the class defined in Listing 2, __iter__() conveniently returns the Roulette instance itself, because the class does not need a separate iterator, since it implements the iterator itself with __next__(). The latter always returns a new random number in the range 0 to 36 and never throws an exception, so that the flow of the for loop in the main program never stops.

Listing 2

roulette.py

01 #!/usr/bin/env python3
02 import random
03
04 class Roulette:
05   slots   = 36
06   numbers = range(0,slots+1)
07
08   def __iter__(self):
09     return self
10
11   def __next__(self):
12     return self.__class__.numbers[random.randint(0, self.__class__.slots)]

The Roulette class defines two class variables: slots as the highest number on the roulette wheel and numbers as a sequence of numbers from 0 to 36 inclusive. It makes sense to define the variables once only for the class and not to rebuild them for each instance or even every time the iterator is called.

Class or Instance?

Python's class variables differ from instance variables in that they are not accessed with self.variable, but with __class__.variable or self.__class__.variable. For read-only access, you could even reference the class variable using the self.variable instance path.

But if you modify the latter, you may be in for a surprise, because Python creates a new instance variable behind your back. The instance variable will be decoupled from the class variable so that each object will modify its own, instead of propagating changes to the class level. Also, methods do not find the class variable simply by its name; if you simply reference slots in __iter__(), you can expect the syntax checker to blow up in your face.

As so often in the Python world, there is a small but subtle difference between versions 2 and 3: The iterator entry into the generator class goes by the name of next in Python 2 and not __next__ as in Python 3; programmers who want to use both versions thus typically simply define another next() method, which passes the parameters fed to it to __next__() without modification. Python 3 does not use next(), so the compatibility trick does no harm there.

The output of the statistical evaluation of the roulette generator is shown in Listing 3. After 27 rounds, a doublet appeared for the first time: the number 11 occurred twice in a row. After 6,249 rounds of Faites vos jeux, 15 occurred three times in a row; after 57,393 games, 34 occurred four times in a row, and so on.

Listing 3

roulette-run

1 $ ./roulette
2 max_run: 2 11 (27)
3 max_run: 3 15 (6249)
4 max_run: 4 34 (57393)
5 max_run: 5 1 (3363284)
6 max_run: 6 0 (95846456)
7 max_run: 7 26 (357289507)

Tumultuous Scenes

What would happen in Las Vegas at a roulette table if zero came up come six times in a row as in Listing 3 after 95 million rounds? Tumultuous scenes would probably take place in the casino before the pit boss appeared and sent the croupier home for the day, because every player at the table would immediately suspect that something fishy was going on. But, seen statistically, everything is above board; it's an inevitable fact that even random values will repeat at some time.

Lottery Winner

What would TV viewers think if the lottery lady announced that the numbers drawn from a bucket with 49 balls were 1, 2, 3, 4, 5, and 6? Since the probability of getting all six numbers right is about 1:14 million and there are 44 combinations of consecutive lottery number combinations (1, 2, 3, 4, 5, 6 through 44, 45, 46, 47, 48, 49), the chance of a straight in the lottery is about 1:318,000 – this means that the incredible event would occur relatively quickly with a fast draw generator.

Listing 4 shows an automatic drawing machine in the lotto_draw() function. From 49 numbered balls in the numbers list, it draws six random numbers and then removes them to prevent double draws. Since it takes a significant amount of compute time to remove an element from a Python list and move up the remaining elements to close the gap, the function swaps the value of the selected element with the last element in the list and reduces the list length size by one – much faster!

Listing 4

lotto

01 #!/usr/bin/env python3
02 import random
03
04 def lotto_draw():
05   total   = 49
06   draws   = 6
07   numbers = list(range(1,total+1))
08   size    = total
09   result  = []
10
11   for _ in range(draws):
12     idx = random.randrange(size)
13     result.append(numbers[idx])
14     numbers[idx] = numbers[size-1]
15     size -= 1
16
17   return sorted(result)
18
19 def is_consecutive(draw):
20   prev    = ""
21   for number in draw:
22     if prev < 0:
23       prev=number
24     elif prev + 1 == number:
25       prev = number
26     else:
27       return False
28   return True
29
30 count = 0
31 while True:
32   count += 1
33   draw=lotto_draw()
34   if is_consecutive(draw):
35       print("%d: %s" % (count, str(draw)))
36       break

Following this algorithm, lotto_draw() returns a sorted list of six randomly selected balls. The main program starting in line 30 uses is_consecutive() to check whether the drawn numbers each differ only by one from their predecessor. If this is the case, line 35 prints the number of draws in count and the lucky numbers that led to the termination. Figure 3 shows that this sometimes occurs after 30,000 passes; sometimes, however, it takes more than 800,000 – purely random, but within the calculated probability.

Figure 3: The lottery generator determines the number of draws until it finds a curious outcome.

Python Tricks [3] by Dan Bader is recommended for implementing this and other cool Python tricks. It shows a multitude of everyday programming tasks with elegant Python solutions. It is perfectly suited for users of other programming languages (like Perl!) who are mainly interested in converting typical idioms into clean Python and don't want to start with Adam and Eve and "Hello World."

Infos

  1. Malkiel, Burton G. A Random Walk down Wall Street. Norton & Company, 2016: https://www.amazon.com/Random-Walk-Down-Wall-Street-ebook/dp/B00QH9NTSI
  2. Listings for this article: ftp://ftp.linux-magazine.com/pub/listings/linux-magazine.com/212/
  3. Bader, Dan. Python Tricks. Dan Bader, 2017: https://dbader.org/products/python-tricks-book/

The Author

Mike Schilli works as a software engineer in the San Francisco Bay area, California. Each month in his column, which has been running since 1997, he researches practical applications of various programming languages. If you email him at mailto:mschilli@perlmeister.com he will gladly answer any questions.