Take a look at how Bayes' Theorem can be applied repeatedly to increase our knowledge of the probability of a hypothesis being correct. We'll do this in the context of repeatedly tossing what might be a loaded coin in order to determine whether it is in fact loaded.
We will end by answering this question: is tossing a coin 150 times and updating the probability after each coin toss equivalent to running fifty sets of three coin tosses?
We all know this formula, but I've put it in because Jupyter Notebook includes support for $\TeX$ as standard, which is nice. I won't go through an explanation of it.
$$ P(H \mid E) = \frac{P(E \mid H) \, P(H)}{P(E)} $$
Let's say, for example, that we have 100 coins and we know that 10 of these coins are loaded and the probability of tossing heads with a loaded coin is 80% (I have no idea how, in practice, you would create such a coin). In this scenario, $H$ represents the hypothesis that the coin is loaded and $E$ represents an event, which in our case is the outcome of tossing the coin a number of times (for example, tossing four times and seeing Heads, Heads, Tails, Heads). We have $P(H)=0.1$ because we know that $10/100$ coins are loaded.
The probability of tossing such a loaded coin $n$ times and seeing $n_h$ heads and $n_t$ tails is given by $$ P(E \mid H) = {n \choose n_h} 0.8^{n_h} (1 - 0.8)^{n_t}, $$ while the probability of tossing a fair coin and seeing the same outcome is $$ P(E \mid \neg H) = {n \choose n_h} 0.5^n. $$
Take as an example $n=4$, $n_h=3$ and $n_t=1$, which gives
$$ P(E \mid H) = {4 \choose 3} 0.8^3 \, 0.2^1 = 4 \times 0.512 \times 0.2 = 0.4096, $$ while $P(E \mid \neg H) = {4 \choose 3} 0.5^4 = 0.25$. Applying Bayes' Theorem gives the posterior $$ P(H \mid E) = \frac{0.4096 \times 0.1}{0.4096 \times 0.1 + 0.25 \times 0.9} \approx 0.154. $$ Thus, if we were to see this result, our knowledge about whether the coin is loaded changes: the probability of it being loaded rises from 10% to about 15.4%.
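As a quick sanity check, this arithmetic can be reproduced in a few lines of Python (a sketch using only the standard library; `math.comb` requires Python 3.8+):

```python
import math

# Worked example: n = 4 tosses, 3 heads, 1 tail, prior P(H) = 0.1.
p_e_given_h = math.comb(4, 3) * 0.8**3 * 0.2**1  # Likelihood if loaded.
p_e_given_not_h = math.comb(4, 3) * 0.5**4       # Likelihood if fair.

# Bayes' Theorem: P(H|E) = P(E|H) P(H) / P(E).
p_e = p_e_given_h * 0.1 + p_e_given_not_h * 0.9
posterior = p_e_given_h * 0.1 / p_e

print(round(p_e_given_h, 4), round(posterior, 3))  # 0.4096 0.154
```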
import numpy as np
import pandas as pd
import scipy.special
import matplotlib.pyplot as plt
Pass the function the probability of tossing heads with a loaded coin, a list of coin tosses (where 0 represents tails and 1 represents heads) and the prior probability of the coin being loaded. It returns the posterior probability.
def bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, tosses, prior):
    """Calculate the probability that a coin is loaded from a series of coin tosses.

    To keep it simple, there is no error-checking in this function, so the
    caller must ensure that the values passed to it are valid.

    Parameters
    ----------
    prob_load : float
        Probability of tossing heads with a loaded coin.
    tosses : list of integers
        A list of integers representing a series of coin tosses.
        0 = tails
        1 = heads
        Example: [1, 0, 1, 1, 0] means three heads and two tails.
    prior : float
        Prior probability of the coin being loaded.

    Returns
    -------
    float
        Posterior probability.
    """
    prob_fair = 0.5  # We are comparing to a fair coin, so it has a 50% chance of heads.
    ntosses = len(tosses)
    nheads = sum(tosses)
    ntails = ntosses - nheads
    bcoef = scipy.special.comb(ntosses, nheads)  # Binomial coefficient, used twice.
    p_load = bcoef * prob_load**nheads * (1 - prob_load)**ntails  # Probability of these tosses if the coin is loaded.
    p_fair = bcoef * prob_fair**nheads * (1 - prob_fair)**ntails  # Probability of these tosses if the coin is fair.
    p_e = p_load * prior + p_fair * (1 - prior)  # Overall probability of these tosses.
    posterior = p_load * prior / p_e  # Posterior probability (Bayes' Theorem).
    return posterior
For a given number of coin tosses, increase the number of heads and see how the posterior probability increases.
We start with a prior probability of 10%. The probability of tossing heads with a loaded coin is 80%.
Because a loaded coin lands heads 80% of the time, we expect well over half of its tosses to be heads. So unless the proportion of heads is quite high, the posterior probability drops below the prior.
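The break-even point can be worked out exactly: the posterior exceeds the prior precisely when the tosses are more likely under "loaded" than under "fair", i.e. when $0.8^{n_h} 0.2^{n_t} > 0.5^{n}$. Taking logs turns this into a threshold on the fraction of heads (a sketch for the 80% loaded coin used here):

```python
import math

# Posterior > prior exactly when 0.8**h * 0.2**t > 0.5**(h + t).
# Taking logs: h * ln(0.8/0.5) > t * ln(0.5/0.2).
ratio = math.log(0.5 / 0.2) / math.log(0.8 / 0.5)  # Required heads-to-tails ratio.
frac = ratio / (1 + ratio)                          # Required fraction of heads.
print(round(frac, 3))  # 0.661
```

So with this particular coin, the posterior only rises when roughly two-thirds or more of the tosses come up heads, which matches the pattern in the table below.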
prior = 0.1
prob_load = 0.8
coin_series = [
    [0],
    [1],
    [0, 0],
    [0, 1],
    [1, 1],
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]
print("")
print("Posterior Coin tosses")
print("--------- -----------")
for tosses in coin_series:
    posterior = bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, tosses, prior)
    print("{0:<9.3f} {1}".format(posterior, tosses))
del prior, prob_load, coin_series, tosses, posterior
When the probability of tossing heads with the "loaded" coin is 50%, the same as a fair coin, the posterior probability is unchanged from the prior in every case.
You can think of this as trying to distinguish between one-pound coins and one-euro coins by tossing them. The type of coin has no bearing on the outcome of a toss, so tossing the coin does not affect our knowledge of its currency.
prior = 0.1
prob_load = 0.5
print("Posterior")
print("---------")
print("{0:.8f}".format(bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, [0, 0, 0], prior)))
print("{0:.8f}".format(bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, [0, 0, 1], prior)))
print("{0:.8f}".format(bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, [0, 1, 1], prior)))
print("{0:.8f}".format(bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, [1, 1, 1], prior)))
print("{0:.8f}".format(bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, [0, 0, 1, 1, 1], prior)))
del prior, prob_load
The next function returns both the posterior probability and the tosses themselves, for reference. (The tosses are returned as a sorted native Python list.)
def bayes_coin_toss_test(prob_load, ntosses, prior):
    """Calculate the probability of a coin being loaded by tossing it a number of times.

    Parameters
    ----------
    prob_load : float
        Probability of tossing heads with a loaded coin.
    ntosses : int
        Number of times to toss the coin.
    prior : float
        Prior probability of the coin being loaded.

    Returns
    -------
    float
        Posterior probability.
    list of ints
        Sorted list of the individual tosses (0 = tails, 1 = heads),
        as a native Python list.
    """
    coin = [1, 0]  # [Heads, tails].
    tosses = np.random.choice(coin, ntosses, p=[prob_load, 1 - prob_load], replace=True)
    posterior = bayes_posterior_probability_from_a_series_of_coin_tosses(prob_load, tosses, prior)
    return posterior, sorted(tosses.tolist())
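The tosses inside `bayes_coin_toss_test` come from `np.random.choice`, so every call gives different results. If you want a run to be repeatable, you can seed NumPy's legacy global generator first (a minimal sketch; the seed value 0 is arbitrary):

```python
import numpy as np

np.random.seed(0)  # Seed the legacy global generator used by np.random.choice.
first = np.random.choice([1, 0], 5, p=[0.8, 0.2], replace=True)

np.random.seed(0)  # The same seed reproduces the same tosses.
second = np.random.choice([1, 0], 5, p=[0.8, 0.2], replace=True)

print(first.tolist() == second.tolist())  # True
```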
The sequences of coin tosses are chosen randomly, and because the tosses are returned sorted there are only a few distinct outcomes, so duplicates appear in the output.
These results correspond to those in the earlier cell where we specified the sequences by hand. But now it's automated.
prior = 0.1
prob_load = 0.8
ntosses = 5
print("")
print("Posterior Coin tosses")
print("--------- -----------")
for _ in range(10):
    posterior, tosses = bayes_coin_toss_test(prob_load, ntosses, prior)
    print("{0:<9.3f} {1}".format(posterior, tosses))
del prior, prob_load, ntosses, posterior, tosses
You can run this cell multiple times to see how there is a lot of variance in how the posterior probability changes.
TODO: Run the experiment a few times and plot a chart for each set of results
prob_load = 0.8 # Probability of heads for a loaded coin.
prior = 0.01 # Prior probability, starting position.
ntosses = 3 # Number of coin-tosses in a single test.
ntests = 30 # How many tests to run. This number should increase as ntosses decreases.
posteriors = []
posteriors.append(prior) # Add the original prior as the first posterior.
for i in range(ntests):
    posterior, tosses = bayes_coin_toss_test(prob_load, ntosses, prior)
    posteriors.append(posterior)
    prior = posterior  # The posterior probability becomes the next prior probability.
del i, prob_load, ntests, ntosses, tosses, prior, posterior
plt.plot(posteriors)
plt.xlabel("Number of tests")
plt.ylabel("Posterior probability")
plt.show()
Note that each run has 50 tests with 3 tosses per test, for a total of 150 coin tosses. This total is used again in a later cell for comparison.
The results of each run become a column in the 'stacked' array. This array is used in the next cell, where the results are plotted.
Each test involves tossing the coin 'ntosses' times; each run comprises 'ntests' tests.
prob_load = 0.8        # Probability of heads for a loaded coin.
original_prior = 0.01  # Prior probability - this is the starting position for each run.
ntosses = 3            # Number of coin-tosses in a test.
ntests = 50            # How many tests in each run. This should be large
                       # enough to get the probability above some desired
                       # probability that the coin is loaded.
nruns = 100            # How many times to repeat a run of ntests tests.
                       # More runs give more precise (averaged) results.
stacked = np.empty(0)  # Posterior probabilities held in this array. This is the main result set.
results = {}           # Results of each test held here (prior, posterior and coin tosses).
for _ in range(nruns):
    prior = original_prior
    posteriors = [prior]  # The original prior is the first posterior.
    for i in range(ntests):
        posterior, tosses = bayes_coin_toss_test(prob_load, ntosses, prior)
        results[i] = {'prior': prior, 'posterior': posterior, 'tosses': tosses}
        posteriors.append(posterior)
        prior = posterior  # The posterior probability becomes the next prior probability.
    if stacked.shape[0] == 0:
        stacked = np.array(posteriors)
    else:
        stacked = np.column_stack((stacked, posteriors))
del prob_load, original_prior, ntests, nruns, prior, posterior, tosses
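As a design note, `np.column_stack` inside the loop copies the whole `stacked` array on every run. An alternative sketch (with illustrative dummy data standing in for the posterior traces) collects each run's posteriors in a plain Python list and stacks once at the end:

```python
import numpy as np

# Collect each run's trace in a list, then stack once. Appending columns
# to a NumPy array inside the loop copies the whole array every time,
# whereas list.append is cheap.
runs = []
for seed in range(3):
    rng = np.random.default_rng(seed)
    posteriors = rng.random(5)  # Stand-in for one run's posterior trace.
    runs.append(posteriors)

stacked = np.column_stack(runs)  # Each run becomes a column, as before.
print(stacked.shape)  # (5, 3)
```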
Collect the results in a pandas DataFrame (df_runs) and compute summary statistics. This DataFrame is used again in a later cell for comparison.
The probability of the coin being loaded increases, but there is a large variance in that probability in the early stages.
The variance depends on the value of ntosses in the previous cell: if you re-run that cell with a larger value of ntosses, the variance decreases, which is what you would expect.
df_runs = pd.DataFrame(stacked)
ncols = df_runs.shape[1]
df_runs['mean'] = df_runs[list(range(ncols))].apply(np.mean, axis=1)
df_runs['median'] = df_runs[list(range(ncols))].apply(np.median, axis=1)
df_runs['std'] = df_runs[list(range(ncols))].apply(np.std, axis=1)
# Plot the mean with +/- one standard deviation as a shaded region.
plt.plot(df_runs['mean'], label='mean')
plt.fill_between(range(df_runs.shape[0]), df_runs['mean'] - df_runs['std'],
                 df_runs['mean'] + df_runs['std'], color='red', alpha=0.1,
                 label='+/- 1 std dev')
plt.xlabel("Number of steps")
plt.ylabel("Posterior probability")
plt.legend(loc='lower right')
plt.show()
This time, instead of tossing the coin in groups, update the posterior after each individual toss. There are 150 coin tosses in total, as in the earlier cell with 50 tests of 3 tosses each.
The results are added to DataFrame df_runs_2, which is compared to the results in df_runs, created earlier.
prob_load = 0.8        # Probability of heads for a loaded coin.
original_prior = 0.01  # Prior probability - this is the starting position for each run.
ntosses = 1            # Number of coin-tosses in a test.
ntests = 150           # How many tests in each run. This should be large
                       # enough to get the probability above some desired
                       # probability that the coin is loaded.
nruns = 100            # How many times to repeat a run of ntests tests.
                       # More runs give more precise (averaged) results.
stacked = np.empty(0)  # Posterior probabilities held in this array. This is the main result set.
results = {}           # Results of each test held here (prior, posterior and coin tosses).
for _ in range(nruns):
    prior = original_prior
    posteriors = [prior]  # The original prior is the first posterior.
    for i in range(ntests):
        posterior, tosses = bayes_coin_toss_test(prob_load, ntosses, prior)
        results[i] = {'prior': prior, 'posterior': posterior, 'tosses': tosses}
        posteriors.append(posterior)
        prior = posterior  # The posterior probability becomes the next prior probability.
    if stacked.shape[0] == 0:
        stacked = np.array(posteriors)
    else:
        stacked = np.column_stack((stacked, posteriors))
del prob_load, original_prior, ntests, nruns, prior, posterior, tosses
# Plot these new results alongside the results from the previous cell.
# For the two sets of results to line up, we include only every 3rd point
# from this cell: nskip corresponds to ntosses in the earlier cell, whose
# results are in df_runs. If you change ntosses in that cell, change nskip
# here to match.
nskip = 3
df_runs_2 = pd.DataFrame(stacked)
ncols = df_runs_2.shape[1]
df_runs_2['mean'] = df_runs_2[list(range(ncols))].apply(np.mean, axis=1)
df_runs_2 = df_runs_2[::nskip].reset_index(drop=True)
plt.plot(df_runs['mean'], label='df_runs')
plt.plot(df_runs_2['mean'], label='df_runs_2')
plt.xlabel("Number of steps")
plt.ylabel("Posterior probability")
plt.legend(loc='lower right')
plt.show()
It looks like tossing a coin 150 times and updating after each toss is equivalent to tossing it in fifty sets of three tosses. That seemed likely from the outset, but it's nice to see it confirmed. Mind you, this is not a rigorous treatment.
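The equivalence can also be checked exactly, without simulation: in the batch update the binomial coefficient appears in both the numerator and the denominator of Bayes' Theorem and cancels, so one update over a whole series of tosses gives the same posterior as single-toss updates chained together. A minimal sketch, re-implementing the update with `math.comb` so it stands alone:

```python
import math

def coin_posterior(prob_load, tosses, prior):
    # One Bayes update for a series of tosses (1 = heads, 0 = tails).
    n, nheads = len(tosses), sum(tosses)
    ntails = n - nheads
    bcoef = math.comb(n, nheads)
    p_load = bcoef * prob_load**nheads * (1 - prob_load)**ntails
    p_fair = bcoef * 0.5**n
    return p_load * prior / (p_load * prior + p_fair * (1 - prior))

tosses = [1, 0, 1, 1]

# One batch update over all four tosses.
batch = coin_posterior(0.8, tosses, 0.1)

# Four single-toss updates, each posterior feeding in as the next prior.
sequential = 0.1
for t in tosses:
    sequential = coin_posterior(0.8, [t], sequential)

print(round(batch, 6), round(sequential, 6))  # 0.154008 0.154008
```

In odds form this is even clearer: each toss multiplies the prior odds by its likelihood ratio, and the order of the multiplications doesn't matter.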