# Bernoulli and Binomial Distributions

A Bernoulli distribution is the probability distribution of a random variable that takes the value 1 with probability $p$ and the value 0 with probability $1 - p$, i.e.

$$
P(X=k) = \begin{cases}
1-p & \text{for}\ k=0 \\
p & \text{for}\ k=1
\end{cases}
$$

We will use the example of left-handedness. Approximately 10% of the population are left-handed (p=0.1).

Out of a random sample of 10 people, what is the probability that exactly 3 are left-handed?

We assign a 1 to each person if they are left-handed and a 0 otherwise:

- $P(X=1) = 0.1$
- $P(X=0) = 0.9$
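The two probabilities above can be written as a small function and checked against a simulation. This is a minimal sketch using only the standard library; the names `bernoulli_pmf`, `rng` and `n_draws` are introduced here for illustration, and the seed is arbitrary.

```python
import random

def bernoulli_pmf(k, p):
    """Return P(X = k) for a Bernoulli variable with success probability p."""
    if k == 1:
        return p
    if k == 0:
        return 1 - p
    return 0.0  # a Bernoulli variable only takes the values 0 and 1

p = 0.1  # probability that a person is left-handed (from the text)

# Simulate many independent people; the seed only makes the run repeatable.
rng = random.Random(0)
n_draws = 100_000
draws = [1 if rng.random() < p else 0 for _ in range(n_draws)]
left_handed_fraction = sum(draws) / n_draws

print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))
print(left_handed_fraction)  # should be close to p
```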

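A binomial distribution is derived from the Bernoulli distribution: it gives the probability of a given number of successes in $n$ independent Bernoulli trials that all share the same $p$.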

We’ll start with the simpler problem:

What is the probability of the *first* 3 people we pick being left-handed, followed by 7 people being right-handed?

This is just $0.1^3 \times 0.9^7$:

```
0.1 ** 3 * 0.9 ** 7
```

What if we wanted the *last* 3 people to be left-handed?

This is just $0.9^7 \times 0.1^3$, the same answer.

In fact, no matter where the 3 left-handed people fall among the 10, each arrangement has the same probability ($\approx 4.8 \times 10^{-4}$).

So we have to add up this probability over every way of choosing which 3 of the 10 people are left-handed.

There are $10!$ ways to order 10 people. Reordering the 3 left-handed people among themselves ($3!$ ways) or the 7 right-handed people among themselves ($7!$ ways) does not produce a new arrangement, so we divide those orderings out.

The number of distinct arrangements is therefore:

$$\dfrac{10!}{3!\ 7!}$$

```
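from math import factorial
# number of ways to choose which 3 of the 10 people are left-handed
# (integer division keeps the count an exact integer)
factorial(10) // (factorial(3) * factorial(7))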
```

This is more commonly called “10 choose 3”. The general “n choose k” notation is written as:

$$
\binom{n}{k} = \dfrac{n!}{k!\ (n-k)!}
$$
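As a sanity check on the counting argument, the sketch below enumerates every way of placing 3 left-handed people among 10 positions and confirms both the count and the claim that each arrangement is equally likely. It assumes Python 3.8+ for `math.comb`; the helper `arrangement_probability` is a name introduced here for illustration.

```python
from itertools import combinations
from math import comb, isclose

n, k, p = 10, 3, 0.1

# Each arrangement is the set of positions (0-9) occupied by left-handed people.
arrangements = list(combinations(range(n), k))

def arrangement_probability(left_positions):
    """Multiply the per-person probabilities for one specific ordering."""
    prob = 1.0
    for position in range(n):
        prob *= p if position in left_positions else 1 - p
    return prob

probs = [arrangement_probability(a) for a in arrangements]

print(len(arrangements), comb(n, k))             # the count, computed two ways
print(all(isclose(x, probs[0]) for x in probs))  # is every ordering equally likely?
```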

We can now calculate the probability that exactly 3 people are left-handed in a random selection of 10 people as:

$$
P(X=3) = \binom{10}{3} (0.1)^3 (0.9)^7
$$

```
(factorial(10) / (factorial(3) * factorial(7))) * 0.1 ** 3 * 0.9 ** 7
```

$P(X=3) \approx 0.057$

This generalises to $k$ successes in $n$ trials, each with success probability $p$:

$$
P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}
$$
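The general formula translates almost directly into Python. The following is a minimal sketch, assuming Python 3.8+ for `math.comb`; `binomial_pmf` is a name chosen here for illustration, not a library function.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each succeeding with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Left-handedness example: n = 10 people, p = 0.1
pmf_values = [binomial_pmf(k, 10, 0.1) for k in range(11)]

print(binomial_pmf(3, 10, 0.1))
print(sum(pmf_values))  # probabilities over all k should sum to 1 (up to rounding)
```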

SciPy’s `stats` module provides a `binom` distribution object that can be used to calculate these probabilities:

```
# parameters are k, n and p
from scipy.stats import binom
binom.pmf(3, 10, 0.1)
```

We can use this function to calculate the probability of 3 *or fewer* people being left-handed in a selection of 10 people:

$$
P(X \leq 3) = \sum_{i=0}^{3} \binom{10}{i} (0.1)^i (0.9)^{10-i}
$$

```
sum([binom.pmf(x, 10, 0.1) for x in range(4)])
```

$P(X \leq 3) \approx 0.987$
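Summing the individual probabilities works, but `scipy.stats.binom` also exposes the cumulative distribution function directly. The sketch below uses `binom.cdf` for $P(X \leq 3)$ and `binom.sf` (the survival function) for the complement $P(X > 3)$; the variable names are chosen here for illustration.

```python
from scipy.stats import binom

n, p = 10, 0.1

p_at_most_3 = binom.cdf(3, n, p)   # P(X <= 3)
p_more_than_3 = binom.sf(3, n, p)  # P(X > 3), i.e. 1 - cdf

print(p_at_most_3, p_more_than_3)
```

For a handful of terms the explicit sum is just as good; the cumulative functions mainly help when $n$ is large.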

We can also plot the probability of each outcome, from 0 up to all 10 people being left-handed:

```
import matplotlib.pyplot as plt
plt.bar(range(11), [binom.pmf(x, 10, 0.1) for x in range(11)])
plt.xlabel('k')
plt.ylabel('P(X=k)')
plt.title('Binomial PMF')
plt.show()
```

The plot shows that the chance of getting more than 6 left-handed people in a random group of 10 is negligible (about 9 in a million).
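How small the upper tail is can be checked numerically rather than read off the chart. This sketch recomputes the binomial probabilities with the standard library and sums them above each threshold; `binomial_pmf` and `tail` are names introduced here for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.1

# tail[m] = P(X > m): the chance of seeing more than m left-handed people
tail = {
    m: sum(binomial_pmf(k, n, p) for k in range(m + 1, n + 1))
    for m in range(n)
}

for m, prob in tail.items():
    print(f"P(X > {m}) = {prob:.2e}")
```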

### Roulette

On an American roulette wheel there are 38 pockets:

- 18 black
- 18 red
- 2 green

Suppose we bet on black 10 times in a row. What is the probability of winning more than half of these bets?

$$
P(X \gt 5) = \sum_{i=6}^{10} \binom{10}{i} \bigg(\dfrac{18}{38}\bigg)^i \bigg(1-\dfrac{18}{38}\bigg)^{10-i}
$$

```
p = 18 / 38
sum([binom.pmf(x, 10, p) for x in range(6, 11)])
```

$P(X \gt 5) \approx 0.314$
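Because the win probability is the exact ratio 18/38, the same sum can be evaluated without floating-point rounding. The sketch below uses `fractions.Fraction` and also splits the remaining probability into ties (exactly 5 wins) and losing sessions (fewer than 5 wins); the helper `pmf` and the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

p_black = Fraction(18, 38)  # 18 black pockets out of 38
n_bets = 10

def pmf(k):
    """Exact probability of winning exactly k of the n_bets bets."""
    return comb(n_bets, k) * p_black ** k * (1 - p_black) ** (n_bets - k)

p_win_majority = sum(pmf(k) for k in range(6, n_bets + 1))  # 6 to 10 wins
p_tie = pmf(5)                                              # exactly 5 wins
p_lose_majority = sum(pmf(k) for k in range(0, 5))          # 0 to 4 wins

print(float(p_win_majority), float(p_tie), float(p_lose_majority))
```

Since every term is an exact fraction, the three probabilities add up to exactly 1.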

# Poisson Distribution

A Poisson distribution is a limiting version of the binomial distribution, where $n$ becomes large, $p$ becomes small, and $np$ approaches some value $\lambda$, which is the mean number of events.

The Poisson distribution models the number of events occurring in a fixed interval of time, and it can also be used for other kinds of interval such as distance, area or volume. Examples that may follow a Poisson distribution include the number of phone calls received by a call center per hour and the number of decay events per second from a radioactive source.

The probability of observing $k$ events is:

$P(X=k) = e^{-\lambda} \dfrac{\lambda^k}{k!}$
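The limiting relationship with the binomial distribution can be seen numerically by holding $np = \lambda$ fixed and letting $n$ grow. This is a sketch using only the standard library; the values $\lambda = 3$ and $k = 2$ are arbitrary illustrative choices, and `binomial_pmf`, `poisson_pmf` and `gaps` are names introduced here.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)

lam, k = 3.0, 2  # illustrative values, not taken from the text

# Absolute gap between the two distributions as n grows with n * p = lam
gaps = {}
for n in (10, 100, 1_000, 10_000):
    gaps[n] = abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
    print(n, binomial_pmf(k, n, lam / n), poisson_pmf(k, lam), gaps[n])
```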

The average number of goals in a World Cup football match is 2.5.

We would like to know the probability of 4 goals in a match.

```
from math import exp, factorial
_lambda = 2.5  # average number of goals per match
k = 4          # number of goals we want the probability of
exp(-_lambda) * _lambda ** k / factorial(k)
```

Again, SciPy has built-in functions for this, which we can use to calculate the probability of any number of goals in a World Cup match.

```
# parameters are k and lambda
from scipy.stats import poisson
import matplotlib.pyplot as plt
plt.bar(range(11), [poisson.pmf(k, _lambda) for k in range(11)])
plt.xlabel('k (number of goals)')
plt.ylabel('P(X=k)')
plt.title('Poisson PMF')
plt.show()
```
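A few numbers help when reading the chart. The sketch below recomputes the Poisson probabilities with the standard library, then checks that they sum to 1 and that their mean recovers $\lambda = 2.5$. The cut-off of 50 terms is an assumption: it is far enough into the tail that the omitted probability is negligible for this $\lambda$. `poisson_pmf` is a name introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)

lam = 2.5  # average goals per match, from the text

p_exactly_4 = poisson_pmf(4, lam)
p_at_most_4 = sum(poisson_pmf(k, lam) for k in range(5))

# Truncate the infinite sums at 50 terms (assumed to be enough for lam = 2.5)
total = sum(poisson_pmf(k, lam) for k in range(50))
mean = sum(k * poisson_pmf(k, lam) for k in range(50))

print(p_exactly_4, p_at_most_4)
print(total, mean)  # should be close to 1 and to lam respectively
```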