The Beta Distribution Explained Simply

The Beta distribution is a crucial probability distribution in Bayesian statistics.

In theoretical probability problems, we know the exact probability value of a single event, making it relatively straightforward to apply basic probability calculation rules to reach the desired result.

In real life, however, it’s much more common to deal with collections of observations, and it’s from this data that we must derive probability estimates.

To put it more clearly: in life, we almost never have access to the exact probability of an event; what we have instead are data and observations.
Deriving probabilities from observed data is what we call statistical inference.

The Beta distribution is continuous, and in this respect it differs from the binomial distribution, which, as we’ve seen, takes discrete values.

We define it through a probability density function, or PDF (no, not the well-known format created by Adobe…):

\( \mathrm{Beta}(p;\alpha,\beta)=\frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)} \)

where

p = the probability of the event
α = the number of times we observe our event of interest
β = the number of times our event of interest does NOT occur
and therefore:
α + β = the number of trials
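
To make the formula concrete, here is a minimal R sketch that writes the density out by hand and compares it with R's built-in dbeta. The helper name beta_pdf and the values α = 3, β = 7 are illustrative assumptions, not data from the example further down.

```r
# Beta density written out term by term from the formula
# (beta_pdf is a hypothetical helper name used only for this sketch).
beta_pdf <- function(p, alpha, beta_par) {
  # beta() is base R's beta function, the normalizing constant
  p^(alpha - 1) * (1 - p)^(beta_par - 1) / beta(alpha, beta_par)
}

# Illustrative values: 3 occurrences and 7 non-occurrences of the event.
p <- c(0.1, 0.3, 0.5)
manual  <- beta_pdf(p, 3, 7)
builtin <- dbeta(p, 3, 7)

print(manual)
print(builtin)
print(all.equal(manual, builtin))
```

The two vectors should agree up to floating-point error, which suggests that dbeta implements the same density as the formula.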

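The beta function in the denominator (not the β parameter) normalizes the density, so that the total area under the curve between 0 and 1 equals 1. Note that the density itself is not a probability and can take values above 1.
The beta function is defined as the integral of \( p^{\alpha-1}(1-p)^{\beta-1} \) over the interval from 0 to 1, and in practice software evaluates it numerically.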
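
As a rough illustration of how the normalizing constant can be obtained, the sketch below computes it in three ways: by numerical integration, through the gamma-function identity, and with base R's beta(). The values α = 3 and β = 7 are arbitrary choices made for this sketch.

```r
alpha <- 3
beta_par <- 7

# Definition: integral of t^(alpha - 1) * (1 - t)^(beta - 1) over [0, 1].
by_integration <- integrate(
  function(t) t^(alpha - 1) * (1 - t)^(beta_par - 1),
  lower = 0, upper = 1
)$value

# Identity: B(alpha, beta) = gamma(alpha) * gamma(beta) / gamma(alpha + beta).
by_gamma <- gamma(alpha) * gamma(beta_par) / gamma(alpha + beta_par)

# Base R's built-in beta function.
by_builtin <- beta(alpha, beta_par)

print(c(by_integration, by_gamma, by_builtin))
```

For these values all three results should be close to 1/252.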

The Beta distribution is a probability distribution over probabilities: since the quantity it models is itself a probability, its domain is the interval from 0 to 1.
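
A short R check can illustrate what this means in practice: the density is zero outside the interval from 0 to 1, and the area under it over that interval is 1. The parameters α = 30 and β = 70 are assumed values chosen only for this sketch.

```r
alpha <- 30
beta_par <- 70

# Outside [0, 1] the density is zero.
outside <- dbeta(c(-0.1, 1.2), alpha, beta_par)
print(outside)

# Over [0, 1] the area under the density is 1 (up to numerical error).
total <- integrate(function(x) dbeta(x, alpha, beta_par), 0, 1)$value
print(total)

# The density is not itself a probability: near its peak it exceeds 1.
peak_density <- dbeta(0.3, alpha, beta_par)
print(peak_density)
```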

Let’s look at a practical example of the Beta distribution using R.

Imagine that an online game organizer claims that at least 1 in 10 players wins a prize. We have the data, and we know that among the last 800 players, there were 65 winners.

The question we ask ourselves is: is the game organizer telling the truth based on our data? Can we consider that a player has at least a 10% chance of winning a prize when buying a ticket based on our sample?

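The answer to our question can be derived directly from the Beta distribution, taking α = 65 (winners) and β = 735 (non-winners).

We use the cumulative beta distribution:
β(0.1, 65, 735, TRUE)
Evaluated at 0.1, the cumulative distribution gives the probability that p is at most 0.1. What we are after is the complement: the probability that p lies between 0.1 and 1.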
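
In R, the cumulative beta distribution is available as pbeta. The sketch below, using the counts from our example, shows both the cumulative value at 0.1 and its complement; the variable names are illustrative.

```r
wins <- 65     # players who won a prize
losses <- 735  # players who did not (800 - 65)

# P(p <= 0.1): cumulative distribution function evaluated at 0.1.
p_below <- pbeta(0.1, wins, losses)

# P(p >= 0.1): the upper tail, i.e. the complement of the value above.
p_above <- pbeta(0.1, wins, losses, lower.tail = FALSE)

print(p_below)
print(p_above)
print(p_below + p_above)  # the two tails together cover everything
```

The upper tail is the quantity that answers the question about the organizer's claim.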

In R, a single line integrates the density between 0.1 and 1, which gives the probability that the chance of winning a prize is at least 10%:

integrate(function(x) dbeta(x, 65, 735), 0.1, 1)

0.03170546 with absolute error < 2.3e-06

The answer is right before our eyes. The probability that the true chance of winning is at least 10% is just 3.17%. In light of the data, the game organizer's claim is very unlikely to be true.
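
As an optional sanity check, the same figure can be approximated by simulation: draw many values from the Beta distribution and count how often they reach 0.1. This is only a sketch; the seed and the number of draws are arbitrary assumptions, and the estimate will vary slightly from run to run if the seed changes.

```r
set.seed(42)  # arbitrary seed, used only to make the run reproducible

# Draw plausible win rates from Beta(65, 735).
draws <- rbeta(100000, 65, 735)

# Share of draws in which the win rate is at least 10%.
estimate <- mean(draws >= 0.1)
print(estimate)
```

The estimate should land close to the 3.17% obtained by integration.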