The Beta Distribution Explained Simply

The Beta distribution is a crucial probability distribution in Bayesian statistics.

In theoretical probability problems, we know the exact probability value of a single event, making it relatively straightforward to apply basic probability calculation rules to reach the desired result.

In real life, however, it’s much more common to deal with collections of observations, and it’s from this data that we must derive probability estimates.

To put it more clearly: in life, we almost never have access to the exact probability value of an event; what we have instead is data and observations.
Deriving probabilities from observed data is what we call statistical inference.

The Beta distribution is continuous, and in this respect it differs from the binomial distribution, which, as we’ve seen, takes discrete values.
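
The difference between discrete and continuous can be made concrete with a few lines of R. This is only a minimal sketch: the values n = 10, p = 0.3 and the parameters 3 and 7 are arbitrary choices for illustration, while `dbinom`, `dbeta` and `integrate` are base R functions.

```r
# Binomial: probabilities are attached to the discrete counts k = 0, 1, ..., n
n <- 10
p <- 0.3
binom_probs <- dbinom(0:n, size = n, prob = p)
binom_total <- sum(binom_probs)        # the discrete probabilities sum to 1

# Beta: a density over the continuous interval [0, 1]
# (3 and 7 are illustrative shape parameters)
beta_area <- integrate(function(x) dbeta(x, 3, 7), 0, 1)$value  # the area is 1

binom_total
beta_area
```

For the binomial we add up a finite list of probabilities; for the Beta we have to integrate a curve.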

We define it through a probability density function, or PDF (no, not the well-known format created by Adobe…):

\( \mathrm{Beta}(p;\alpha,\beta)=\frac{p^{\alpha-1}\,(1-p)^{\beta-1}}{\mathrm{beta}(\alpha,\beta)} \)

where

p = the probability of an event
α = the number of times we observe our event of interest
β = the number of times our event of interest does NOT occur
and therefore:
α + β = the number of trials
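
To see the formula at work, it can be typed out directly and compared with R's built-in density `dbeta`. This is a sketch under stated assumptions: `beta_pdf` and `beta_param` are names I made up for the illustration, and the values (3 occurrences, 7 non-occurrences, evaluated at p = 0.3) are arbitrary.

```r
# The Beta density written out from the formula.
# beta() is base R's beta function; the second shape parameter is called
# beta_param here so that it does not hide that function.
beta_pdf <- function(p, alpha, beta_param) {
  p^(alpha - 1) * (1 - p)^(beta_param - 1) / beta(alpha, beta_param)
}

# Illustrative values: event observed 3 times, not observed 7 times
manual  <- beta_pdf(0.3, 3, 7)
builtin <- dbeta(0.3, 3, 7)

all.equal(manual, builtin)  # the two should agree
```

Note that the density at this point is larger than 1, which is perfectly legitimate for a continuous distribution.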

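The beta function (not the β value) in the denominator serves to normalize the result, so that the total area under the density curve equals 1. The density itself can exceed 1 at some points: it is the area, not the height, that represents probability.
The beta function is defined as an integral, \( \int_0^1 t^{\alpha-1}(1-t)^{\beta-1}\,dt \), because the distribution is continuous; in practice, statistical software computes it for us.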
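
The normalizing role of the beta function can be checked numerically. The sketch below assumes illustrative values of 3 and 7 for the two parameters; `kernel` and the other variable names are mine, while `beta`, `gamma` and `integrate` are base R functions.

```r
alpha <- 3
beta_param <- 7

# Numerator of the density: not yet normalized
kernel <- function(t) t^(alpha - 1) * (1 - t)^(beta_param - 1)

# The beta function is the area under that numerator between 0 and 1
b_integral <- integrate(kernel, 0, 1)$value

# The same quantity from base R's beta() and from gamma functions
b_builtin <- beta(alpha, beta_param)
b_gamma   <- gamma(alpha) * gamma(beta_param) / gamma(alpha + beta_param)

# Dividing the numerator by the beta function gives a total area of 1
total_area <- integrate(function(t) kernel(t) / b_builtin, 0, 1)$value

c(b_integral, b_builtin, b_gamma)
total_area
```

The three ways of computing the beta function should agree, and the normalized curve should have area 1.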

The Beta distribution is a probability distribution of probabilities: since it models a probability, its domain is the interval from 0 to 1.

Let’s look at a practical example of the beta distribution using R

Imagine that an online game organizer claims that at least 1 in 10 players wins a prize. We have the data, and we know that among the last 800 players, there were 65 winners: an observed win rate of 65/800, or about 8.1%.

The question we ask ourselves is: is the game organizer telling the truth based on our data? Can we consider that a player has at least a 10% chance of winning a prize when buying a ticket based on our sample?

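The solution to our question can be easily derived using the Beta distribution with our data: α = 65 (the winners) and β = 735 (the non-winners, 800 − 65).

We use the cumulative beta distribution, evaluated at 0.1. Writing F for that cumulative distribution, F(0.1; 65, 735) is the probability that the true win rate is at most 10%, so the probability we are after is its complement:
\( P(p \geq 0.1) = 1 - F(0.1;\, 65, 735) \)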
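
In R, the cumulative beta distribution is available as `pbeta`. Here is a minimal sketch of the calculation with our data; the variable names are mine and serve only the illustration.

```r
# Probability that the win rate is at most 10%, under Beta(65, 735)
p_below <- pbeta(0.1, 65, 735)

# Probability that the win rate is at least 10%: the upper tail
p_above <- pbeta(0.1, 65, 735, lower.tail = FALSE)

p_below
p_above
p_below + p_above  # the two tails together cover the whole interval
```

Asking for the upper tail directly with `lower.tail = FALSE` is equivalent to computing `1 - pbeta(0.1, 65, 735)`.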

In R, it takes just one line to find the area under our density curve between 0.1 and 1, which is the probability that the chance of winning a prize is at least 10%:

integrate(function(x) dbeta(x,65,735),0.1,1)

0.03170546 with absolute error < 2.3e-06

The answer is right before our eyes. Given our data, the probability that the true win rate is at least 10% is just 3.17%. In light of the data, what the game organizer claims is very unlikely to be true.
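
As a sanity check, the same figure can be approximated by simulation: we draw many plausible win rates from Beta(65, 735) and count how often they reach 10%. This is only a sketch; the seed and the number of draws are arbitrary choices of mine, and `rbeta` is the base R random generator for the Beta distribution.

```r
set.seed(42)     # arbitrary seed, only so the run is reproducible
n_draws <- 100000

# Simulated win rates, drawn from the Beta distribution of our data
draws <- rbeta(n_draws, 65, 735)

# Share of simulated win rates that reach the 10% claimed by the organizer
sim_share <- mean(draws >= 0.1)
sim_share
```

The share should land close to the 3.17% obtained by integration, with small differences due to sampling noise.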

Further Reading

Bayesian Statistics the Fun Way by Will Kurt is the natural next step: it shows why the Beta distribution is the queen of Bayesian inference, with concrete examples in R just like the ones we use here.

paolo
