The Geometric Distribution

After looking at the most famous discrete distribution, the Binomial, as well as the Poisson distribution and the Beta distribution, it is time to take a look at the geometric distribution.

How Many Trials Until the First Success?

We use the geometric distribution when we perform independent trials, each of which can result in either success or failure with the same probability of success, and we want to know how many trials are needed to obtain the first success.

In symbols:

\( X \sim Geo(p) \\ \\ \)
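  • \(X\) is the number of trials needed to obtain the first success, so it takes the values 1, 2, 3, ...
  • \(r\) is a specific value of \(X\), that is, the number of the trial on which the first success occurs.
  • \(p\) is the probability of success on each trial.
  • We also define, as is natural: \(q = 1 - p\), the probability of failure on each trial.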

Here is where it gets interesting. We have:

\( \\ P(X=r) = p \times q ^ {r-1} \\ \)

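\(P(X=r)\) therefore denotes the probability that the first success occurs on trial number \(r\): the first \(r-1\) trials fail, each with probability \(q\), and trial \(r\) succeeds with probability \(p\).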
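As a quick check of this formula, the R sketch below computes \(p \times q^{r-1}\) by hand for the first few values of \(r\). The helper name geo_pmf and the value p = 0.4 are illustrative choices, not something defined earlier in this article. The comparison uses base R's dgeom, which is parameterised by the number of failures before the first success, so it is called with r - 1.

```r
# Hypothetical helper: P(X = r) for X ~ Geo(p), with r counted in trials (1, 2, 3, ...)
geo_pmf <- function(r, p) {
  q <- 1 - p
  p * q^(r - 1)
}

p <- 0.4   # illustrative value
r <- 1:5
by_hand <- geo_pmf(r, p)

# dgeom() counts failures before the first success, hence r - 1
built_in <- dgeom(r - 1, p)

print(data.frame(r = r, by_hand = by_hand, built_in = built_in))
```

The two columns should agree up to floating-point rounding.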
Let us continue our reasoning:

\( P(X > r) = q ^ {r} \)

This allows us to calculate the probability that more than r trials are needed to obtain the first success (in other words, that the first r trials all fail), as well as:

\( P(X \leq r) = 1 - q ^ {r} \\ \)

which helps us find the probability that r trials or fewer are needed to achieve the first success. The expected value is:

\( E(X) = \frac{1}{p} \\ \)

The variance is:

\( Var(X) = \frac{q}{p^{2}} \)
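The formulas in this section can be checked numerically. The following R sketch, which assumes an illustrative p = 0.3 and cut-off k = 5 (neither comes from the article), builds the probabilities \(P(X=r)\) up to a large truncation point and compares plain sums against the closed forms for the tail, the cumulative probability, the expected value and the variance.

```r
p <- 0.3                 # illustrative value
q <- 1 - p
r_max <- 1000            # truncation point; q^1000 is negligibly small
r <- 1:r_max
pmf <- p * q^(r - 1)     # P(X = r) for r = 1, ..., r_max

k <- 5                   # illustrative cut-off
tail_prob <- sum(pmf[r > k])    # P(X > k) by summation
cdf_prob  <- sum(pmf[r <= k])   # P(X <= k) by summation
mean_x <- sum(r * pmf)                  # E(X) by summation
var_x  <- sum(r^2 * pmf) - mean_x^2     # Var(X) = E(X^2) - E(X)^2

print(c(tail_sum = tail_prob, tail_formula = q^k))
print(c(cdf_sum = cdf_prob, cdf_formula = 1 - q^k))
print(c(mean_sum = mean_x, mean_formula = 1 / p))
print(c(var_sum = var_x, var_formula = q / p^2))
```

Each printed pair should match up to rounding, which is a sanity check rather than a proof.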

Worked Examples

Suppose the probability that an ice skater completes a course without incident is 0.4 on every attempt, and that the attempts are independent of one another. Therefore:

\( X \sim Geo(0.4) \\ \)

X is the number of attempts our skater must make in order to complete a course without any incident.

We are now ready to apply our new knowledge.

Let us calculate the expected number of attempts needed to achieve the first success (the successful attempt included):

\( E(X) = \frac{1}{p} \\ \)
therefore
\( E(X) = \frac{1}{0.4} = 2.5 \)

The variance in the number of attempts is quickly calculated:

\( Var(X) = \frac{q}{p^{2}} \\ \)
that is
\( \frac{0.6}{0.4^{2}} = \frac{0.6}{0.16} = 3.75 \\ \)
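To see these two numbers emerge from data, here is a minimal simulation sketch in R. The seed and the number of simulated skaters are arbitrary choices made for this illustration. Note that rgeom returns the number of failures before the first success, so 1 is added to count attempts.

```r
set.seed(1)        # arbitrary seed, only for reproducibility
n_sim <- 100000    # arbitrary number of simulated skaters

# rgeom() returns failures before the first success; add 1 to count attempts
attempts <- rgeom(n_sim, prob = 0.4) + 1

sim_mean <- mean(attempts)
sim_var  <- var(attempts)

print(c(simulated_mean = sim_mean, simulated_variance = sim_var))
```

The simulated mean and variance should land close to 2.5 and 3.75, with small differences due to sampling noise.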

The probability of succeeding on the second attempt, after having failed the first:

\( P(X=2) = p \times q = 0.4 \times 0.6 = 0.24 \\ \)
that is, 24%

The probability of succeeding in 4 attempts or fewer? Easy!

\( P(X \leq 4) = 1 - q^{4} = 1 - 0.6^{4} = 1 - 0.1296 \\ \)

That is 0.8704, or about 87%.

The probability of needing more than 4 attempts? A simple calculation:

\( P(X > 4) = q^{4} = 0.6^{4} \\ \)

That is 0.1296, or about 13%.


Computing in R

Now that we have the formulas well in mind, we can let our laziness take over and use R to do the heavy lifting.

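For P(X=2) with p=0.4:

dgeom(1, 0.4)

where 1 is the number of failures before the first success. R's dgeom counts failures rather than trials, so for P(X=r) we pass r-1. The result is 0.24, the same value we computed by hand.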

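For P(X<=4) with p=0.4, pgeom also takes the number of failures, so "4 attempts or fewer" becomes "at most 3 failures":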

pgeom(3, 0.4)
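The worked examples also asked for P(X>4), the expected value and the variance. A possible way to get them in R is sketched below. The variable names are illustrative, and the mean and variance are simply computed from the formulas given earlier rather than through a dedicated function.

```r
p <- 0.4
q <- 1 - p

# P(X > 4): more than 3 failures before the first success (upper tail)
p_more_than_4 <- pgeom(3, p, lower.tail = FALSE)

# Expected value and variance, directly from the formulas 1/p and q/p^2
expected_attempts <- 1 / p
variance_attempts <- q / p^2

print(c(p_more_than_4 = p_more_than_4,
        expected_attempts = expected_attempts,
        variance_attempts = variance_attempts))
```

Setting lower.tail = FALSE asks pgeom for the upper tail, which avoids computing 1 - pgeom(3, p) by hand.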

Simple, quick, and fun!


Further Reading

For an accessible yet thorough introduction to discrete probability distributions—including the geometric—Finalmente ho capito la statistica by Maurizio De Pra covers these topics in a clear and approachable style, ideal for building solid intuition.
