{"id":3469,"date":"2026-03-01T20:31:56","date_gmt":"2026-03-01T19:31:56","guid":{"rendered":"https:\/\/www.gironi.it\/blog\/?p=3469"},"modified":"2026-03-02T09:32:03","modified_gmt":"2026-03-02T08:32:03","slug":"probability-distributions-discrete-distributions-and-the-binomial","status":"publish","type":"post","link":"https:\/\/www.gironi.it\/blog\/en\/probability-distributions-discrete-distributions-and-the-binomial\/","title":{"rendered":"Probability Distributions: Discrete Distributions and the Binomial"},"content":{"rendered":"<p>A <strong>random variable<\/strong> (also called a stochastic variable) is a variable that can take on different values depending on the outcome of some random phenomenon. In many statistics textbooks it is simply abbreviated as r.v. Its values are always numerical.<\/p>\n<p>When probability values are assigned to all the possible numerical values of a random variable x, the result is a <strong>probability distribution<\/strong>.<\/p>\n<p style=\"background-color:#f0f0f0;padding:1em;\">In even simpler terms: a random variable is a variable whose values are each associated with a probability of being observed. The set of all possible values of a random variable and their associated probabilities is called a <strong>probability distribution<\/strong>. 
The <strong>sum of all probabilities is 1<\/strong>.<\/p>\n<p><!--more--><\/p>\n<div style=\"border: 1px solid #ccc;padding: 1.2em 1.5em;margin: 1.5em 0;border-radius: 6px\">\n<h3 style=\"margin-top: 0\">What We&#8217;ll Cover<\/h3>\n<ul>\n<li><a href=\"#discrete-vs-continuous\">Discrete and Continuous Variables<\/a><\/li>\n<li><a href=\"#bernoulli\">The Bernoulli Random Variable<\/a><\/li>\n<li><a href=\"#binomial\">The Binomial Distribution<\/a><\/li>\n<li><a href=\"#mean-variance\">Mean, Expected Value, and Variance<\/a><\/li>\n<li><a href=\"#probability-density-example\">An Example: Computing the Probability Density<\/a><\/li>\n<li><a href=\"#other-distributions\">Other Discrete Distributions<\/a><\/li>\n<li><a href=\"#further-reading\">Further Reading<\/a><\/li>\n<\/ul>\n<\/div>\n<hr \/>\n<h2 id=\"discrete-vs-continuous\">Discrete and Continuous Variables<\/h2>\n<p>There are two main types of random variables: <strong>discrete<\/strong> and <strong>continuous<\/strong>.<\/p>\n<ul>\n<li>A <strong>discrete r.v.<\/strong> can take on a discrete (<strong>finite<\/strong> or countable) <strong>set of real numbers<\/strong>. That is, we could list all possible values in a table together with their respective probabilities. An example is the outcome of rolling a die: there are 6 possible outcomes, each with a probability of 1\/6 (and the sum of all probabilities, of course, equals 1).<\/li>\n<li>A <strong>continuous r.v.<\/strong> can take on <strong>all values within a real interval<\/strong>\u2014that is, an infinite number of values within any given interval. The probability that X falls within a given interval is represented by the <strong>area under the probability distribution<\/strong>. In the case of a continuous random variable, probabilities are represented by means of a <strong>probability density function<\/strong>. The total area under the curve (i.e. 
the total probability) equals 1.<\/li>\n<\/ul>\n<p>Depending on the case, we deal with various types of distributions. These are the most common:<\/p>\n<table>\n<thead>\n<tr>\n<th>Discrete distributions<\/th>\n<th>Continuous distributions<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>\n<ul>\n<li>Binomial<\/li>\n<li>Poisson<\/li>\n<li>Geometric<\/li>\n<\/ul>\n<\/td>\n<td>\n<ul>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/the-normal-distribution\/\">Normal<\/a><\/li>\n<li>Uniform<\/li>\n<li>Student&#8217;s t<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr \/>\n<h2 id=\"bernoulli\">Event Yes or Event No? The Bernoulli Random Variable<\/h2>\n<p>Consider a trial in which we are only interested in verifying whether a certain event has occurred or not. The random variable generated by such a trial will take the value 1 if the event has occurred, 0 otherwise. This r.v. is called a <strong>Bernoulli random variable<\/strong>.<\/p>\n<p>Any dichotomous trial can be represented by a Bernoulli random variable.<\/p>\n<figure class=\"aligncenter\"><a href=\"https:\/\/en.wikipedia.org\/wiki\/Jacob_Bernoulli\" target=\"_blank\" rel=\"noopener\"><img decoding=\"async\" src=\"https:\/\/www.gironi.it\/blog\/wp-content\/uploads\/2018\/09\/Jakob_Bernoulli-268x300.jpg\" alt=\"Jakob Bernoulli - the binomial distribution\" \/><\/a><figcaption>This is Mr. Jakob Bernoulli. The details are on Wikipedia for those interested&#8230;<\/figcaption><\/figure>\n<p>A bit of notation. We denote a Bernoulli r.v. as follows:<\/p>\n\\( x \\sim Bernoulli(\\pi) \\\\ \\)\n<p>Its mean is:<\/p>\n\\( E(x)=\\pi \\\\ \\)\n<p>And its variance is:<\/p>\n\\( V(x)=\\pi(1-\\pi) \\\\ \\)\n<p><strong>All trials that produce only 2 possible outcomes generate Bernoulli random variables<\/strong> (for example, tossing a coin). 
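The mean and variance above are easy to check empirically. A minimal sketch in R, using <code>rbinom()<\/code> with <code>size = 1<\/code> to simulate Bernoulli draws (the probability 0.3 is just an illustrative choice):<\/p>\n<pre><code class=\"language-r\"># simulate 100,000 Bernoulli trials, each equal to 1 with probability 0.3\nset.seed(42)\nx &lt;- rbinom(100000, size = 1, prob = 0.3)\nmean(x)  # close to the theoretical mean: 0.3\nvar(x)   # close to the theoretical variance: 0.3 * (1 - 0.3) = 0.21<\/code><\/pre>\n<p>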
Starting from this simple assumption, it is a very short step to the Binomial Distribution.<\/p>\n<hr \/>\n<h2 id=\"binomial\">The Binomial Distribution<\/h2>\n<p>Rather than dwelling on the conceptual aspects\u2014important as they are, and for which I refer to specialised texts\u2014what I want to do here is show in practice, and as clearly as possible, what we are talking about. Let us start with a definition and then look at the characteristics and a few practical examples.<\/p>\n<p><strong>The Binomial random variable can be understood as a sum of Bernoulli random variables.<\/strong><\/p>\n<p>What does this mean? Simply that if we repeat the success\u2013failure dichotomy of the Bernoulli random variable <em>n<\/em> times under the same conditions, the result will be a sequence of <em>n<\/em> independent sub-trials, each of which can be associated with a Bernoulli random variable.<\/p>\n<p>What are <strong>the characteristics of the binomial distribution<\/strong>?<\/p>\n<ul>\n<li>There is a <strong>fixed number of trials<\/strong> (<em>n<\/em>).<\/li>\n<li>Each trial has two possible outcomes: <strong>success<\/strong> or <strong>failure<\/strong>.<\/li>\n<li>The <strong>probability of success<\/strong> (<em>p<\/em>) is <strong>the same<\/strong> for every trial.<\/li>\n<li>The outcome of one trial does not affect any other (the trials are <strong>independent<\/strong>).<\/li>\n<\/ul>\n<p>If even one of these characteristics is absent, the binomial distribution does not apply.<\/p>\n<p style=\"background-color:#f0f0f0;padding:1em;\"><strong>From a practical standpoint, the binomial distribution allows us to calculate the probability of obtaining <em>r<\/em> successes in <em>n<\/em> independent trials.<\/strong><\/p>\n<p>The probability of a certain number of successes, <em>r<\/em>, depends on <em>r<\/em> itself, on the number of trials <em>n<\/em>, and on the individual probability, which we denote by <em>p<\/em>.<\/p>\n<p>The probability of <em>r<\/em> 
successes in <em>n<\/em> trials is given by:<\/p>\n\\( \\frac{n!}{r!(n-r)!} \\times p^r (1-p)^{n-r} \\\\ \\)\n<p>Looks difficult? It really is not (and in practice it turns out to be useful and even fun!).<\/p>\n<div style=\"border:1px dotted silver; padding:8px;\">\nNOTE: The part<br \/>\n\\(<br \/>\n\\frac{n!}{r!(n-r)!} \\\\<br \/>\n\\)<br \/>\nis called the <strong>binomial coefficient<\/strong>, and is found in textbooks written as:<br \/>\n\\(<br \/>\n{n\\choose r} \\\\<br \/>\n\\)\n<\/div>\n<p>First, let us recall that the symbol ! in mathematics denotes the <em>factorial<\/em>. As you will certainly remember, the factorial of 3, i.e. 3!, is: 3 &times; 2 &times; 1 = 6; the factorial of 4, i.e. 4!, is: 4 &times; 3 &times; 2 &times; 1 = 24; and so on (it will not escape notice that the factorial grows very, very quickly as the number increases&#8230;).<\/p>\n<blockquote>\n<p><strong>The factorial of a natural number is the product of all the integers from that number down to 1.<\/strong><\/p>\n<\/blockquote>\n<p>With that said, let us first see how to find the mean\u2014the centre of our distribution\u2014and the variance. This way, we will have everything we need for a few practical examples.<\/p>\n<hr \/>\n<h2 id=\"mean-variance\">Mean, Expected Value, and Variance of a Binomial Distribution<\/h2>\n<p>Let us call <em>x<\/em> our binomial random variable. We can write our problem as follows:<\/p>\n\\( x \\sim Binomial(n, p) \\\\ \\)\n<p>The mean is:<\/p>\n\\( E(x) = n \\times p \\\\ \\)\n<p>The variance is:<\/p>\n\\( Var(x) = n \\times p \\times (1 - p) \\\\ \\)\n<p>At this point, an example is in order. Let us calculate the variance of a distribution with size <em>n<\/em> = 10 and individual probability <em>p<\/em> = 0.5 (i.e. 50%). 
For instance, this could represent ten coin tosses.<\/p>\n\\( x \\sim Binomial(10, 0.5) \\\\ \\)\n<p>So the variance will be:<\/p>\n\\( Var(x) = 10 \\times 0.5 \\times (1 - 0.5) = 2.5 \\\\ \\)\n<p>And the mean, naturally, will be:<\/p>\n\\( E(x) = 10 \\times 0.5 = 5 \\\\ \\)\n<p><em>Side note: it is intuitive that if p = 1 - p = 0.5, the probability distribution will be symmetric. If p &lt; 0.5, it will be right-skewed, and if p &gt; 0.5, it will be left-skewed.<\/em><\/p>\n<hr \/>\n<h2 id=\"probability-density-example\">An Example: Computing the Probability Density<\/h2>\n<p>Let us now introduce the concept of <strong>probability density<\/strong> (for a discrete variable this is, strictly speaking, the probability <em>mass<\/em> function, although R and many textbooks keep the name &#8220;density&#8221;), which is what we will use most often in real-world applications. This is when, for example, we want to know the probability that exactly two out of ten coin tosses come up heads.<\/p>\n<p>To explain this more clearly, let us take a problem from a textbook:<\/p>\n<p><em>If I cross a black mouse with a white one, there is a 3\/4 probability that the offspring will be black and 1\/4 that it will be white. What is the probability that out of 7 offspring, exactly 3 are white?<\/em><\/p>\n<p>Let us write down the data straight away:<\/p>\n<ul>\n<li><em>n<\/em> = 7<\/li>\n<li><em>r<\/em> = 3<\/li>\n<li><em>p<\/em> = 1\/4, i.e. 0.25<\/li>\n<\/ul>\n<p>And now? Shall we do the calculations by hand? Why not:<\/p>\n\\( \\frac{n!}{r!(n-r)!} \\times p^r (1-p)^{n-r} \\\\ \\)\n<p>therefore:<\/p>\n\\( \\frac{7!}{3!4!} \\times 0.25^{3} \\times 0.75^{4} = 35 \\times 0.0049438 = 0.173 \\\\ \\)\n<p>That is, 17.3%.<\/p>\n<p>Doing calculations by hand is fun, but we are lazy and have R at our disposal. 
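As a middle step, the formula can be typed into R almost verbatim: the base function <code>choose()<\/code> computes the binomial coefficient, so the whole expression fits on one line:<\/p>\n<pre><code class=\"language-r\"># binomial coefficient times p^r times (1-p)^(n-r), with n = 7, r = 3, p = 0.25\nchoose(7, 3) * 0.25^3 * 0.75^4\n# [1] 0.1730347<\/code><\/pre>\n<p>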
In R, the probability density is computed by a simple function:<\/p>\n<p><strong>dbinom()<\/strong><\/p>\n<p>The problem is therefore solved with the simple instruction:<\/p>\n<pre><code class=\"language-r\">dbinom(3, 7, 0.25)\n# [1] 0.1730347<\/code><\/pre>\n<p>which gives us 0.1730347, i.e. 0.173 after rounding, so the answer is 17.3%.<\/p>\n<hr \/>\n<h2 id=\"other-distributions\">Other Discrete Distributions<\/h2>\n<p>There are equally interesting questions that call upon other discrete distributions:<\/p>\n<ul>\n<li>How many trials should we expect before obtaining a success? This is where the <a href=\"https:\/\/www.gironi.it\/blog\/la-distribuzione-geometrica\/\">geometric distribution<\/a> enters the scene.<\/li>\n<li>How many times can we expect an event to occur (or not) in a given time interval? That calls for the <a href=\"https:\/\/www.gironi.it\/blog\/la-distribuzione-di-poisson\/\">Poisson distribution<\/a>.<\/li>\n<li>Are we sampling from a population without replacement? Then we use the <a href=\"https:\/\/www.gironi.it\/blog\/en\/the-hypergeometric-distribution\/\">hypergeometric distribution<\/a>.<\/li>\n<\/ul>\n<p>As we can see, this is a vast and fascinating topic, which we will explore (lightly) across several articles. 
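Each of these distributions also has its own <code>d*<\/code> function in base R, used just like <code>dbinom()<\/code> (the parameter values below are purely illustrative):<\/p>\n<pre><code class=\"language-r\"># geometric: P(2 failures before the first success, with p = 0.5)\ndgeom(2, prob = 0.5)\n# [1] 0.125\n\n# Poisson: P(exactly 3 events when the mean rate is 2 per interval)\ndpois(3, lambda = 2)\n# [1] 0.180447\n\n# hypergeometric: P(1 white ball in 2 draws, without replacement,\n# from an urn with 5 white and 5 black balls)\ndhyper(1, m = 5, n = 5, k = 2)\n# [1] 0.5555556<\/code><\/pre>\n<p>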
In the next one, we will look at another important distribution: the <a href=\"https:\/\/www.gironi.it\/blog\/en\/the-beta-distribution-explained-simply\/\">beta distribution<\/a>, which plays a central role in Bayesian statistics.<\/p>\n<hr \/>\n<h3>You might also like<\/h3>\n<ul>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/first-steps-into-the-world-of-probability\/\">First Steps into the World of Probability<\/a><\/li>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/the-hypergeometric-distribution\/\">The Hypergeometric Distribution<\/a><\/li>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/the-negative-binomial-distribution\/\">The Negative Binomial Distribution<\/a><\/li>\n<\/ul>\n<hr \/>\n<h3 id=\"further-reading\">Further Reading<\/h3>\n<p>For an accessible yet thorough introduction to probability distributions and the reasoning behind them, <a href=\"https:\/\/www.amazon.it\/dp\/8867319396?tag=consulenzeinf-21\" rel=\"nofollow sponsored noopener\" target=\"_blank\"><em>Finalmente ho capito la statistica<\/em><\/a> by Maurizio De Pra covers discrete distributions\u2014including the binomial\u2014in a clear and approachable style, ideal for building solid intuition before moving on to more advanced topics.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A random variable (also called a stochastic variable) is a variable that can take on different values depending on some random phenomenon. In many statistics textbooks it is simply abbreviated as r.v. It is a numerical value. 
When probability values are assigned to all the possible numerical values of a random variable x, the result &hellip; <a href=\"https:\/\/www.gironi.it\/blog\/en\/probability-distributions-discrete-distributions-and-the-binomial\/\" class=\"more-link\">Leggi tutto<span class=\"screen-reader-text\"> &#8220;Probability Distributions: Discrete Distributions and the Binomial&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","footnotes":""},"categories":[161],"tags":[],"class_list":["post-3469","post","type-post","status-publish","format-standard","hentry","category-statistics"],"lang":"en","translations":{"en":3469,"it":807},"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false,"post-thumbnail":false},"uagb_author_info":{"display_name":"autore-articoli","author_link":"https:\/\/www.gironi.it\/blog\/author\/autore-articoli\/"},"uagb_comment_info":2,"uagb_excerpt":"A random variable (also called a stochastic variable) is a variable that can take on different values depending on some random phenomenon. In many statistics textbooks it is simply abbreviated as r.v. It is a numerical value. 
When probability values are assigned to all the possible numerical values of a random variable x, the result&hellip;","_links":{"self":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3469","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/comments?post=3469"}],"version-history":[{"count":1,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3469\/revisions"}],"predecessor-version":[{"id":3476,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3469\/revisions\/3476"}],"wp:attachment":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/media?parent=3469"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/categories?post=3469"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/tags?post=3469"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}