<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>distribution &#8211; paologironi blog</title>
	<atom:link href="https://www.gironi.it/blog/en/tag/distribution/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.gironi.it/blog</link>
	<description>Scattered notes on (retro) computing, data analysis, statistics, SEO, and things that change</description>
	<lastBuildDate>Sun, 08 Dec 2024 11:02:24 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	
	<item>
		<title>The Hypergeometric Distribution</title>
		<link>https://www.gironi.it/blog/en/the-hypergeometric-distribution/</link>
					<comments>https://www.gironi.it/blog/en/the-hypergeometric-distribution/#respond</comments>
		
		<dc:creator><![CDATA[paolo]]></dc:creator>
		<pubDate>Fri, 24 Mar 2023 13:15:00 +0000</pubDate>
				<category><![CDATA[probability]]></category>
		<category><![CDATA[distribution]]></category>
		<category><![CDATA[hypergeometric]]></category>
		<guid isPermaLink="false">https://www.gironi.it/blog/?p=3269</guid>

					<description><![CDATA[We have seen that the binomial distribution is based on the hypothesis of an infinite population N, a condition that can be practically realized by sampling from a finite population with replacement. If this does not occur, meaning if we are sampling from a population without replacement, we must use the hypergeometric distribution. (In reality, &#8230; <a href="https://www.gironi.it/blog/en/the-hypergeometric-distribution/" class="more-link">Continue reading<span class="screen-reader-text"> "The Hypergeometric Distribution"</span></a>]]></description>
										<content:encoded><![CDATA[
<p>We have seen that the <a href="https://www.gironi.it/blog/distribuzioni-di-probabilita-distribuzioni-discrete-la-binomiale/" data-type="post" data-id="807" target="_blank" rel="noreferrer noopener"><strong>binomial distribution</strong></a> is based on the hypothesis of an infinite population N, a condition that can be practically realized by sampling from a finite population <strong>with replacement</strong>.</p>



<p>If this does not occur, meaning if we are sampling from a population <strong>without replacement</strong>, we must use the <strong>hypergeometric distribution</strong>. (In practice, when N is large compared with the size of the sample, the hypergeometric probability mass function tends towards the binomial).</p>



<p class="has-light-gray-background-color has-background">The hypergeometric distribution is used to calculate the probability of obtaining a certain number of successes in a series of binary trials (yes or no), which are dependent and have a variable probability of success.</p>



<p>The hypergeometric distribution allows us to answer questions like:</p>



<p class="has-light-gray-background-color has-background">If I draw a sample of size n from a population of N elements, M of which meet certain requirements, what is the probability that exactly x of the elements drawn meet those requirements?</p>



<span id="more-3269"></span>


				<div class="wp-block-uagb-table-of-contents uagb-toc__align-left uagb-toc__columns-1  uagb-block-f5fe3cc3      "
					data-scroll= "1"
					data-offset= "30"
					style=""
				>
				<div class="uagb-toc__wrap">
						<div class="uagb-toc__title">
							What we will discuss						</div>
																						<div class="uagb-toc__list-wrap ">
						<ol class="uagb-toc__list"><li class="uagb-toc__list"><a href="#lets-start-with-the-formula" class="uagb-toc-link__trigger">Let&#039;s start with the formula</a><li class="uagb-toc__list"><a href="#the-hypergeometric-distribution-explained-with-examples" class="uagb-toc-link__trigger">The hypergeometric distribution explained with examples</a><li class="uagb-toc__list"><a href="#can-an-example-with-an-urn-and-balls-be-missing" class="uagb-toc-link__trigger">Can an example with an urn and balls be missing?</a><li class="uagb-toc__list"><a href="#further-examination-of-the-hypergeometric-distribution" class="uagb-toc-link__trigger">Further Examination of the Hypergeometric Distribution</a></ol>					</div>
									</div>
				</div>
			


<h2 class="wp-block-heading">Let&#8217;s start with the formula</h2>



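<p>I express my distribution in the form of a formula:</p>



\(
f(x|N,M,n)=\frac{C^{N-M}_{n-x}\times C^{M}_{x}}{C^{N}_{n}}
\)



<p>where N is the size of the population, M is the number of elements in the population that meet the requirement, n is the size of the sample, x is the number of elements in the sample that meet the requirement, and \(C^{a}_{b}\) is the binomial coefficient, that is, the number of ways of choosing b elements out of a.</p>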



<h2 class="wp-block-heading">The hypergeometric distribution explained with examples</h2>



<p>We know that a batch of 30 pieces contains 6 malfunctioning pieces.<br>If I take a sample of 5 pieces, what is the probability of finding exactly 2 defective pieces?</p>



<p>I&#8217;ll immediately write down the data:</p>



<ul class="wp-block-list">
<li>N=30 (<em>the total number of pieces in my batch</em>)</li>



<li>M=6 (<em>the total malfunctioning pieces present in the batch</em>)</li>



<li>x=2 (<em>I want to know the probability of finding 2 defective pieces</em>)</li>



<li>n=5 (<em>the size of my sample</em>)</li>
</ul>
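


<p>As a quick check by hand, substituting these values into the formula (the batch holds N&#8722;M=24 working pieces, and the sample must contain n&#8722;x=3 of them) gives:</p>



\(
f(2|30,6,5)=\frac{C^{24}_{3}\times C^{6}_{2}}{C^{30}_{5}}=\frac{2024\times 15}{142506}\approx 0.2130
\)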



<p>Let&#8217;s see how to solve the same problem in R:</p>



<pre class="wp-block-preformatted"># Definition of the hypergeometric distribution parameters
x &lt;- 2 # I want to know the probability of finding 2 defective pieces
n &lt;- 5 # the size of my sample
M &lt;- 6 # the total malfunctioning pieces present in the batch
N &lt;- 30 # the total number of pieces in my batch

# Probability calculation with the dhyper function
prob &lt;- dhyper(x, M, N - M, n)
prob</pre>



<p>and I get the output:</p>



<pre class="wp-block-preformatted">[1] 0.2130437</pre>
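


<p>As a cross-check, the same value can be obtained by writing the formula out with R&#8217;s <code>choose</code> function. This is only a sketch: the variable name <code>manual</code> is introduced here for illustration.</p>



<pre class="wp-block-preformatted"># Cross-check of dhyper: the formula written out with choose()
x &lt;- 2  # defective pieces in the sample
n &lt;- 5  # the size of my sample
M &lt;- 6  # defective pieces in the batch
N &lt;- 30 # the total number of pieces in my batch

# choose(a, b) is the number of ways of choosing b elements out of a
manual &lt;- choose(N - M, n - x) * choose(M, x) / choose(N, n)
manual

# TRUE if the hand-written formula agrees with dhyper
all.equal(manual, dhyper(x, M, N - M, n))</pre>



<p>Note the order of the arguments of <code>dhyper</code>: successes in the sample, successes in the population, failures in the population, size of the sample.</p>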



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Can an example with an urn and balls be missing?</h2>



<div class="wp-block-uagb-image aligncenter uagb-block-eb7e4992 wp-block-uagb-image--layout-default wp-block-uagb-image--effect-static wp-block-uagb-image--align-center"><figure class="wp-block-uagb-image__figure"><img decoding="async" srcset="https://www.gironi.it/blog/wp-content/uploads/2023/03/07bed749-7708-4f32-8b92-d46342b9f532-300x300.jpeg " src="https://www.gironi.it/blog/wp-content/uploads/2023/03/07bed749-7708-4f32-8b92-d46342b9f532-300x300.jpeg" alt="Hypergeometric distribution: drawing white or black balls from an urn." class="uag-image-2945" width="300" height="300" title="" loading="lazy"/></figure></div>



<p>Let&#8217;s look at another example: what is the probability that, drawing 4 balls without replacement from an urn with 10 white balls and 5 black ones, we get 3 white and 1 black? Since we are counting white balls, the white balls are our &#8220;successes&#8221;. So:</p>



<ul class="wp-block-list">
<li>x=3 Number of white balls drawn</li>



<li>n=4 Number of balls drawn</li>



<li>M=10 Number of white balls in the urn</li>



<li>N=15 Total number of balls</li>
</ul>



<p>We have seen that in R, it&#8217;s possible to use the <code>dhyper</code> function to calculate the probability of drawing 3 white balls and 1 black ball from the described urn.</p>



<p>Here&#8217;s the R code:</p>



<pre class="wp-block-preformatted"># Definition of the hypergeometric distribution parameters
x &lt;- 3 # Number of white balls drawn
n &lt;- 4 # Number of balls drawn
M &lt;- 10 # Number of white balls in the urn (the successes)
N &lt;- 15 # Total number of balls

# Probability calculation with the dhyper function
prob &lt;- dhyper(x, M, N - M, n)
prob</pre>



<p>The probability of drawing 3 white balls and 1 black ball is therefore 0.4395604 (that is, 120 &#215; 5 / 1365), or about 43.96%.</p>
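


<p>At the start we noted that the hypergeometric distribution tends towards the binomial when N is large. A rough way to see this is to keep the share of defective pieces of the first example fixed at 20% and let the batch grow. The batch sizes below are made up for illustration, and the names <code>hyper</code> and <code>binom</code> are introduced here only for this sketch.</p>



<pre class="wp-block-preformatted"># Same sample (n = 5, x = 2) and same share of defective pieces (20%),
# with larger and larger batches (hypothetical sizes)
x &lt;- 2
n &lt;- 5
N &lt;- c(30, 300, 3000, 30000)
M &lt;- N / 5 # defective pieces in each batch

hyper &lt;- dhyper(x, M, N - M, n) # sampling without replacement
binom &lt;- dbinom(x, n, 1 / 5)    # sampling with replacement

data.frame(N, hyper, binom, difference = abs(hyper - binom))</pre>



<p>The binomial value is 0.2048 whatever the size of the batch, and the hypergeometric values get closer to it as N grows.</p>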



<h2 class="wp-block-heading"><strong>Further Examination of the Hypergeometric Distribution</strong></h2>



<ul class="wp-block-list">
<li><a href="https://it.wikipedia.org/wiki/Distribuzione_ipergeometrica" target="_blank" rel="noreferrer noopener">Hypergeometric Distribution &#8211; Wikipedia</a></li>



<li><a href="https://www.webtutordimatematica.it/materie/statistica-e-probabilita/distribuzioni-di-probabilita-discrete/distribuzione-ipergeometrica" target="_blank" rel="noreferrer noopener">Hypergeometric Distribution &#8211; WebTutorDiMatematica.it</a></li>



<li><a href="https://www.okpedia.it/distribuzione-ipergeometrica" target="_blank" rel="noreferrer noopener">Hypergeometric Distribution &#8211; Okpedia</a></li>
</ul>
]]></content:encoded>
					
					<wfw:commentRss>https://www.gironi.it/blog/en/the-hypergeometric-distribution/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>The Negative Binomial Distribution (or Pascal Distribution)</title>
		<link>https://www.gironi.it/blog/en/the-negative-binomial-distribution-or-pascal-distribution/</link>
					<comments>https://www.gironi.it/blog/en/the-negative-binomial-distribution-or-pascal-distribution/#respond</comments>
		
		<dc:creator><![CDATA[paolo]]></dc:creator>
		<pubDate>Tue, 21 Mar 2023 13:25:59 +0000</pubDate>
				<category><![CDATA[probability]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[distribution]]></category>
		<guid isPermaLink="false">https://www.gironi.it/blog/?p=3272</guid>

					<description><![CDATA[The negative binomial distribution describes the number of trials needed to achieve a certain number of successes in a series of independent trials. For example, it could be used to calculate the probability that the third head arrives exactly on the fifth flip of a coin, assuming the coin is balanced and therefore the probability of getting heads &#8230; <a href="https://www.gironi.it/blog/en/the-negative-binomial-distribution-or-pascal-distribution/" class="more-link">Continue reading<span class="screen-reader-text"> "The Negative Binomial Distribution (or Pascal Distribution)"</span></a>]]></description>
										<content:encoded><![CDATA[
<p>The negative binomial distribution describes the number of trials needed to achieve a certain number of successes in a series of independent trials. For example, it could be used to calculate the probability that the third head arrives exactly on the fifth flip of a coin, assuming the coin is balanced and therefore the probability of getting heads on each flip is 50%.</p>



<p>The negative binomial distribution is useful in many fields, including statistics, economics, biology, and physics. And also in &#8220;<em>our</em>&#8221; SEO.</p>



<span id="more-3272"></span>


				<div class="wp-block-uagb-table-of-contents uagb-toc__align-left uagb-toc__columns-1  uagb-block-41c526a3      "
					data-scroll= "1"
					data-offset= "30"
					style=""
				>
				<div class="uagb-toc__wrap">
						<div class="uagb-toc__title">
							What we&#8217;ll discuss						</div>
																						<div class="uagb-toc__list-wrap ">
						<ol class="uagb-toc__list"><li class="uagb-toc__list"><a href="#defining-the-negative-binomial-distribution-or-pascal-distribution" class="uagb-toc-link__trigger">Defining the Negative Binomial Distribution (or Pascal Distribution)</a><li class="uagb-toc__list"><a href="#examples-of-using-the-negative-binomial-distribution" class="uagb-toc-link__trigger">Examples of Using the Negative Binomial Distribution</a><li class="uagb-toc__list"><a href="#differences-between-the-geometric-distribution-and-pascal-distribution" class="uagb-toc-link__trigger">Differences Between the Geometric Distribution and Pascal Distribution</a><li class="uagb-toc__list"><a href="#links-to-authoritative-resources-for-further-study" class="uagb-toc-link__trigger">Links to Authoritative Resources for Further Study</a></ol>					</div>
									</div>
				</div>
			


<h2 class="wp-block-heading">Defining the Negative Binomial Distribution (or Pascal Distribution)</h2>



<p class="has-light-gray-background-color has-background">The negative binomial distribution is a <strong>discrete probability distribution</strong> that describes the number of trials needed to achieve a certain number of successes in a series of independent trials.</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img fetchpriority="high" decoding="async" width="768" height="768" src="https://www.gironi.it/blog/wp-content/uploads/2023/03/distribuzione-di-Pascal.jpg" alt="Imaginary depiction of Blaise Pascal plotting the graph of a negative binomial distribution" class="wp-image-2902" srcset="https://www.gironi.it/blog/wp-content/uploads/2023/03/distribuzione-di-Pascal.jpg 768w, https://www.gironi.it/blog/wp-content/uploads/2023/03/distribuzione-di-Pascal-300x300.jpg 300w, https://www.gironi.it/blog/wp-content/uploads/2023/03/distribuzione-di-Pascal-150x150.jpg 150w" sizes="(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px" /></figure>
</div>


<p>Each trial succeeds with probability p, and we want to achieve a total of r successes. <strong>The negative binomial distribution provides the probability that the r-th success is achieved exactly on the n-th trial</strong>.</p>



<p>The quantities used in the negative binomial distribution are:</p>



<ul class="wp-block-list">
<li><strong>the probability of success p</strong>, a parameter that indicates the probability of achieving a success in a single trial.</li>



<li><strong>the desired number of successes r</strong>, a parameter that indicates the total number of successes we want to achieve.</li>



<li><strong>the number of trials needed n</strong>, the value taken by the random variable: the trial on which the r-th success is achieved (so n can be r, r+1, r+2, &#8230;).</li>
</ul>



<p>The negative binomial distribution is often denoted with the following notation:</p>



\(
X \sim NB(r,p) \\
\)



<p>where X indicates the number of trials needed to achieve r successes, and the symbol &#8220;~&#8221; means &#8220;distributed as&#8221;.</p>



<h2 class="wp-block-heading">Examples of Using the Negative Binomial Distribution</h2>



<p>The negative binomial distribution can be applied in various situations, for example:</p>



<ul class="wp-block-list">
<li>In marketing, to calculate the number of attempts needed to achieve a certain number of sales or conversions.</li>



<li>In biology, to calculate the number of attempts needed to achieve a certain number of successes in a series of experiments (for example, the number of attempts needed to isolate a specific bacterial strain).</li>



<li>In engineering, to calculate the number of repetitions needed to test the resistance of a material or structure.</li>
</ul>



<p>To calculate the probability that the r-th success is achieved exactly on the n-th trial, we can use the following formula:</p>



\(
P(X = n) = {n-1 \choose r-1} p^r (1-p)^{n-r} \\
\)



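<p>where X indicates the number of trials needed to achieve r successes, p indicates the probability of success in a single trial, and \({n-1 \choose r-1}\) is the binomial coefficient. The coefficient counts the ways of placing the first r-1 successes among the first n-1 trials, since the last trial must be the r-th success.</p>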
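


<p>The formula can be turned into a short R function. This is only a sketch: <code>pascal_pmf</code> is a name made up here, not a base R function, and the conversion rate of 0.3 is a made-up figure in the spirit of the marketing example. Base R offers <code>dnbinom</code>, which counts the failures before the r-th success rather than the total number of trials, so it has to be called with n-r.</p>



<pre class="wp-block-preformatted"># Probability that the r-th success arrives exactly on trial n
# (pascal_pmf is a hypothetical helper name)
pascal_pmf &lt;- function(n, r, p) {
  choose(n - 1, r - 1) * p^r * (1 - p)^(n - r)
}

# Made-up example: each attempt converts with probability 0.3;
# probability that the 2nd conversion arrives on the 4th attempt
pascal_pmf(n = 4, r = 2, p = 0.3)

# The same value with dnbinom, passing the number of failures n - r
dnbinom(4 - 2, size = 2, prob = 0.3)

# Summed over n = r, r + 1, ... the probabilities add up to 1
# (the sum is truncated at n = 200 here)
sum(pascal_pmf(n = 2:200, r = 2, p = 0.3))</pre>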



<p>For example, suppose we flip a fair coin (probability of success p=0.5) and want to calculate the probability that exactly 5 flips are needed to get 3 heads, that is, that the third head arrives on the fifth flip. With r=3 and n=5, the negative binomial distribution gives:</p>



\(
P(X = 5) = {4 \choose 2} \cdot 0.5^3 \cdot 0.5^2 = 6 \cdot 0.125 \cdot 0.25 = 0.1875 \\
\)



<p>This means that the probability of needing exactly 5 flips to get 3 heads is 18.75%. (This is not the probability of getting 3 heads anywhere in 5 flips, which is a binomial probability: \({5 \choose 3} \cdot 0.5^5 = 0.3125\).)</p>



<h2 class="wp-block-heading">Differences Between the Geometric Distribution and Pascal Distribution</h2>



<p>The negative binomial distribution and the <a href="https://www.gironi.it/blog/la-distribuzione-geometrica/" target="_blank" data-type="post" data-id="863" rel="noreferrer noopener">geometric distribution</a> are both discrete probability distributions used to model the number of trials needed to achieve a certain number of successes. However, the two distributions differ in the number of successes we are waiting for.</p>



<p>The geometric distribution is used to model the number of trials needed to achieve the first success in a sequence of independent identically distributed trials. For example, the number of tosses of a fair coin needed to get the first head can be modeled with a geometric distribution, where the probability of success is p=0.5 and the number of trials needed can take the values 1, 2, 3, &#8230;.</p>



<p>In more precise terms:</p>



<p class="has-light-gray-background-color has-background">The main difference between the <a href="https://www.gironi.it/blog/la-distribuzione-geometrica/" target="_blank" data-type="post" data-id="863" rel="noreferrer noopener">geometric distribution</a> and the Pascal distribution is that the geometric distribution represents the total number of attempts needed to achieve one success, while the Pascal distribution represents the total number of attempts needed to achieve the r-th success in a succession of independent and identically distributed Bernoulli experiments. (Some texts and software count instead the failures before the r-th success; the two definitions differ only by the shift r.)</p>



<p>In other words, the geometric distribution describes the time needed to achieve the first success, while the Pascal distribution describes the time needed to achieve a certain fixed number of successes. Moreover, the <strong>geometric distribution has only one parameter</strong> (the probability of success), while the <strong>Pascal distribution has two parameters</strong> (the desired number of successes and the probability of success).</p>



<p>The formulas for the two distributions are closely related: the geometric distribution is the special case of the negative binomial distribution with r=1. Setting r=1 in the negative binomial formula, the binomial coefficient equals 1 and we are left with \(P(X = n) = (1-p)^{n-1} p\).</p>



<p>For example, let&#8217;s consider the case of a fair coin with p=0.5. The probability that the first success arrives exactly on the third trial can be calculated with a geometric distribution:</p>



\(
P(X = 3) = (1-0.5)^2 \cdot 0.5 = 0.125 \\
\)



<p>where X indicates the number of trials needed to achieve the first success.</p>
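


<p>The calculation above can be reproduced in R. In this sketch the variable names are introduced for illustration only. Note that <code>dgeom</code> and <code>dnbinom</code> count the failures before the success, so the third trial corresponds to 2 failures; the last call treats the geometric distribution as a negative binomial with a single required success.</p>



<pre class="wp-block-preformatted">p &lt;- 0.5 # probability of success in a single trial
n &lt;- 3   # trial on which the first success arrives

# Formula written out by hand: n - 1 failures followed by one success
manual &lt;- (1 - p)^(n - 1) * p

# Geometric distribution: n - 1 failures before the first success
geom &lt;- dgeom(n - 1, prob = p)

# Negative binomial with one required success (size = 1)
nb_one &lt;- dnbinom(n - 1, size = 1, prob = p)

c(manual, geom, nb_one)</pre>



<p>All three values are 0.125.</p>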



<p>On the other hand, the probability that the third success arrives exactly on the fifth trial can be calculated with a negative binomial distribution:</p>



\(
P(X = 5) = {4 \choose 2} \cdot 0.5^3 \cdot 0.5^2 = 6 \cdot 0.125 \cdot 0.25 = 0.1875 \\
\)



<p>where X indicates the number of trials needed to achieve 3 successes.</p>



<h2 class="wp-block-heading">Links to Authoritative Resources for Further Study</h2>



<ul class="wp-block-list">
<li><a href="https://www.sciencedirect.com/topics/mathematics/pascal-distribution" target="_blank" rel="noreferrer noopener">Pascal Distribution &#8211; an overview | ScienceDirect Topics</a></li>



<li><a href="http://math.clarku.edu/~djoyce/ma217/distributions.pdf" target="_blank" rel="noreferrer noopener">http://math.clarku.edu/~djoyce/ma217/distributions.pdf</a> (pdf file)</li>
</ul>
]]></content:encoded>
					
					<wfw:commentRss>https://www.gironi.it/blog/en/the-negative-binomial-distribution-or-pascal-distribution/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
