{"id":3872,"date":"2026-06-25T09:50:19","date_gmt":"2026-06-25T08:50:19","guid":{"rendered":"https:\/\/www.gironi.it\/blog\/?p=3872"},"modified":"2026-06-25T09:50:20","modified_gmt":"2026-06-25T08:50:20","slug":"bayesian-conversion-rate-estimation","status":"publish","type":"post","link":"https:\/\/www.gironi.it\/blog\/en\/bayesian-conversion-rate-estimation\/","title":{"rendered":"Bayesian Conversion Rate Estimation: how much can we trust limited data"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">In the article on the <a href=\"https:\/\/www.gironi.it\/blog\/en\/bayesian-statistics-how-to-learn-from-data-one-step-at-a-time\/\">foundations of Bayesian statistics<\/a>, we saw how Bayesian updating works through simulation: generate samples from the prior, simulate data, filter. An intuitive method, but one that runs into a practical limit as soon as data becomes even slightly numerous.<br> In this article we move to the elegant analytical solution that the Bayesian approach provides for one of the most common problems in marketing analysis: estimating a conversion rate with limited data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The problem always starts the same way. A small e-commerce store has collected 23 conversions out of 412 sessions. The raw rate is 23\/412 \u2248 5.6%. A seemingly precise number. But how much do we trust it? We could be looking at the true 3% or the true 9% \u2014 with that sample, we simply do not know. 
The point estimate &#8220;5.6%&#8221; says nothing about its own uncertainty.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p class=\"wp-block-paragraph\"><strong>What we will cover<\/strong>:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>The Beta-Binomial model: why Beta is the natural distribution for a conversion rate<\/li><li>Non-informative prior: letting the data speak<\/li><li>Informative prior: using historical data without cheating<\/li><li>Today&#8217;s posterior is tomorrow&#8217;s prior<\/li><li>Try it yourself<\/li><li>Further reading<\/li><\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">From single click to rate: the Beta-Binomial model<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Each session is a binary event: the user converts or does not.<br> With \\( n \\) sessions and \\( k \\) conversions, the generative mechanism is binomial. The parameter we want to estimate \u2014 the true conversion rate \\( \\theta \\) \u2014 is a proportion: a value between 0 and 1.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">When the prior on \\( \\theta \\) is a <a href=\"https:\/\/www.gironi.it\/blog\/en\/the-beta-distribution-explained-simply\/\">Beta distribution<\/a> and the data are binomial, something very convenient happens: <strong>the posterior is also a Beta distribution<\/strong>. This is called a <em>conjugate<\/em> prior, and it means Bayesian updating reduces to a simple arithmetic operation on the parameters.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The updating rule is: if the prior is Beta(\u03b1, \u03b2), after observing \\( k \\) conversions out of \\( n \\) sessions the posterior is:<\/p>\n\n\n\n\\( \\mathrm{Beta}(\\alpha + k,\\ \\beta + (n - k)) \\)\n\n\n\n<p class=\"wp-block-paragraph\">In plain words: we add the observed conversions to \u03b1 and the observed failures to \u03b2. 
The prior Beta(\u03b1, \u03b2) encodes in \u03b1 the &#8220;conversions already seen&#8221; (or an equivalent belief) and in \u03b2 the &#8220;non-conversions&#8221;. Each new observation increments one of the two counters: \u03b1 for a conversion, \u03b2 for a non-conversion.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Non-informative prior: letting the data speak<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The simplest neutral prior for a proportion is the uniform distribution on [0, 1], which corresponds to Beta(1, 1): all values of the rate are considered equally plausible before seeing any data.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Our case: 23 conversions out of 412 sessions (389 non-conversions).<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We calculate the posterior in R:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Observed data\nconv &lt;- 23; sess &lt;- 412; nonconv &lt;- sess - conv\n\n# NON-INFORMATIVE prior: uniform Beta(1,1)\na0 &lt;- 1; b0 &lt;- 1\na1 &lt;- a0 + conv; b1 &lt;- b0 + nonconv      # posterior Beta(24, 390)\n\ncat(\"Non-informative: mean =\", round(a1\/(a1+b1), 4), \"\\n\")\ncat(\"  95% CI =\", round(qbeta(c(.025,.975), a1, b1), 4), \"\\n\")<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Output: mean = 0.058, 95% CI = [0.0376, 0.0824].<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The posterior Beta(24, 390) has mean 5.8% and a 95% credible interval between 3.8% and 8.2%.<\/strong><br> The Bayesian credible interval is not an abstract statistical exercise: given the model and the prior, there is a 95% probability that the true conversion rate lies between 3.8% and 8.2%. This is not a frequency over infinite repetitions but a direct probability statement about the parameter.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With 412 sessions, the uncertainty is still appreciable: the interval is about 4.5 percentage points wide. 
The point estimate 5.6% was misleading in its precision.<\/p>\n\n\n\n<p class=\"has-light-gray-background-color has-background wp-block-paragraph\">A note of caution: the Bayesian credible interval and the frequentist confidence interval often have similar numbers but profoundly different meanings. The frequentist 95% is a property of the procedure (&#8220;repeating the experiment many times, about 95% of the intervals would contain the true parameter&#8221;); the Bayesian 95% is a direct statement about the parameter in the specific case we are analysing.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Informative prior: using historical data without cheating<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The non-informative prior is the honest starting point when we know nothing. But often we do know something: years of campaigns, sector history, category benchmarks.<br> Our e-commerce store has four seasons of history with an average conversion rate around 4%. How do we translate this knowledge into a prior?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Beta(8, 192) distribution has mean exactly 8\/(8+192) = 4% and, because \u03b1+\u03b2 = 200, a concentration equivalent to &#8220;trusting&#8221; the history as much as 200 fictitious past sessions. 
It is not an arbitrary number: it is a declared and verifiable choice.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We calculate the informative posterior in R:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># INFORMATIVE prior from history: mean ~4% -&gt; Beta(8, 192)\na0i &lt;- 8; b0i &lt;- 192\na1i &lt;- a0i + conv; b1i &lt;- b0i + nonconv  # posterior Beta(31, 581)\n\ncat(\"Informative: mean =\", round(a1i\/(a1i+b1i), 4), \"\\n\")\ncat(\"  95% CI =\", round(qbeta(c(.025,.975), a1i, b1i), 4), \"\\n\")<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Output: mean = 0.0507, 95% CI = [0.0347, 0.0694].<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The informative posterior Beta(31, 581) gives a mean of 5.1% and a 95% credible interval between 3.5% and 6.9%.<\/strong><br> Two things to notice. First: the mean drops slightly from 5.8% to 5.1%, because the prior &#8220;pulls&#8221; the estimate toward the historical 4%. The posterior mean is in fact a weighted average of the prior mean and the raw rate, with weights 200 and 412: (8 + 23)\/(200 + 412) \u2248 5.1%. Second: the interval narrows from 4.4 to 3.4 percentage points, since the history acts as additional information.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">With limited data, the informative prior helps: it adds information where data alone is insufficient. With many data points \u2014 thousands of sessions \u2014 the prior gets <em>overwhelmed<\/em> by the data and the difference between informative and non-informative priors becomes negligible. This is a fundamental feature of Bayesian inference: <strong>the prior matters when data is scarce; data always wins in the end<\/strong>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Today&#8217;s posterior is tomorrow&#8217;s prior<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">The most practical strength of the Bayesian approach is sequential updating. After one month, new data arrives: 15 conversions from 300 additional sessions. 
We do not need to start over \u2014 the posterior we just calculated <em>becomes<\/em> the new prior.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We update in R:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># New data: 15 conversions from 300 additional sessions\nconv2 &lt;- 15; sess2 &lt;- 300\na2 &lt;- a1i + conv2; b2 &lt;- b1i + (sess2 - conv2)   # Beta(46, 866)\n\ncat(\"After update: mean =\", round(a2\/(a2+b2), 4), \"\\n\")\ncat(\"  95% CI =\", round(qbeta(c(.025,.975), a2, b2), 4), \"\\n\")<\/code><\/pre>\n\n\n\n<p class=\"wp-block-paragraph\">Output: mean = 0.0504, 95% CI = [0.0372, 0.0655].<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Three stages compared:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>Stage<\/th><th>Accumulated data<\/th><th>Mean<\/th><th>95% CI<\/th><\/tr><\/thead><tbody><tr><td>Pure prior (before any data)<\/td><td>\u2014<\/td><td>4.0%<\/td><td>[1.8%, 7.1%]<\/td><\/tr><tr><td>After first month<\/td><td>23\/412<\/td><td>5.1%<\/td><td>[3.5%, 6.9%]<\/td><\/tr><tr><td>After second month<\/td><td>38\/712<\/td><td>5.0%<\/td><td>[3.7%, 6.6%]<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>The credible interval has narrowed<\/strong> at each stage: 5.3 percentage points for the pure prior, 3.4 after the first month, 2.8 after the update. The mean remained stable: the new data confirms the previous estimate instead of shifting it, and uncertainty decreases as expected.<br> This is the real operational advantage: there is no need to wait for a &#8220;large enough&#8221; sample to accumulate before making any estimate. We start from an uncertain estimate and refine it progressively, with each new data point.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Try it yourself<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">A lead generation website has a historical conversion rate around 2%. 
After an optimisation campaign, 8 conversions are observed from 150 sessions.<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Build an <strong>informative prior<\/strong> reflecting the historical 2%: try Beta(4, 196) \u2014 it has mean exactly 2%.<\/li><li>Calculate the <strong>posterior<\/strong> after 8 conversions from 150 sessions.<\/li><li>Calculate the <strong>95% credible interval<\/strong>.<\/li><li>Now try a <strong>non-informative<\/strong> Beta(1, 1) prior: does the posterior change much? Why?<\/li><\/ol>\n\n\n\n<p class=\"wp-block-paragraph\">Hint: the formula is always the same \u2014 <code>qbeta(c(.025, .975), a0 + conv, b0 + nonconv)<\/code>. The only thing that changes is the starting point.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<p class=\"wp-block-paragraph\">What we have built so far is the estimation of a rate for a single variant. The next step is comparing two variants \u2014 a control page and a modified page \u2014 and calculating the Bayesian probability that one beats the other. That is exactly what we will do in the next article: Bayesian A\/B testing.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Further reading<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">If you want to understand how Bayesian reasoning enters practical decisions \u2014 from market forecasting to uncertainty estimation in real data \u2014 <a href=\"https:\/\/www.amazon.it\/dp\/0141975652?tag=consulenzeinf-21\" rel=\"nofollow sponsored noopener\" target=\"_blank\"><em>The Signal and the Noise<\/em><\/a> by Nate Silver is the book I recommend. Silver devotes explicit chapters to Bayesian updating, showing it in concrete contexts (weather forecasting, politics, sports) that make the idea of &#8220;updating beliefs with new data&#8221; immediately intuitive. 
Rigorous but written like a story, it is a rare kind of book that leaves you thinking differently about uncertainty.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the article on the foundations of Bayesian statistics, we saw how Bayesian updating works through simulation: generate samples from the prior, simulate data, filter. An intuitive method, but one that runs into a practical limit as soon as data becomes even slightly numerous. In this article we move to the elegant analytical solution that &hellip; <a href=\"https:\/\/www.gironi.it\/blog\/en\/bayesian-conversion-rate-estimation\/\" class=\"more-link\">Read more<span class=\"screen-reader-text\"> &#8220;Bayesian Conversion Rate Estimation: how much can we trust limited data&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","footnotes":""},"categories":[161],"tags":[],"class_list":["post-3872","post","type-post","status-publish","format-standard","hentry","category-statistics"],"lang":"en","translations":{"en":3872,"it":3871},"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false,"post-thumbnail":false},"uagb_author_info":{"display_name":"Paolo Gironi","author_link":"https:\/\/www.gironi.it\/blog\/author\/autore-articoli\/"},"uagb_comment_info":0,"uagb_excerpt":"In the article on the foundations of Bayesian statistics, we saw how Bayesian updating works through simulation: generate samples from the prior, simulate data, filter. An intuitive method, but one that runs into a practical limit as soon as data becomes even slightly numerous. 
In this article we move to the elegant analytical solution that&hellip;","_links":{"self":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3872","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/comments?post=3872"}],"version-history":[{"count":2,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3872\/revisions"}],"predecessor-version":[{"id":3879,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3872\/revisions\/3879"}],"wp:attachment":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/media?parent=3872"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/categories?post=3872"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/tags?post=3872"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}