{"id":3495,"date":"2026-03-06T09:07:28","date_gmt":"2026-03-06T08:07:28","guid":{"rendered":"https:\/\/www.gironi.it\/blog\/?p=3495"},"modified":"2026-03-06T09:09:28","modified_gmt":"2026-03-06T08:09:28","slug":"ab-test-sample-size-calculator","status":"publish","type":"post","link":"https:\/\/www.gironi.it\/blog\/en\/ab-test-sample-size-calculator\/","title":{"rendered":"A\/B Test Sample Size Calculator"},"content":{"rendered":"<p>One of the most common questions when planning an <strong>A\/B test<\/strong> is: <em>how many users do I need to get a reliable result?<\/em> The answer is not a magic number: it depends on the size of the effect we want to detect, the baseline conversion rate, and the level of statistical certainty we require.<\/p>\n<p>Calculating the <strong>sample size<\/strong> in advance is essential to avoid two classic mistakes: stopping the test too early and declaring a winner that does not exist, or letting it run too long, wasting traffic and time. In other words, it is about finding the right balance between resources and rigour.<\/p>\n<p>If you have read the article on <a href=\"https:\/\/www.gironi.it\/blog\/en\/guide-to-statistical-tests-for-a-b-analysis\/\">A\/B Testing<\/a>, you will recall that <strong>power analysis<\/strong> is the statistical method that lets us determine this threshold. 
And if you have studied <a href=\"https:\/\/www.gironi.it\/blog\/en\/confidence-intervals-what-they-are-how-to-calculate-them-and-what-they-do-not-mean\/\">confidence intervals<\/a>, you already know that significance level and test power are not abstract concepts but operational levers that directly affect sample size.<\/p>\n<p><!--more--><\/p>\n<p>The calculator below automates this process: simply enter your test parameters to instantly get the number of observations needed per variant and, if you know your daily traffic, an estimate of the test duration in days.<\/p>\n<div style=\"border: 1px solid #ccc;padding: 1.2em 1.5em;margin: 1.5em 0;border-radius: 6px\">\n<h3 style=\"margin-top: 0\">What We&#8217;ll Cover<\/h3>\n<ul>\n<li><a href=\"#calculator\">The calculator<\/a><\/li>\n<li><a href=\"#formula\">The formula: how the calculation works<\/a><\/li>\n<li><a href=\"#how-to-use\">How to use the calculator<\/a><\/li>\n<li><a href=\"#further-reading\">Further reading<\/a><\/li>\n<\/ul>\n<\/div>\n<hr \/>\n<h2 id=\"calculator\">The calculator<\/h2>\n<p>Enter the parameters of your A\/B test and the calculator will instantly return the required sample size.<\/p>\n<style>\n.ss-calc{max-width:620px;margin:2em auto;padding:1.5em 2em;background:#f8f8f8;border:1px solid #ddd;border-radius:8px;font-family:inherit}\n.ss-calc h3{margin:0 0 1em;color:#333;font-size:1.2em}\n.ss-calc label{display:block;margin:0.8em 0 0.3em;font-weight:600;color:#333;font-size:0.95em}\n.ss-calc .ss-hint{font-size:0.82em;color:#777;margin:0.15em 0 0}\n.ss-calc input[type=number],.ss-calc select{width:100%;padding:8px 10px;border:1px solid #ccc;border-radius:4px;font-size:1em;box-sizing:border-box;background:#fff}\n.ss-calc input[type=number]:focus,.ss-calc select:focus{outline:none;border-color:#0073aa;box-shadow:0 0 0 2px rgba(0,115,170,0.15)}\n.ss-calc .ss-row{display:flex;gap:1.2em}\n.ss-calc .ss-col{flex:1}\n.ss-calc .ss-result{margin-top:1.5em;padding:1.2em;background:#fff;border:2px 
solid #2ecc71;border-radius:6px;text-align:center}\n.ss-calc .ss-result .ss-big{font-size:2em;font-weight:700;color:#2ecc71;display:block;margin:0.2em 0}\n.ss-calc .ss-result .ss-label{font-size:0.85em;color:#666}\n.ss-calc .ss-result .ss-total{font-size:1.1em;color:#333;margin-top:0.5em}\n.ss-calc .ss-result .ss-days{font-size:1em;color:#0073aa;margin-top:0.4em;font-weight:600}\n.ss-calc .ss-warn{color:#e74c3c;font-size:0.85em;margin-top:0.5em;display:none}\n@media(max-width:520px){.ss-calc .ss-row{flex-direction:column;gap:0}.ss-calc{padding:1em 1.2em}}\n<\/style>\n<div class=\"ss-calc\" id=\"ssCalcEn\">\n<h3>Sample Size Calculator<\/h3>\n<p><label for=\"ssBaseEn\">Baseline conversion rate (%)<\/label><br \/>\n<input type=\"number\" id=\"ssBaseEn\" value=\"5\" min=\"0.1\" max=\"100\" step=\"0.1\"><\/p>\n<p class=\"ss-hint\">The current conversion rate of the control variant<\/p>\n<p><label for=\"ssMdeEn\">Minimum detectable effect &mdash; MDE (% relative)<\/label><br \/>\n<input type=\"number\" id=\"ssMdeEn\" value=\"20\" min=\"1\" max=\"100\" step=\"1\"><\/p>\n<p class=\"ss-hint\">The smallest relative improvement we consider meaningful (e.g. 
20% = from 5% to 6%)<\/p>\n<div class=\"ss-row\">\n<div class=\"ss-col\">\n<label for=\"ssAlphaEn\">Significance level (&alpha;)<\/label><br \/>\n<select id=\"ssAlphaEn\"><option value=\"0.01\">0.01 (99%)<\/option><option value=\"0.05\" selected>0.05 (95%)<\/option><option value=\"0.10\">0.10 (90%)<\/option><\/select>\n<\/div>\n<div class=\"ss-col\">\n<label for=\"ssPowerEn\">Power (1&minus;&beta;)<\/label><br \/>\n<select id=\"ssPowerEn\"><option value=\"0.80\" selected>0.80<\/option><option value=\"0.85\">0.85<\/option><option value=\"0.90\">0.90<\/option><option value=\"0.95\">0.95<\/option><\/select>\n<\/div>\n<\/div>\n<p><label for=\"ssTrafficEn\">Daily traffic <span style=\"font-weight:400;color:#999\">(optional)<\/span><\/label><br \/>\n<input type=\"number\" id=\"ssTrafficEn\" value=\"\" min=\"1\" step=\"1\" placeholder=\"e.g. 1000\"><\/p>\n<p class=\"ss-hint\">Total daily visitors to estimate test duration<\/p>\n<div class=\"ss-result\" id=\"ssResultEn\">\n<span class=\"ss-label\">Sample size per variant<\/span><br \/>\n<span class=\"ss-big\" id=\"ssNEn\">&mdash;<\/span><\/p>\n<div class=\"ss-total\" id=\"ssTotalEn\"><\/div>\n<div class=\"ss-days\" id=\"ssDaysEn\"><\/div>\n<\/div>\n<div class=\"ss-warn\" id=\"ssWarnEn\"><\/div>\n<\/div>\n<p><script>\n(function(){\n  function qnorm(p){\n    if(p<=0||p>=1)return NaN;\n    if(p<0.5)return -qnorm(1-p);\n    var t=Math.sqrt(-2*Math.log(1-p));\n    var c0=2.515517,c1=0.802853,c2=0.010328;\n    var d1=1.432788,d2=0.189269,d3=0.001308;\n    return t-(c0+c1*t+c2*t*t)\/(1+d1*t+d2*t*t+d3*t*t*t);\n  }\n  function calcSS(){\n    var base=parseFloat(document.getElementById('ssBaseEn').value);\n    var mde=parseFloat(document.getElementById('ssMdeEn').value);\n    var alpha=parseFloat(document.getElementById('ssAlphaEn').value);\n    var power=parseFloat(document.getElementById('ssPowerEn').value);\n    var traffic=document.getElementById('ssTrafficEn').value;\n    var warn=document.getElementById('ssWarnEn');\n    
warn.style.display='none';\n    if(isNaN(base)||isNaN(mde)||base<=0||base>100||mde<=0||mde>100){\n      document.getElementById('ssNEn').innerHTML='&mdash;';\n      document.getElementById('ssTotalEn').textContent='';\n      document.getElementById('ssDaysEn').textContent='';\n      return;\n    }\n    var p1=base\/100;\n    var p2=p1*(1+mde\/100);\n    if(p2>1){\n      warn.textContent='Warning: with these values the variant conversion rate would exceed 100%.';\n      warn.style.display='block';\n      document.getElementById('ssNEn').innerHTML='&mdash;';\n      document.getElementById('ssTotalEn').textContent='';\n      document.getElementById('ssDaysEn').textContent='';\n      return;\n    }\n    var za=qnorm(1-alpha\/2);\n    var zb=qnorm(power);\n    var diff=p1-p2;\n    var n=Math.ceil((Math.pow(za+zb,2)*(p1*(1-p1)+p2*(1-p2)))\/(diff*diff));\n    document.getElementById('ssNEn').textContent=n.toLocaleString('en-US');\n    document.getElementById('ssTotalEn').textContent='Total (2 variants): '+(n*2).toLocaleString('en-US')+' observations';\n    if(traffic && parseInt(traffic)>0){\n      var days=Math.ceil((n*2)\/parseInt(traffic));\n      document.getElementById('ssDaysEn').textContent='Estimated duration: about '+days+' days';\n    }else{\n      document.getElementById('ssDaysEn').textContent='';\n    }\n  }\n  ['ssBaseEn','ssMdeEn','ssAlphaEn','ssPowerEn','ssTrafficEn'].forEach(function(id){\n    document.getElementById(id).addEventListener('input',calcSS);\n    document.getElementById(id).addEventListener('change',calcSS);\n  });\n  calcSS();\n})();\n<\/script><\/p>\n<hr \/>\n<h2 id=\"formula\">The formula: how the calculation works<\/h2>\n<p>The calculator uses the standard formula for comparing two proportions with a <strong>two-tailed z-test<\/strong>. Let us walk through it step by step.<\/p>\n<p>We start with the parameters we enter:<\/p>\n<ul>\n<li><strong>p<sub>1<\/sub><\/strong>: the baseline conversion rate (control), expressed as a proportion. 
If our CR is 5%, then p<sub>1<\/sub> = 0.05.<\/li>\n<li><strong>p<sub>2<\/sub><\/strong>: the expected conversion rate for the variant. If the minimum detectable effect (MDE) is 20% relative, then p<sub>2<\/sub> = p<sub>1<\/sub> &times; (1 + MDE\/100) = 0.05 &times; 1.20 = 0.06.<\/li>\n<li><strong>&alpha;<\/strong>: the significance level, i.e. the probability of declaring an effect when there is none (Type I error). With &alpha; = 0.05 we work at 95% confidence.<\/li>\n<li><strong>1 &minus; &beta;<\/strong>: the power of the test, i.e. the probability of detecting an effect when it actually exists. With power 0.80, we have an 80% chance of catching the effect.<\/li>\n<\/ul>\n<p>The formula is:<\/p>\n\\( n = \\frac{\\left[z_{\\alpha\/2} + z_{\\beta}\\right]^2 \\cdot \\left[p_1(1-p_1) + p_2(1-p_2)\\right]}{(p_1 - p_2)^2} \\)\n<p>Where z<sub>&alpha;\/2<\/sub> and z<sub>&beta;<\/sub> are the <strong>quantiles of the standard normal distribution<\/strong>. For the most common values:<\/p>\n<ul>\n<li>&alpha; = 0.05 &rarr; z<sub>&alpha;\/2<\/sub> = 1.96<\/li>\n<li>&alpha; = 0.01 &rarr; z<sub>&alpha;\/2<\/sub> = 2.576<\/li>\n<li>&beta; = 0.20 (power 0.80) &rarr; z<sub>&beta;<\/sub> = 0.842<\/li>\n<li>&beta; = 0.10 (power 0.90) &rarr; z<sub>&beta;<\/sub> = 1.282<\/li>\n<\/ul>\n<p><strong>Worked example.<\/strong> Suppose we have a baseline conversion rate of 3% and we want to detect a 20% relative increase (i.e. 
going from 3% to 3.6%), with &alpha; = 0.05 and power = 0.80:<\/p>\n<ul>\n<li>p<sub>1<\/sub> = 0.03, p<sub>2<\/sub> = 0.036<\/li>\n<li>z<sub>&alpha;\/2<\/sub> = 1.96, z<sub>&beta;<\/sub> = 0.842<\/li>\n<li>Numerator: (1.96 + 0.842)<sup>2<\/sup> &times; [0.03 &times; 0.97 + 0.036 &times; 0.964] &asymp; 7.849 &times; 0.0638 &asymp; 0.5008 (the squared term uses the unrounded z-values 1.95996 and 0.84162; the rounded values shown would give 7.851)<\/li>\n<li>Denominator: (0.03 &minus; 0.036)<sup>2<\/sup> = 0.000036<\/li>\n<li>n = 0.5008 \/ 0.000036 &asymp; <strong>13,911 per variant<\/strong><\/li>\n<\/ul>\n<p>So to detect a 20% relative effect on a 3% CR, we need roughly <strong>13,900 observations per variant<\/strong> (nearly 28,000 in total). These numbers are worth reflecting on: if our site gets 500 visitors a day, the test will take about 56 days (27,822 \/ 500, rounded up). This is one of the reasons why, in practice, most A\/B tests on medium-traffic sites take weeks, not days.<\/p>\n<hr \/>\n<h2 id=\"how-to-use\">How to use the calculator<\/h2>\n<p><strong>How to choose the MDE.<\/strong> The minimum detectable effect is the trickiest parameter. Rather than asking &#8220;how much would we like the metric to improve&#8221;, we should ask: <em>what is the smallest improvement that would justify the effort of implementing the change?<\/em> An MDE of 5% relative requires enormous samples; an MDE of 50% is easy to detect but rarely realistic. The 10&ndash;30% range is a good starting point for most conversion rate tests.<\/p>\n<p>An important detail: the MDE in the calculator is <strong>relative<\/strong>, not absolute. An MDE of 20% on a baseline CR of 5% means we are looking to detect a shift from 5% to 6% (one absolute percentage point, but 20% of the starting value).<\/p>\n<p><strong>How to estimate daily traffic.<\/strong> Enter the traffic of the pages involved in the test, not the total site traffic. If the test is on the checkout page and it receives 300 visits per day, the correct value is 300. 
You can get this figure from your analytics tool (GA4, Matomo, or similar) by averaging the last 30 days to smooth out daily fluctuations.<\/p>\n<hr \/>\n<h3 id=\"further-reading\">You might also like<\/h3>\n<ul>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/guide-to-statistical-tests-for-a-b-analysis\/\">A\/B Testing: A Guide to Statistical Tests for A\/B Analysis<\/a><\/li>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/confidence-intervals-what-they-are-how-to-calculate-them-and-what-they-do-not-mean\/\">Confidence Intervals<\/a><\/li>\n<li><a href=\"https:\/\/www.gironi.it\/blog\/en\/hypothesis-testing-a-step-by-step-guide\/\">Hypothesis Testing<\/a><\/li>\n<\/ul>\n<hr \/>\n<h3>Further reading<\/h3>\n<p>The most comprehensive reference on the rigorous design of online experiments is: <a href=\"https:\/\/www.amazon.com\/dp\/1108724264\" rel=\"nofollow sponsored noopener\" target=\"_blank\"><em>Trustworthy Online Controlled Experiments<\/em><\/a> by Ron Kohavi, Diane Tang and Ya Xu. It covers sample size, power analysis and much more, drawing on decades of practical experience at Microsoft and Google.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the most common questions when planning an A\/B test is: how many users do I need to get a reliable result? The answer is not a magic number: it depends on the size of the effect we want to detect, the baseline conversion rate, and the level of statistical certainty we require. 
Calculating &hellip; <a href=\"https:\/\/www.gironi.it\/blog\/en\/ab-test-sample-size-calculator\/\" class=\"more-link\">Read more<span class=\"screen-reader-text\"> &#8220;A\/B Test Sample Size Calculator&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_uag_custom_page_level_css":"","footnotes":""},"categories":[161],"tags":[],"class_list":["post-3495","post","type-post","status-publish","format-standard","hentry","category-statistics"],"lang":"en","translations":{"en":3495,"it":3492},"uagb_featured_image_src":{"full":false,"thumbnail":false,"medium":false,"medium_large":false,"large":false,"1536x1536":false,"2048x2048":false,"post-thumbnail":false},"uagb_author_info":{"display_name":"paolo","author_link":"https:\/\/www.gironi.it\/blog\/author\/paolo\/"},"uagb_comment_info":0,"uagb_excerpt":"One of the most common questions when planning an A\/B test is: how many users do I need to get a reliable result? The answer is not a magic number: it depends on the size of the effect we want to detect, the baseline conversion rate, and the level of statistical certainty we require. 
Calculating&hellip;","_links":{"self":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3495","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/comments?post=3495"}],"version-history":[{"count":1,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3495\/revisions"}],"predecessor-version":[{"id":3496,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/posts\/3495\/revisions\/3496"}],"wp:attachment":[{"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/media?parent=3495"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/categories?post=3495"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.gironi.it\/blog\/wp-json\/wp\/v2\/tags?post=3495"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}