A/B Test Sample Size Calculator

One of the most common questions when planning an A/B test is: how many users do I need to get a reliable result? The answer is not a magic number: it depends on the size of the effect we want to detect, the baseline conversion rate, and the level of statistical certainty we require.

Calculating the sample size in advance is essential to avoid two classic mistakes: stopping the test too early and declaring a winner that does not exist, or letting it run too long, wasting traffic and time. In other words, it is about finding the right balance between resources and rigour.

If you have read the article on A/B Testing, you will recall that power analysis is the statistical method that lets us determine this threshold. And if you have studied confidence intervals, you already know that significance level and test power are not abstract concepts but operational levers that directly affect sample size.
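To make these levers concrete, here is a minimal Python sketch based on the standard normal-approximation formula for comparing two proportions. It is an illustration under stated assumptions, not necessarily the method used in the full article: the function name `sample_size_per_variant` and the example inputs (10% baseline, 2-point absolute lift, 5% two-sided significance, 80% power) are ours.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.80):
    """Approximate users needed in EACH variant of a two-sided test.

    baseline: conversion rate of the control (e.g. 0.10)
    mde:      minimum detectable effect, as an absolute difference (e.g. 0.02)
    alpha:    significance level (probability of a false positive)
    power:    probability of detecting the effect if it really exists
    """
    if mde == 0:
        raise ValueError("mde must be non-zero")
    p1 = baseline
    p2 = baseline + mde
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("both conversion rates must lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile matching the power

    p_bar = (p1 + p2) / 2  # pooled rate under the null hypothesis
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Assumed example inputs: 10% baseline, detect an absolute lift of 2 points.
print(sample_size_per_variant(baseline=0.10, mde=0.02))
```

Halving the detectable effect roughly quadruples the required sample, which is why the choice of minimum detectable effect usually dominates the planning.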

Understanding the Basics of Machine Learning: A Beginner’s Guide

Introduction

Machine Learning is changing the way we see the world around us. From weather prediction to medical diagnosis, from content recommendations on streaming platforms to financial fraud detection, it is increasingly present in our daily lives.

But what exactly is it, and how does it work? In this post, we will explore the fundamental concepts of Machine Learning and see how it can be used to solve real-world problems. We will also look at how to get started with Machine Learning, what resources are available, and how to use this technology to improve both work and everyday life.

The Gini Index: What It Is, Why It Matters, and How to Compute It in R

The Gini coefficient measures the degree of inequality in a distribution and is most commonly used to quantify income inequality.

These few words alone are enough to grasp the extraordinary importance of this index for economic and political studies, and why it is worth getting to know it a little more closely.
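As a quick illustration of what the index measures, here is a minimal sketch of the sorted-values formula for non-negative data. The full article works in R; this version is in Python, and the function name `gini` and the sample incomes are assumptions made for the example.

```python
def gini(values):
    """Gini coefficient of non-negative values: 0 means perfect equality.

    With the values sorted in ascending order x_1 <= ... <= x_n:
        G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one value and a positive total")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Invented incomes, for illustration only.
print(gini([20, 20, 20, 20]))  # everyone earns the same
print(gini([0, 0, 0, 80]))     # one person earns everything
```

For a finite sample the maximum is (n - 1) / n rather than exactly 1, so the second call approaches 1 only as the number of people grows.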

Contingency Tables and Conditional Probability

Contingency tables are used to evaluate the relationship between two categorical (qualitative) variables. They are also called two-way tables or cross-tabulations.

Looking for relationships between two categorical variables is a very common goal for researchers. Think, for example, of the classic question marketers ask: who is more likely to buy certain product categories, younger or older people, men or women…
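The marketing question above can be sketched in a few lines of Python: count each combination of categories, then divide a cell by its row total to get a conditional probability. The data, the category labels, and the helper name `conditional_probability` are all invented for illustration.

```python
from collections import Counter

# Hypothetical survey data: (age_group, purchase) pairs, invented for this example.
observations = (
    [("young", "buys")] * 30
    + [("young", "does not buy")] * 70
    + [("older", "buys")] * 45
    + [("older", "does not buy")] * 55
)

# The contingency table: one count per (age_group, purchase) cell.
table = Counter(observations)


def conditional_probability(table, outcome, given):
    """P(outcome | given): the cell count divided by its row total."""
    row_total = sum(count for (group, _), count in table.items() if group == given)
    if row_total == 0:
        raise ValueError(f"no observations for group {given!r}")
    return table[(given, outcome)] / row_total


for group in ("young", "older"):
    p = conditional_probability(table, "buys", group)
    print(f"P(buys | {group}) = {p:.2f}")
```

Comparing the two conditional probabilities is the first informal check for an association; a formal test would come afterwards.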

The Poisson Distribution, Explained Simply

The Poisson distribution is a discrete probability distribution that describes the number of events occurring in a fixed interval of time or region of space, assuming the events occur independently and at a constant average rate.

It is useful for modelling how many events are likely to occur within a given time horizon, such as the number of customers entering a shop in the next hour or the number of pageviews on a website in the next minute.
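As a small worked example, the probability of observing exactly k events when the average is λ per interval is λ^k · e^(−λ) / k!. The sketch below evaluates it in Python; the function name `poisson_pmf` and the average of 4 customers per hour are assumptions chosen for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number per interval is lam."""
    if k < 0 or lam <= 0:
        raise ValueError("k must be >= 0 and lam must be > 0")
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Assumed scenario: a shop that sees 4 customers per hour on average.
average_per_hour = 4
for customers in range(0, 9):
    probability = poisson_pmf(customers, average_per_hour)
    print(f"P({customers} customers in the next hour) = {probability:.3f}")
```

The probabilities peak around the average and tail off on either side, and summed over all possible counts they add up to 1.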

Siméon-Denis Poisson
