A Guide To Statistical Significance In Conversion Rate Optimization

OK, just to warn you: today’s post might get a touch technical. I am going to try to keep it as simple as possible, but this is an area that is important to CRO and split testing, and if you want to understand the process in more detail you need to know what significance means.

Anyway, I hope you enjoy it, and I promise to make sure the next post is a little more practical.

First Lesson – Actual vs Measured Conversion Rate

When we talk about the conversion rate of a page we are actually referring to its “measured conversion rate” (we’ll use MCR from here on). MCR simply means the conversion rate that we have found by testing or experiment.

Every page has an “actual conversion rate” (ACR), but the reality is that we can’t possibly know what the ACR is. All we can do is test the page and measure its conversion rate.

An Analogy – Flipping A Coin

To illustrate this better, let’s say that you have a coin that you are flipping repeatedly. You want to get heads, so we’ll say that every time you get heads counts as a win or a “conversion” – the terminology really doesn’t matter, the point is the same.

With a coin we can calculate the ACR because we know all of the possible outcomes (heads or tails) and we know that they are equally likely. Hence we KNOW that the conversion rate (ACR) of our coin is 50%.

But we’re going to do a test anyway…

So you flip the coin once and you get heads, then you flip again and you get tails, and then you flip again and you get heads.

So far our calculation looks like this:

  • 3 throws
  • 2 heads
  • Measured conversion rate = 2/3 = 66.7%

Clearly our experiment is not accurate:
This is because 3 throws is not enough to get a statistically significant answer

The problem, in simple terms, is that at this stage luck plays too big a role. The next throw could easily be another heads (it’s just as likely as not), which would make your MCR 75%!
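If it helps to see the arithmetic, here is a minimal sketch of how the MCR swings on a single extra throw. The function name `measured_rate` is just something I made up for illustration.

```python
def measured_rate(outcomes):
    """Return the measured conversion rate (MCR) for a list of True/False outcomes."""
    return sum(outcomes) / len(outcomes)

flips = [True, False, True]  # heads, tails, heads

print(f"{measured_rate(flips):.1%}")             # 66.7%
print(f"{measured_rate(flips + [True]):.1%}")    # 75.0% if the next throw is heads
print(f"{measured_rate(flips + [False]):.1%}")   # 50.0% if the next throw is tails
```

One throw moves the answer by up to 25 percentage points, which is exactly why three or four throws tell you very little.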

Luck plays a big role in the short term and a small role in the long term

Which is why you can spend an evening at a casino and come out ahead, but if you go every night for a week you are almost guaranteed to lose overall.

Back To Our Website

Ok, so you’re testing a contact page, and your goal is to get a user to fill in the contact form. Hypothetically we’ll say that your page has an ACR of 33% although you can’t possibly know this because unlike with our coin, the only way to find your conversion rate is by testing it.

So with an ACR of 33% you would expect that one in three visitors would convert. But if you took three visitors at random, there is no guarantee that exactly 1 of them would convert. In fact, the possible outcomes are:

  • non-conversion, non-conversion, non-conversion
  • non-conversion, non-conversion, conversion
  • non-conversion, conversion, non-conversion
  • non-conversion, conversion, conversion
  • conversion, non-conversion, non-conversion
  • conversion, non-conversion, conversion
  • conversion, conversion, non-conversion
  • conversion, conversion, conversion

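As you can see, of these 8 outcomes, only 3 would produce a measured rate (MCR) of 33%. The other 5 outcomes give you a different answer. The 8 outcomes are not equally likely, but even allowing for that, you would see exactly one conversion less than half the time.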
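Here is a rough sketch that lists the same 8 outcomes and works out how likely each result is. It assumes an ACR of exactly one in three and that visitors convert independently of each other; `outcome_probabilities` is a name I picked for the example.

```python
from fractions import Fraction
from itertools import product

def outcome_probabilities(acr, visitors=3):
    """Map each possible number of conversions to its probability, by listing every outcome."""
    totals = {}
    for outcome in product([False, True], repeat=visitors):
        prob = Fraction(1)
        for converted in outcome:
            # Each visitor converts with probability acr, independently (an assumption).
            prob *= acr if converted else 1 - acr
        conversions = sum(outcome)
        totals[conversions] = totals.get(conversions, 0) + prob
    return totals

probs = outcome_probabilities(Fraction(1, 3))
for conversions, prob in sorted(probs.items()):
    print(f"{conversions} conversions: {prob} ({float(prob):.1%})")
```

The line for 1 conversion is the only one where the measured rate matches the actual rate, and it comes out at 4/9, or about 44%.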

An Introduction To Variance

Hope you’re still with me. Hopefully things will start making sense soon. Let’s go back to our coin, since that’s a nice simple example to use.

With the coin, there is no guarantee that you will get heads followed by tails followed by heads… In fact if you toss it 4 times you could get 4 heads, which would make your MCR 100%, which is clearly wrong.

But what if you tossed it 10 times?
The chances of 10 heads in a row are much lower

What if you tossed it 100 times?

Let’s Try A Little Experiment

If you tossed your coin 1000 times the chances of getting exactly 500 heads are actually very low (about 2.5%)… But the chances of getting somewhere between 475 and 525 heads are very high (close to 90%), which means that your MCR would probably fall between 47.5% and 52.5%.
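If you would like to check these coin probabilities yourself, the sketch below counts the favourable outcomes exactly. It assumes a fair coin; `prob_heads_between` is a made-up helper name.

```python
from math import comb

def prob_heads_between(n, low, high):
    """Probability that a fair coin gives between low and high heads (inclusive) in n tosses."""
    favourable = sum(comb(n, k) for k in range(low, high + 1))
    return favourable / 2 ** n

print(f"4 heads in 4 tosses:    {prob_heads_between(4, 4, 4):.2%}")
print(f"10 heads in 10 tosses:  {prob_heads_between(10, 10, 10):.2%}")
print(f"exactly 500 of 1000:    {prob_heads_between(1000, 500, 500):.2%}")
print(f"475 to 525 of 1000:     {prob_heads_between(1000, 475, 525):.2%}")
```

Hitting any single exact number is unlikely, but landing in a narrow band around the true rate becomes very likely once you have enough tosses.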

As you keep throwing the coin, the MCR will get more and more accurate, which means that you can trust it with a greater degree of certainty.
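Statisticians put a number on this "getting more accurate" with something called the standard error, a term this post hasn't used so far. As a rough sketch, assuming independent tosses, it tells you how far the measured rate typically sits from the actual rate after a given number of tosses:

```python
from math import sqrt

def standard_error(rate, n):
    """Typical (one standard deviation) gap between measured and actual rate after n trials."""
    return sqrt(rate * (1 - rate) / n)

for n in (10, 100, 1000, 10000):
    se = standard_error(0.5, n)
    print(f"{n:>6} tosses: measured rate typically within +/- {se:.1%} of 50%")
```

Notice that you need 100 times as many tosses to make the gap 10 times smaller, so accuracy improves quickly at first and then more slowly.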

How Does This Relate To Conversion Rate Optimization?

If you’ve stuck with me this far, you’re probably hoping that I will relate this back to conversion rate testing. Well, you’re in luck, because that’s what I’m about to do. It’ll all make sense soon (I hope). Let’s apply this to a simple split test, shall we?

  • In a simple split test we have 2 versions of a page
  • For both versions we are tracking the conversion rate
  • If version A makes a sale then its measured conversion rate goes up
  • If it loses a customer, its measured conversion rate goes down
  • Same goes for version B

Both versions have an actual conversion rate (ACR) and a measured conversion rate (MCR)

We want to keep the version with the best conversion rate. But we can’t know the actual conversion rates, so we need to run a test to find out.

For simplicity, let’s say that version A converts at 30% (ACR) and version B converts at 35% (ACR). Clearly we want to keep version B and get rid of version A. But as we’ve seen:

After 10 visitors each, version A might have had a lucky run and be showing an MCR of 40% while version B only shows an MCR of 30%

If we ended the test now, we would be choosing the wrong winner – and that’s bad.
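To get a feel for how often the wrong version ends up in front, here is a sketch that works out the chance of version A showing more conversions than version B. It uses the 30% and 35% rates from the example and assumes every visitor converts independently. The function names are my own.

```python
from math import exp, lgamma, log

def binomial_pmf(n, rate):
    """Return [P(0 conversions), ..., P(n conversions)] for n visitors at the given rate."""
    return [
        exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(rate) + (n - k) * log(1 - rate))
        for k in range(n + 1)
    ]

def prob_a_looks_better(n, acr_a, acr_b):
    """Chance that A shows strictly more conversions than B after n visitors each."""
    pmf_a = binomial_pmf(n, acr_a)
    pmf_b = binomial_pmf(n, acr_b)
    total = 0.0
    b_below = 0.0  # running probability that B has fewer than k conversions
    for k in range(n + 1):
        total += pmf_a[k] * b_below
        b_below += pmf_b[k]
    return total

for n in (10, 100, 1000):
    wrong = prob_a_looks_better(n, 0.30, 0.35)
    print(f"{n:>5} visitors each: A looks better {wrong:.1%} of the time")
```

Ties are not counted as a wrong call here, so if anything this understates how murky a 10 visitor test is.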

Degrees Of Certainty

In simple terms, the more data we collect, the more likely it is that our results will be accurate. We can never be certain that our measured conversion rate (MCR) is 100% accurate, but we can be 95% sure, or even 99% depending on how much data we collect.

The reality of course is that we don’t really care what the true conversion rate of a page is. All we are trying to answer is one simple question:

How certain can we be that version X is better than version Y?

Because we always want to be able to choose the winner and get rid of the loser as soon as possible – so that we can move on to the next test. That’s how we get the best results in the least time.
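One common way to put a number on that question is a two-proportion z-test. The sketch below is only an illustration of that general approach, not necessarily what your testing tool does under the hood. It assumes both versions have had a reasonable number of visitors and at least some conversions and non-conversions between them. The function name and the example figures are made up.

```python
from math import sqrt
from statistics import NormalDist

def confidence_b_beats_a(visitors_a, conversions_a, visitors_b, conversions_b):
    """One-sided confidence that B's true rate exceeds A's (pooled two-proportion z-test)."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pool both versions to estimate the spread we would see if there were no real difference.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    return NormalDist().cdf(z)

# Same measured rates (30% vs 35%), very different amounts of data.
print(f"100 visitors each:  {confidence_b_beats_a(100, 30, 100, 35):.1%}")
print(f"1000 visitors each: {confidence_b_beats_a(1000, 300, 1000, 350):.1%}")
```

The measured rates are identical in both lines, yet only the larger test gets past the 95% mark, because the same gap is far less likely to be luck when it holds up over ten times as many visitors.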

Factors Affecting Duration Of Tests

As you can see then, we want to end a test as soon as possible, but without risking choosing the wrong winner. In order to do this, we have to decide how certain we want to be of a result.

Being 95% certain of your results takes much less time than being 99% certain

The more traffic you get, the faster you can achieve higher degrees of certainty. If you are limited on traffic then you might have to accept lower certainty in order to get results faster – in the long run this is likely to produce bigger improvements.

There are other factors which affect how much data you need though…

The conversion rate:
If the pages you are testing have conversion rates (ACR) close to 50%, you can detect a given relative improvement much faster than if your conversion rate is closer to 1, 2 or 5%, for instance. This is why, when you have low traffic, it can be better to optimize for bounce rate or email sign ups instead of sales.

Difference in conversion rates:
If version A is showing a rate (MCR) of 10% and version B is showing a rate (MCR) of 50% it is much more likely that version B is genuinely the winner than if the two versions were showing 20% and 21% respectively. The bigger the difference between the true conversion rates the sooner you will be able to choose a winner. This is why we often recommend testing big changes rather than subtle ones.
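To see how these factors play out in numbers, here is a sketch using a standard textbook approximation for the number of visitors each version needs. It brings in one idea this post hasn't covered: "power", the chance of spotting a real difference when there is one, which I have assumed to be 80%. The function name and all of the example rates are illustrative only.

```python
from math import ceil
from statistics import NormalDist

def visitors_needed(rate_a, rate_b, confidence=0.95, power=0.80):
    """Approximate visitors per version to tell rate_a from rate_b (one-sided test)."""
    z_conf = NormalDist().inv_cdf(confidence)
    z_power = NormalDist().inv_cdf(power)  # power of 80% is an assumption, not from the post
    variance = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    return ceil((z_conf + z_power) ** 2 * variance / (rate_a - rate_b) ** 2)

print("Certainty: 30% vs 35%")
print("  95% certain:", visitors_needed(0.30, 0.35, confidence=0.95))
print("  99% certain:", visitors_needed(0.30, 0.35, confidence=0.99))

print("Conversion rate: the same 10% relative improvement")
print("  50% vs 55%: ", visitors_needed(0.50, 0.55))
print("  2% vs 2.2%: ", visitors_needed(0.02, 0.022))

print("Difference in conversion rates")
print("  10% vs 50%: ", visitors_needed(0.10, 0.50))
print("  20% vs 21%: ", visitors_needed(0.20, 0.21))
```

Divide any of these figures by your daily traffic per version and you get a rough idea of how many days a test would need to run.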

That’s All Folks!

This has been a pretty heavy post, so well done for sticking with it. If you have any questions, please get in touch and let us know. We may update this post later if we have missed anything.

Posted December 28th, 2016 (last modified 2018-02-07) | Tech Help