Should You A/B Test Your Idea, Or Not?
Let’s say that you have an idea, or two, about changing something on your site in the hope of improving some desired metric. Perhaps you would like more signups, faster load times, more purchases, better completion rates, fewer usability errors, or lower drop-off rates. Great. Whatever the metric is, like most human beings you will probably have some degree of certainty about whether the idea will work. This personal level of belief about your idea(s) can become a very useful gauge for whether or not you should A/B test it.
Degrees Of Certainty
As you pry into the idea(s) and their potential effects, you might realize that your level of certainty is not as black and white as you might think. That is, having 0% certainty about something is rarely possible, nor does it last long. Based on the way our brains are wired, making constant probabilistic predictions about pretty much everything we do is quite central to being human – as Jeff Hawkins has written in On Intelligence. Human brains simply do not allow us to reside in a state of zero certainty for long, as we crave to predict everything: the weather when stepping out of our homes, changes in crossing lights when approaching an intersection, or test results when launching our A/B experiments. Claiming 100% certainty is equally difficult, if not arrogant. Giving ourselves room to err and challenge existing beliefs is not just healthy, but opens up our minds as we generate knowledge. It’s just a matter of time before a new edge case is discovered and we are forced to correct our understanding.
Instead, certainty may be looked at as a gradient that is also unique to each individual – as the Bayesians advocate. In this view, it is completely valid for me, or you, to say that one idea is somewhat likely to bring an improvement, whereas another, stronger idea is very, very likely to win. Moreover, it’s also completely fine for us to have different levels of belief. And it is this personal level of belief that can help us decide whether we should A/B test something or not.
Low Certainty Ideas
Let’s start with a typical scenario, where we have an idea that we have very low certainty about. Perhaps it’s our first A/B test and we want to compare two button labels because someone across the street screams that first-person copy works best. So we have A: “Get Your Account” vs. B: “Get My Account”. What do we do in this situation, when we barely feel that B will actually win (though it still has a tiny chance according to our level of belief)? Answer: we do not test this, yet. The reason is that ideas we do not feel strongly about also have a high chance of wasting our time, and our business’s time, with a likely insignificant result. Instead, it might be worthwhile to spend a few hours seeing if additional certainty about the idea can be built up. Given that an A/B test might run for a few weeks (depending on a proper estimate), those few hours (or more) can become a solid investment.
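To get a feel for what a "proper estimate" of test duration might look like, here is a minimal Python sketch based on the common two-proportion sample size approximation. The function names are hypothetical, and the baseline rate, expected lift, significance level, power, and weekly traffic are illustrative assumptions rather than figures from any of our tests.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided two-proportion test.

    baseline_rate: conversion rate of A (assumed, e.g. 0.05 for 5%)
    relative_lift: smallest relative improvement of B worth detecting (assumed)
    """
    if relative_lift == 0:
        raise ValueError("relative_lift must be non-zero")
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


def estimated_weeks(n_per_variant, weekly_visitors, variants=2):
    """Weeks needed if weekly_visitors are split evenly across all variants."""
    return ceil(n_per_variant * variants / weekly_visitors)


# Illustrative inputs only: 5% baseline, hoping for a 10% relative lift,
# with 10,000 visitors per week entering the test.
n = sample_size_per_variant(baseline_rate=0.05, relative_lift=0.10)
print(n, "visitors per variant,", estimated_weeks(n, weekly_visitors=10_000), "weeks")
```

The smaller the effect we expect, the more visitors and weeks the test demands, which is part of why a low certainty idea with a likely tiny effect can be an expensive thing to test.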
How does one build additional certainty quickly? For one, we might look at past experimental data (side note: GoodUI Evidence will offer more of this and re-launch before the end of 2015). If we can find other examples where a similar change has produced a similar effect, we might be able to adjust our certainty. Alternatively, we may dig a little deeper and see if the idea is based on solid general theory that is grounded in even more thorough experimentation. We may also conduct additional primary research of our own, which is effective (yet possibly more time consuming), to validate the idea with real customers or customer-facing staff. Finally, we can always drop the lower certainty idea, revise it, or find ones that we are more certain about. Either way, we should only test low certainty ideas as a last resort.
Medium Certainty Ideas
Having medium certainty in an idea is the testing sweet spot. As an example, let’s assume that we have seen something work two or three times with a given audience, and now we wish to see if it will also work with another group. Here we have good reasons to believe it will work once again, but there is still enough uncertainty left to justify actually A/B testing it. Such ideas are good candidates for experimentation, as they have a higher chance of generating significant results while at the same time adjusting our beliefs. In fact, this is exactly what A/B testing offers – it is a tool for carving away at our uncertainty.
High Certainty Ideas
This is the third scenario, where you might feel very, very certain that some idea will generate your desired effect. Let’s say that you’ve already observed 9 out of 10 successful A/B tests that applied Urgency as a conversion tactic, and you have also read 2 books on persuasion which backed this idea. When you have such a high belief in this idea, does it really make sense for you to test it? I think you might be better off just implementing it and validating it after the fact (as we have done on one recent project in Datastories Issue #18). The business will benefit much sooner from implementing a high certainty idea outright than from diluting it with a test.
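One way to turn a track record like "9 out of 10" into a level of belief is a simple Beta-Binomial update, in the Bayesian spirit mentioned earlier. The sketch below is only an illustration under stated assumptions: it treats each past test as an independent win or loss, starts from a uniform prior, and ignores softer evidence such as the two books. The function name is hypothetical.

```python
def posterior_win_probability(wins, losses, prior_wins=1.0, prior_losses=1.0):
    """Posterior mean of a Beta(prior_wins, prior_losses) prior after observing
    the given number of winning and losing tests.

    The default Beta(1, 1) prior is uniform, i.e. no initial preference (an assumption).
    """
    return (prior_wins + wins) / (prior_wins + prior_losses + wins + losses)


# The example from the text: 9 winning tests out of 10.
belief = posterior_win_probability(wins=9, losses=1)
print(round(belief, 2))  # 0.83
```

Under these assumptions the belief lands at roughly 83%, which is high but still short of certainty. A more skeptical prior, or doubts about how similar those past tests were to the current site, would pull the number down.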
Ironically, as we take high certainty ideas into testing, the value of the test itself also diminishes. This is because, once again, tests serve us as tools for the removal of uncertainty. If you have very little uncertainty remaining and decide to test, then chances are that you will not learn much. Yes, such a test will likely show the desired effect, but our beliefs will not evolve.
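As a rough illustration of why the learning value shrinks at the extremes, we can borrow binary entropy from information theory as a proxy for how much uncertainty remains in a belief. This is our own simplification for the sake of the example, not a formal test-planning method: it reduces the outcome to "the idea wins" or "it does not", and the function name is hypothetical.

```python
from math import log2


def belief_entropy_bits(p_win):
    """Binary entropy (in bits) of a belief that the idea will win.

    Used here as a rough proxy for the most a win/lose outcome could teach us.
    """
    if p_win <= 0.0 or p_win >= 1.0:
        return 0.0  # complete certainty leaves nothing to learn
    return -(p_win * log2(p_win) + (1 - p_win) * log2(1 - p_win))


for p in (0.1, 0.5, 0.9):
    print(p, round(belief_entropy_bits(p), 2))
# 0.1 0.47
# 0.5 1.0
# 0.9 0.47
```

By this measure, remaining uncertainty peaks in the middle of the range and falls off toward both ends, which fits the idea that tests teach the most when we are neither nearly sure an idea will win nor nearly sure it will not.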
Not Everything Has To Be A/B Tested
When optimizing, it is completely acceptable to apply existing knowledge for the benefit of your online business. You do not need to fall into the trap of thinking that everything has to be A/B tested. When seeking ideas to experiment with, do look for ones that are of medium certainty – ones that are likely to generate an effect and teach you something along the way.