Should You A/B Test Your Idea, Or Not?

Let’s say you have an idea, or two, about changing something on your site in the hope of improving some desired metric. Perhaps you would like more signups, faster load times, more purchases, better completion rates, fewer usability errors, or lower drop-off rates. Great. Whatever the metric, like most human beings you will probably hold some degree of certainty about your idea. This personal level of belief can become a very useful gauge for whether or not you should a/b test it.

Degrees Of Certainty

As you pry into the idea(s) and their potential effects, you might realize that your level of certainty is not as black and white as you might think. Having complete 0% certainty about something is rarely possible, nor does it last long. As Jeff Hawkins has written in On Intelligence, our brains are wired to make constant probabilistic predictions about pretty much everything we do. They simply do not let us sit in a state of zero certainty for long: we crave to predict everything from the weather as we step out of our homes, to the changing of crossing lights as we approach an intersection, to the results of the a/b experiments we launch. Claiming 100% certainty is equally difficult, if not arrogant. Giving ourselves room to err and to challenge existing beliefs is not just healthy - it opens up our minds as we generate knowledge. It's only a matter of time before a new edge case is discovered and we are forced to correct our understanding.

Instead, certainty may be looked at as a gradient that is also unique to each individual - as the Bayesians advocate. In this view, it is completely valid for me, or you, to say that one idea will somewhat likely bring an improvement, whereas another, stronger idea will very, very likely win. What's more, it's also completely fine for us to hold different levels of belief. And it is precisely this personal level of belief that can help us decide whether or not we should a/b test something.
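To make that gradient a little more concrete, here is a tiny sketch (with completely made-up numbers) of how one person's beliefs about a few ideas might be written down as probabilities rather than yes/no verdicts:

```python
# Hypothetical numbers only - one person's certainty about three ideas,
# expressed as probabilities instead of a yes/no verdict.
beliefs = {
    "first-person button copy": 0.15,     # barely feel it will win
    "shorter signup form": 0.55,           # somewhat likely to help
    "urgency on the pricing page": 0.90,   # very, very likely to win
}

for idea, certainty in beliefs.items():
    print(f"{idea}: {certainty:.0%} certain it wins, "
          f"{1 - certainty:.0%} room to be wrong")
```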

Low Certainty Ideas

Let’s start with a typical scenario: an idea exists that we have very low certainty about. Perhaps it’s our first A/B test and we want to compare two button labels because someone across the street screams that first-person copy works best. So we have A: “Get Your Account” vs. B: “Get My Account”. What do we do when we barely feel that B will actually win (though it still has a tiny chance according to our level of belief)? Answer: we do not test this, yet. Ideas we do not feel strongly about also have a high chance of wasting our time, and our business’s time, on a likely insignificant result. Instead, it might be worthwhile to spend a few hours seeing whether additional certainty about the idea can be built up. Given that an A/B test might run a few weeks (depending on a proper estimate), those few hours (or more) can become a solid investment.
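To get a rough feel for how long such a test could run (and why those few hours of homework are cheap by comparison), here is a back-of-the-envelope sketch. Every input below is an assumption for illustration - your baseline rate, smallest worthwhile lift, and traffic will differ:

```python
import math

# Hypothetical inputs - replace with your own site's numbers.
baseline = 0.04          # current conversion rate (4%)
relative_lift = 0.10     # smallest relative lift worth detecting (10%)
daily_visitors = 2000    # visitors entering the experiment per day

p1 = baseline
p2 = baseline * (1 + relative_lift)
p_pooled = (p1 + p2) / 2
z_alpha, z_power = 1.96, 0.84   # 95% significance (two-sided), 80% power

# Standard two-proportion sample size approximation, per variation.
n_per_variation = ((z_alpha * math.sqrt(2 * p_pooled * (1 - p_pooled)) +
                    z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
                   / (p2 - p1) ** 2)

days = 2 * n_per_variation / daily_visitors   # two variations split the traffic
print(f"~{n_per_variation:,.0f} visitors per variation, roughly {days:.0f} days")
```

With these made-up inputs the estimate lands at several weeks of testing, which is exactly why a few hours spent raising (or lowering) your certainty up front tends to pay for itself.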

How does one build additional certainty quickly? For one, we might look at past experimental data (side note: GoodUI Evidence will offer more of this and re-launch before the end of 2015). If we can find other examples where a similar change has produced a similar effect, we may be able to adjust our certainty. Alternatively, we may dig a little deeper and see if the idea rests on solid general theory that is grounded in even more thorough experimentation. Although effective (yet possibly more time consuming), we may also conduct additional primary research of our own to validate the idea with real customers or customer-facing staff. Finally, we can always drop the lower certainty idea, revise it, or find ones that we are more certain about. Either way, we should only test low certainty ideas as a last resort.
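As a small illustration of how past evidence can shift a belief, here is a sketch of a simple Bayesian-style update. The neutral prior and the list of past results are invented for the example - plug in whatever evidence you actually dig up:

```python
# Hypothetical sketch: building certainty from similar past tests
# using a Beta(wins, losses) belief, starting from a neutral Beta(1, 1).
prior_wins, prior_losses = 1, 1
similar_past_results = [True, True, False, True]   # wins/losses found elsewhere

wins = prior_wins + sum(similar_past_results)
losses = prior_losses + sum(not won for won in similar_past_results)

certainty = wins / (wins + losses)   # mean of the updated belief
print(f"Updated certainty that this kind of change wins: {certainty:.0%}")
```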

Medium Certainty Ideas

Having medium certainty in an idea is the testing sweet spot. As an example, let's assume that we have seen something work two or three times with a given audience, and now we wish to see if it will also work with another group. Here we have good reasons to believe it will work once again, but there is still enough uncertainty left to make an a/b test worthwhile. Such ideas are good candidates for experimentation as they have a higher chance of generating significant results while also adjusting our beliefs. In fact, this is exactly what A/B testing offers - it is a tool for carving away at our uncertainty.

High Certainty Ideas

This is the third scenario, where you might feel very, very certain that some idea will generate your desired effect. Let’s say that you’ve already observed 9 out of 10 successful a/b tests that applied Urgency as a conversion tactic, and you have also read two books on persuasion that back the idea. When you hold such a high belief in an idea, does it really make sense to test it? I think you might be better off just implementing it and validating it after the fact (as we have done on one recent project in Datastories Issue #18). The business will benefit much more quickly from implementing a high certainty idea than from diluting its effect with a test.

Ironically, as we take high certainty ideas into testing, the value of the test itself also diminishes. This is because, once again, tests serve us as tools for the removal of uncertainty. If you have very little uncertainty remaining and decide to test anyway, chances are you will not learn much. Yes, such a test will likely show the desired effect, but your beliefs will barely evolve.
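One way to see why a test on a near-certain idea teaches little is simple odds updating (posterior odds = prior odds x strength of the evidence): the same winning result barely moves a belief that is already very high. The evidence strength below is an assumed number, purely for illustration:

```python
# Hypothetical sketch: how much a clearly winning test moves your belief.
def updated_belief(prior, evidence_strength):
    odds = prior / (1 - prior)
    posterior_odds = odds * evidence_strength
    return posterior_odds / (1 + posterior_odds)

evidence_strength = 6   # assumed strength of a clearly winning result

for prior in (0.50, 0.95):
    posterior = updated_belief(prior, evidence_strength)
    print(f"prior {prior:.0%} -> posterior {posterior:.0%} "
          f"(moved by {posterior - prior:+.0%})")
```

A medium belief jumps substantially, while a 95% belief hardly changes - which is the sense in which the test's value has already been spent.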

Not Everything Has To Be A/B Tested

When optimizing, it is completely acceptable to apply existing knowledge and let your online business benefit from it directly. You do not need to fall into the trap of thinking that everything has to be a/b tested. When seeking ideas to experiment with, do look for ones of medium certainty - ones that are likely to generate an effect and teach you something along the way.


Learn From How We Optimize & Run Tests In Datastories

Each month we share with you our exclusive optimization story to learn from, including: the changes, the testing strategy used, our reflections on the process, and transparent data.




Comments

  • Alhan Keser, 8 years ago

    Asking some great questions here. There is plenty of stuff that I end up not testing. It's all part of prioritizing. And that comes down to understanding what actually needs improvement vs what I believe needs improvement.

    When I am more certain of changes, I combine them into one variation rather than spreading them out across multiple variations. When I am less certain, I isolate them in separate variations. And when I am not certain at all, I hold off on testing them until there is more evidence to back up the hypothesis.

    As for implementing without testing, that's assuming that the only thing you're hoping to get out of testing is increased conversions/revenue. What about the insights? Even if you "know" that something is going to work, it doesn't mean testing variations of it can't help you better understand WHY it works, which is, after all, the most interesting question to get answers to.

  • Matthew Clark, 8 years ago

    Why is testing everything "a trap" to fall into? I'm all about probabilistic reasoning as a way of life and choosing "what" you test, but not "if" you test. There seems to be a presumption that A/B testing will slow down, not speed up development.

    • Jakub Linowski, 8 years ago

      I think that an extremist "always-test-everything" mindset can occasionally delay the site/page from being more optimized, more quickly, in at least two ways. This of course assumes that the person doing the optimization has lots of high certainty ideas that they could just implement. 1) It might make the optimizer focus on a single change at a time, per test, instead of just benefiting from all of the changes at once (which should by definition generate a higher effect more quickly). 2) The test itself dilutes the effectiveness of a winning variation by splitting traffic with a weaker control.

      Of course, we both know that having high certainty is probably not the norm. And so yes, we test quite often (but just not always). :)

  • ivan, 8 years ago

    Hey Jakub, I like your post, but I would disagree that you need some level of certainty for everything you want to test.

    For example, some of my biggest testing wins were the ones where I completely changed my copy into something I couldn't even imagine being better than the original version.

    Sometimes (not always), you just need to be bold — especially if you don't have a lot of traffic, or time.

    • Jakub Linowski, 8 years ago

      Hey Ivan. You're right, you do not always need medium certainty when going into a test. Sometimes a purely exploratory approach might also be valid. We do find however that the act of separating out the more certain ideas, from the less certain ones, tends to lead to higher test success rates. But yes, exploration (and sometimes a touch of randomness) is also important.

  • Smilyan, 8 years ago

    Great article. Key take for me "Ironically, as we take high certainty ideas into testing, the value of the test itself also diminishes. This is because, once again, tests serve us as tools for the removal of uncertainty."

    The medium certainty sweet spot was a mini revelation as I never thought about it. Thanks!