The Best Shot Strategy: The Fastest Way To Higher Conversions
If you are highly certain about a number of improvements, then the fastest way to higher conversions is to group them all into a single variation instead of testing them separately. We’ve begun referring to this as a Best Shot Strategy. It aims to maximize the effect size and minimize the testing duration, at the cost of a blurred cause-effect relationship.
Traditionally, A/B tests might have implied that the right way to run experiments is to make a single change to a variant, estimate the sample size to detect a desired effect, and hit start. This scientific focus has much benefit as it has the potential to shed more light on cause-effect relationships. As we change a single thing that shows an effect, we can then begin to build a stronger understanding of what has caused it (hopefully with enough repetition and properly powered tests of course).
Such a scientifically skewed approach, although great for knowledge generation, might not always deliver what a business wishes to maximize: the magnitude of the effect. When optimizing in a business context, we might have a large degree of existing knowledge and certainty that spans beyond a single change. In such situations it might make sense to speed up the process and benefit from what we already know. As an example, we might have built up certainty that a successful landing page needs clarity of benefit, social proof, repeated calls to action and low cognitive friction. These situations are not uncommon when, for example, experienced professionals glance at unoptimized pages (and the list of things to improve grows quickly). Because it takes a shorter time to detect a larger effect, the fastest way to higher conversions may in fact be to address all bottlenecks, missing elements and improvements at once.
Larger Effects Need Less Testing Time
In order for experiments to have enough power (or sensitivity) to detect an effect, a sample size estimate is needed before starting the test. Essentially, the smaller the effect we wish to observe with some degree of probability, the larger the sample size (unique visitors) we’ll need. Given this fundamental dynamic, if we have reason to believe that our test can bring in a larger effect, it does not make sense to waste precious time on a test sized to detect a tiny effect. To illustrate this, here is a comparison of two situations, both starting with an absolute conversion rate of 10% and finishing at 13.3% after the experiments.
In the first situation, we assume that we have three separate tests, each with a single change and each powered to detect a relative +10% lift, with all three actually winning and compounding on top of each other. If we had structured the experiment in this way, we would have needed the following number of visitors per variation: 14,313 for Test 1, 12,865 for Test 2 and 11,548 for Test 3. That’s a total of 38,726 visitors per variation.
In the second situation, we assume that we have a single test with the same three changes grouped together and powered to detect a relative +33% lift in one go (the three +10% lifts compounded: 1.1 × 1.1 × 1.1 ≈ 1.33). If we were estimating a test with such a large effect, all we’d need would be a sample size of 1,343 visitors per variation.
Comparing the second situation to the first, the single Best Shot test needs a sample almost 29 times smaller than the three separate tests combined (38,726 versus 1,343 visitors per variation). Now if sample size equates to an online business’ traffic, then this usually means that such a test might be almost 29 times shorter in duration - a huge benefit.
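The sample sizes above can be approximated with a common two-proportion formula. The sketch below is not taken from the original analysis: it assumes a two-sided 5% significance level and 80% power (neither is stated in the text), and the function name `visitors_per_variation` is made up for illustration. Under those assumptions it appears to reproduce the quoted figures, provided the combined lift is taken as the exact compounded value of +33.1%.

```python
from math import ceil, sqrt
from statistics import NormalDist


def visitors_per_variation(baseline, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation to detect a relative lift.

    Assumed settings (not stated in the article): two-sided test,
    alpha = 5%, power = 80%.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Variance under "no difference" uses the baseline rate for both arms;
    # variance under the expected lift uses each arm's own rate.
    numerator = (
        z_alpha * sqrt(2 * p1 * (1 - p1))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Situation 1: three sequential tests, each sized for +10%,
# each winner becoming the baseline of the next test.
baseline = 0.10
sequential = []
for _ in range(3):
    sequential.append(visitors_per_variation(baseline, 0.10))
    baseline *= 1.10

# Situation 2: one test sized for the compounded lift (1.1 ** 3 - 1 = +33.1%).
combined = visitors_per_variation(0.10, 1.10 ** 3 - 1)

print(sequential, sum(sequential))            # [14313, 12865, 11548] 38726
print(combined)                               # 1343
print(round(sum(sequential) / combined, 1))   # 28.8
```

Other calculators use slightly different variance terms or one-sided tests, so expect somewhat different numbers from other tools. The ratio between the two situations should stay in the same range.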
The Blurring Of Effects
The reality, however, is that some tests will fail, as we never start them with full knowledge or certainty. Instead, we start tests with degrees of partial knowledge which translate into probabilities of the effect repeating itself to some degree (or not). One could even argue that if we have extreme certainty, we could reach higher optimums faster by simply implementing the change and bypassing testing altogether (since testing dilutes the gain by still sending part of the traffic to the presumably weaker control). Nevertheless, multiple-change Best Shot tests will blur these multiple and varied effects into a single one - making more granular explanations very difficult (if not impossible). Hence, it’s very possible that given a test with 3 changes, two of them might have a positive effect, while the remaining change might have a very strong negative effect - possibly erasing the gained benefit.
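To make the net-negative case concrete, here is a small arithmetic sketch. The individual lifts (+10%, +10% and -20%) are hypothetical numbers chosen purely for illustration, and the sketch assumes that effects compound multiplicatively, as in the sample size comparison earlier. Real changes may interact in less tidy ways.

```python
from math import prod

# Hypothetical per-change effects: two helpful changes and one harmful one.
hypothetical_lifts = [0.10, 0.10, -0.20]

# What the two good changes would deliver on their own.
helpful_only = prod(1 + lift for lift in hypothetical_lifts if lift > 0) - 1

# What the grouped test actually measures: all three blended together.
net_lift = prod(1 + lift for lift in hypothetical_lifts) - 1

print(f"{helpful_only:+.1%}")  # +21.0%
print(f"{net_lift:+.1%}")      # -3.2%
```

The grouped test would only ever report the blended figure, with no way to tell from that result alone which change was responsible.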
Has It Worked In The Past?
Yes it has, but not always. We have tried this strategy on many of our own projects with real clients and we can say that it sometimes does work. As one example illustrates above, we have redesigned and tested a checkout page based on a number of changes, which resulted in +34% more purchases. We also used this same strategy on our very first test, when a quoter page was redesigned with a +54% lift to quotes started. Or even more recently, the same strategy was used to increase signup rates by +15% on a long home page.
Of course things are not always so green. We also had a handful of tests where we were not able to reach significant effects with Best Shot Tests. This might have been the result of underpowering the test by being too optimistic. The effect might have been there, but our test wasn’t sensitive enough to detect it because we were expecting too large an effect. Another reason why this strategy might fail is net-negative effects, as mentioned before. By grouping multiple changes together, we run the risk that some of the changes will in fact drag the test down, even if the remaining changes are in fact helpful.
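The cost of being too optimistic can be estimated with a rough power calculation. The sketch below is illustrative only: it assumes a two-sided 5% significance level and a normal approximation, and the "true" +15% lift is a hypothetical value, not a result from any of the tests mentioned. The function name `achieved_power` is likewise made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(baseline, true_relative_lift, visitors_per_variation, alpha=0.05):
    """Approximate chance of detecting a lift with a fixed sample size.

    Normal approximation for a two-sided test; the tiny chance of a
    significant result in the wrong direction is ignored.
    """
    p1 = baseline
    p2 = baseline * (1 + true_relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    signal = abs(p2 - p1) * sqrt(visitors_per_variation)
    null_sd = sqrt(2 * p1 * (1 - p1))
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return NormalDist().cdf((signal - z_alpha * null_sd) / alt_sd)


# Test sized for the optimistic +33.1% lift: 1,343 visitors per variation.
planned = achieved_power(0.10, 0.331, 1343)

# Hypothetical reality: the grouped changes only deliver +15%.
actual = achieved_power(0.10, 0.15, 1343)

print(f"{planned:.2f}")  # roughly 0.80
print(f"{actual:.2f}")   # roughly 0.26
```

Under these assumptions, a test sized for a +33% lift would detect a real +15% lift only about one time in four, which fits the experience of effects that might have been there but went undetected.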
I believe what’s really important with this strategy is to become aware of, and stay true to, your own certainty about possible improvements. If a situation presents itself where you really feel strongly that a page is weak in numerous aspects, then that just might be a signal to aim big. When that is not true, however, it might be best to stick to more traditional single-change tests. We have noticed that it’s very easy to begin listing ideas to test while looking at a screen. Random low-confidence ideas also try to break through and tag along into tests. It is precisely at such moments that it becomes critical to separate the stronger ideas from the pseudo-certain ones. As we respect the knowledge, experience and certainty which we build over time, Best Shot strategies can become a really effective way of applying what we have learned - while saving us lots of precious time.
Have Us Run A Best Shot A/B Test On Your Site
Schedule a call with us to talk metrics and see if this strategy is possible on your site.
Comments
Paras 9 years ago
Interesting writeup. I do agree with the article. In fact, with our Bayesian statistics engine, we've begun emphasizing doing what is right for the business rather than getting strict scientific results. So combining multiple ideas into one variation could certainly work, provided you know what you are doing. Changing for the sake of changing might actually hurt the conversion. As long as you're clear why conversions might not be optimal on a landing page, and you go on to introduce relevant elements in a good aesthetic fashion, all is good.