A/B tests are a staple of digital marketing, with good reason.
Think of a split test as a survey of your customers. You give them two or more different versions of marketing content, and their actions tell you which one works best. This lets you make data-driven decisions about future email and website content.
When it comes to personalization, A/B tests help you work out the impact of new tactics, and optimize existing content to get better results. They can be used to refine bulk and automated marketing emails, and boost website conversions.
Of course, systematic testing is much more reliable than simply following your intuition. But the approach is not without pitfalls. It’s not uncommon to implement the findings from an A/B test, only to discover that the results don’t hold up in the long run.
If your A/B tests aren’t working as you expected, here’s what might be going wrong, and how to fix it.
1. Your A/B test didn’t go wrong!
So, you didn’t get the result you expected from your A/B test. Version B didn’t perform significantly better than version A. Or it even performed worse.
In fact, if your A/B test was set up correctly, these results aren’t failures at all.
The purpose of testing is to choose the best content and tactics for your marketing. And you’ve just learned that a given tactic isn’t as effective as you’d hoped. If you hadn’t done the test, you would never have known.
‘Negative’ and ‘neutral’ split test results let you steer away from marketing decisions that are unhelpful or even detrimental in the long run.
2. Your test isn’t configured correctly
Occasionally, an A/B test can misfire because of problems in your setup. Perhaps your technology isn't configured correctly, or the audience samples aren't comparable. With problems like these, your results will reflect your configuration rather than the marketing you think you're testing.
The solution is to run occasional A/A tests. This is the most basic type of A/B test; you simply create two identical versions of the same marketing and test them against each other. If you get significantly different results, then it’s time to revisit your setup and make sure everything’s working as it should be.
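To make the A/A check concrete, here is a minimal sketch of testing whether two identical variants produced a suspiciously different result, using a simple two-proportion z-test. All the counts are hypothetical, and your testing platform will normally do this calculation for you:

```python
import math

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the assumption both arms are identical
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# A/A test: both arms received identical content (counts are hypothetical)
z, p = two_proportion_z(conv_a=120, n_a=2400, conv_b=131, n_b=2400)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A large p-value here is what you want: it means the small difference between the identical arms is consistent with random noise. A small p-value from an A/A test is a red flag that your setup, not your marketing, is driving the results.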
3. Data dredging
This happens when you analyze your data looking for effects without deciding in advance what you're testing for. Search long enough and you're bound to find spurious patterns that arose purely by chance.
It’s much more effective to take a scientific approach to A/B testing:
- Define your high-level goals (probably to generate more revenue!)
- Observe your challenges or obstacles to achieving those goals.
- Hypothesize how best to overcome the challenge.
- Measure the results to prove or disprove your hypothesis.
For instance, imagine your goal is to drive more revenue from marketing emails. But your browse abandonment emails aren’t performing as you’d hoped. Visitors open the email but don’t add anything to their cart. You might hypothesize: “Adding product ratings to browse abandonment emails will result in a higher order value.”
With this in mind, you can run an A/B test to determine if product star ratings do generate a higher order value from browse abandonment emails. If the hypothesis holds true, you can use this information to shape future browse recovery messages.
By following a scientific method, you can be sure that any changes you make are designed to contribute to your overarching marketing goals.
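Once a test like the browse abandonment example above has run, the hypothesis can be checked with a quick significance calculation. This is a minimal sketch using a normal approximation for the difference in mean order value; the order values are entirely hypothetical, and a dedicated testing tool will do this more rigorously:

```python
from math import erf, sqrt
from statistics import mean, stdev

def welch_p_value(sample_a, sample_b):
    """Approximate two-sided p-value for a difference in means
    (normal approximation; most reasonable for larger samples)."""
    se = sqrt(stdev(sample_a) ** 2 / len(sample_a)
              + stdev(sample_b) ** 2 / len(sample_b))
    z = (mean(sample_a) - mean(sample_b)) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical order values ($) from each browse abandonment variant
with_ratings = [54, 61, 48, 72, 66, 59, 63, 57, 70, 62]
without_ratings = [49, 55, 47, 58, 52, 50, 56, 51, 53, 54]
print(welch_p_value(with_ratings, without_ratings))  # below 0.05 supports the hypothesis
```

The point is that the hypothesis was written down before the data came in, so a low p-value answers the exact question you asked, rather than a pattern you went hunting for afterwards.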
4. You’re not testing what you thought
In a true A/B test, it’s crucial that every aspect of your email or web page remains identical, except for the aspect you’re testing. Otherwise, you won’t know which variable caused the success.
Of course, you already knew that. But even with the best of intentions, it’s possible to accidentally change multiple aspects without meaning to.
- If you tested tone of voice in email subject lines, did the lengths differ so that one was truncated on mobile?
- If you tried different product images, did you inadvertently change page size and load time?
- Did you push some other marketing below the fold?
When this happens, your A/B test didn’t fail (the winner really did beat the loser), but you might draw the wrong conclusions for future marketing.
The solution is to thoroughly check how content appears in each version. Make a list of elements that you need to double check, and note if anything appears differently. Be sure to review how content appears on different devices.
5. You’re a victim of the novelty effect
The novelty effect means that people pay special attention to recent changes. Humans have evolved to notice anything new in the world around us.
Regular visitors are inclined to try out a new feature because it’s new. Or their eyes might be drawn to something that's changed. This could give a short-term advantage to the new variant.
So be careful when drawing general conclusions. Suppose you're doing email marketing for a travel company and you find that including emojis like an aircraft and a palm tree in the subject line increases opens. This doesn’t necessarily mean that the tactic will consistently deliver results.
There are two possible solutions. You can repeat your split test after a few weeks have passed and the novelty effect has worn off, to see if the winner still does better. Or keep regularly refreshing your marketing, so that the novelty effect is constantly working to your benefit.
6. You stopped the test too early
When you can see the results stacking up in real time, it’s tempting to stop the test as soon as you notice one version performing much better or worse. Especially if you’re time-strapped and want to deploy the best version as quickly as possible.
However, in the first days and weeks of a test, the number of data points may be too low to draw a conclusion. If the conversion rate is averaged across a small number of customers, there’s a strong chance of seeing a false positive or negative. The longer a test runs, the more likely you are to observe a true trend.
By stopping a test prematurely, you could lose revenue by disregarding a good tactic just because it underperformed early on. You also risk disappointment if a version that initially performed well seems to plateau in the long run.
When planning a test, it’s crucial to determine how many customers need to see the web page/receive the email before you declare a winning result. Then stick to it, even if you seem to see a statistically significant result after just a couple of days.
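As a rough guide, the required sample size can be estimated up front from your baseline conversion rate and the smallest lift you care about detecting. This is a minimal sketch of the standard two-proportion power approximation; the baseline and target figures are hypothetical:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(p_base, mde, alpha=0.05, power=0.80):
    """Visitors needed per variant to detect an absolute lift of `mde`
    over a baseline conversion rate `p_base` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    p_new = p_base + mde
    variance = p_base * (1 - p_base) + p_new * (1 - p_new)
    return ceil(((z_alpha + z_beta) ** 2 * variance) / mde ** 2)

# Hypothetical: baseline 3% conversion, aiming to detect a lift to 4%
print(sample_size_per_variant(0.03, 0.01))
```

Fixing this number before the test starts is what stops you from peeking at the dashboard and declaring victory early.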
7. Your A/B/n test winner is actually a joint winner
A/B/n testing is an extension of A/B testing. Rather than testing two versions of an email or web page against each other (A and B), you test several; the ‘n’ simply stands for the number of variations.
For example, you might test several different types of product recommendations to work out which one drives more email click throughs, compared to an email with no product recommendations.
It’s tempting to declare the version with the highest uplift the winner of the test. But even if the difference between the winning version and the control is significant, the difference between the winner and the runner-up might not be.
The danger here is that you decide to deploy the winning variation, discounting other variations that could have similar – or better – results.
The solution is to view A/B/n tests as multiple variations of content competing against one another, not simply competing against the control version. If the leading two variants don’t have significantly different results, you could view the test as a tie and look to other considerations to decide which version to implement. For example, you might opt for the dynamic content that fits best with your brand voice.
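As a rough check, you can compare the winner against the runner-up with the same significance test you would use against the control. This is a minimal sketch with entirely hypothetical click counts:

```python
from math import erf, sqrt

def p_value_two_proportions(c1, n1, c2, n2):
    """Two-sided p-value for a difference between two conversion rates."""
    p1, p2 = c1 / n1, c2 / n2
    pool = (c1 + c2) / (n1 + n2)
    se = sqrt(pool * (1 - pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical A/B/n results: (clicks, recipients) per variant
control   = (150, 5000)
variant_b = (205, 5000)  # highest uplift
variant_c = (190, 5000)  # runner-up

print(p_value_two_proportions(*variant_b, *control))    # winner vs control
print(p_value_two_proportions(*variant_b, *variant_c))  # winner vs runner-up
```

With these made-up numbers, the winner clearly beats the control but is statistically tied with the runner-up, which is exactly the situation where you should treat the test as a tie and fall back on other considerations.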
8. You aren’t using testing to its full potential
If you’re only using A/B tests to choose between different types of email subject lines, your business could be losing out on crucial incremental revenue.
Once you’ve got your goals in place, you can make educated hypotheses about which tactics will help you meet those goals. Think about your ideal customer and what content they need at each stage of the customer journey to help them make a decision. Then use A/B tests to prove your hypotheses with real data.
You can A/B test anything that affects customers’ actions at each stage of the buying journey. For example:
- Trying a shorter delay before a shopping cart abandonment email.
- Adding more emails to a triggered email sequence.
- Testing a different mix of personalized product recommendations on your homepage.
- Exploring different types of social proof on the product detail page.
9. You aren’t using testing!
There’s no need to use guesswork to optimize your cross-channel marketing efforts.
Regular testing and optimization doesn’t just help you choose the best version of any given marketing content. It also provides insights into your customers’ behavior and interests that can be used to shape marketing across channels.
And marketers today have plenty of options when it comes to testing and optimizing personalized content. A good personalization platform will let you mix and match testing tactics to meet your short and long term goals:
- A/B tests to optimize marketing based on shoppers’ actions.
- A/B/n tests to find the types of content that resonate with customers.
- Control groups to calculate the removal effect and measure the sales uplift generated by a given tactic.
- Real-time, revenue-based reporting to make business-driven decisions.
For more insights on how to get the most out of email testing, join Fresh Relevance and Kath Pay, CEO of Holistic Email Marketing, for a dedicated webinar on Thursday 5th November: