

9 ways your eCommerce A/B tests are going wrong

September 27th 2019


By Roheena Chogley

Account Manager


A/B tests are a staple of digital marketing, with good reason.

Think of an A/B test as a survey of your customers. You give them two or more versions of a piece of marketing content, and their actions tell you which one works best. This lets you make data-driven decisions about future email and website content.

When it comes to personalizing your eCommerce site, whether you’re testing your category pages, homepage, product pages, or email campaigns, A/B tests help you measure the impact of new tactics and optimize existing content for better results. They can be used to refine bulk and automated marketing emails, and to boost website conversions.

Of course, systematic testing is much more reliable than simply following your intuition. But the approach is not without pitfalls. It’s not uncommon to implement the findings from an A/B test, only to find that your results don’t hold up in the long run.

If your A/B tests aren’t working as you expected, here’s what might be going wrong, and how to fix it.

1. Your A/B test didn’t go wrong!

So, you didn’t get the result you expected from your A/B test for your eCommerce store. Version B didn’t perform significantly better than version A, or it even performed worse.


In fact, if your A/B test was set up correctly, these results aren’t failures at all.

The purpose of testing is to find the content and tactics that produce the best possible customer experience and drive results for your eCommerce business. And you’ve just learned that a given tactic isn’t as effective as you’d hoped. If you hadn’t run the test, you would never have known.

‘Negative’ and ‘neutral’ split test results let you steer away from marketing decisions that are unhelpful or even detrimental in the long run.

2. Your split testing isn’t configured correctly

Occasionally, an A/B test can misfire because of problems in your setup. Perhaps your technology isn’t configured correctly, or the sample groups aren’t comparable. If you’ve got problems like these, your results will reflect your configuration, not the marketing you think you’re testing.

The solution is to run occasional A/A tests. This is the most basic type of A/B test: you simply create two identical versions of the same marketing and test them against each other. If you get significantly different results on your online store’s KPIs, then it’s time to revisit your setup and make sure everything’s working as it should be.
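To make the A/A check concrete, here’s a minimal sketch of one common way to compare two conversion rates: a two-sided, two-proportion z-test using only the Python standard library. The numbers are hypothetical, and in a real A/A test your platform may run a different statistical test; the point is simply that identical variants should not produce a significant difference.

```python
from math import sqrt, erf

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Hypothetical A/A test: both arms served the identical page
z, p = two_proportion_z_test(120, 4000, 150, 4000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

If p comes out below your significance threshold (commonly 0.05) even though both arms are identical, that points to a configuration or sampling problem rather than a real difference.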

3. Data dredging

This happens when you analyze your data to spot effects without deciding in advance what you’re looking for. If you search long enough, you’re bound to find coincidental patterns by chance.


It’s much more effective to take a scientific approach to A/B testing.

  • Define your high-level goals (probably to generate more revenue!)
  • Observe your challenges or obstacles to achieving those goals.
  • Hypothesize how best to overcome the challenge.
  • Measure the results to prove or disprove your hypothesis.

For instance, imagine your goal is to drive more revenue from marketing emails. But your browse abandonment emails aren’t performing as you’d hoped. Visitors open the email but don’t add anything to their cart. You might hypothesize that “Adding product ratings to browse abandonment emails will result in a higher order value from our target audience.”

With this in mind, you can run an A/B test to determine if product star ratings do generate a higher order value from browse abandonment emails. If the hypothesis holds true, you can use this information to shape future browse recovery messages.
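As an illustration of measuring a hypothesis like this, here’s a sketch of comparing average order values between two email variants using a large-sample normal approximation to Welch’s t-test. The order values below are invented toy data, and the tiny sample sizes are far too small for a real test (where you’d want a proper t-test and many more observations); the code just shows the shape of the calculation.

```python
from math import sqrt, erf
from statistics import mean, variance

def mean_diff_p_value(sample_a, sample_b):
    """Two-sided p-value for a difference in means
    (large-sample normal approximation to Welch's t-test)."""
    n_a, n_b = len(sample_a), len(sample_b)
    se = sqrt(variance(sample_a) / n_a + variance(sample_b) / n_b)
    z = (mean(sample_a) - mean(sample_b)) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical order values (in dollars) from each email variant
no_ratings = [42.0, 55.5, 38.0, 61.0, 47.5, 44.0, 52.0, 39.5]
with_ratings = [58.0, 72.5, 49.0, 66.0, 61.5, 70.0, 55.0, 64.5]

p = mean_diff_p_value(with_ratings, no_ratings)
print(f"p = {p:.4f}")
```

A small p-value here would support the hypothesis that product ratings lift order value; a large one would tell you the observed difference could easily be chance.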

By following a scientific method, you can be sure that any changes you make are designed to contribute to your overarching marketing goals.

4. You’re not testing what you thought

In a true A/B test, it’s crucial that every aspect of your email or eCommerce website remains identical, except for the aspect you’re testing. Otherwise, you won’t know which variable caused the success.

Of course, you already knew that. But even with the best of intentions, it’s possible to accidentally change multiple aspects without meaning to.

For example:

  • If testing the effect of tone-of-voice in email subject lines, did the lengths vary so that one didn’t fully show on mobile?
  • If trying different product images, did you accidentally vary page size and load time?
  • Did you push some other marketing below the fold?

In cases like these, your A/B test didn’t fail – the winner really is better than the loser – but you might draw incorrect conclusions for future marketing.

The solution is to thoroughly check how content appears in each version. Make a list of elements that you need to double check, and note if anything appears differently. Be sure to review how content appears on different devices for a true reflection of what helps with your conversion rate optimization and your other KPIs.

5. You’re a victim of the novelty effect

The novelty effect means that people pay special attention to recent changes. Humans have evolved over millions of years to notice when something in our environment changes.


Regular visitors are inclined to try out a new feature because it’s new. Or their eyes might be drawn to something that’s changed. This could give a short-term advantage to the new variant.

So be careful when drawing general conclusions. Suppose you’re doing email marketing for a travel company and you find that including emojis like an aircraft and a palm tree in the subject line increases opens. This doesn’t necessarily mean that the tactic will consistently deliver results.

There are two possible solutions. You can repeat your split test after a few weeks have passed and the novelty effect has worn off, to see if the winner still does better. Or keep regularly refreshing your marketing, so that the novelty effect is constantly working to your benefit.

6. You stopped the test too early

When you can see the results stacking up in real time, it’s tempting to stop the test as soon as you notice one version performing much better or worse. Especially if you’re time-strapped and want to deploy the best version as quickly as possible.

However, in the first days and weeks of a test, the number of data points might be too low to draw a conclusion. If the conversion rate is averaged across a small number of customers, there’s a strong chance of seeing a false positive or negative. The longer a test runs, the more likely you are to observe a true trend.

By stopping a test prematurely, you could lose revenue by disregarding a good tactic just because it underperformed early on. You also risk disappointment if a version that initially performed well seems to plateau in the long run.

When planning a test, it’s crucial to determine in advance how many customers need to see the web page or receive the email before you declare a winning result. Then stick to that number, even if you seem to see a statistically significant result after just a couple of days.
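One standard way to fix that number in advance is a sample size calculation for comparing two proportions. Here’s a stdlib-only sketch using the usual normal-approximation formula at 95% confidence and 80% power; the baseline conversion rate and target uplift below are illustrative assumptions, not figures from this article.

```python
from math import ceil, sqrt

def sample_size_per_variant(p_base, lift, z_alpha=1.96, z_power=0.84):
    """Minimum visitors per variant to detect a relative lift
    at ~95% confidence and ~80% power (normal approximation)."""
    p_new = p_base * (1 + lift)
    p_avg = (p_base + p_new) / 2
    numerator = (z_alpha * sqrt(2 * p_avg * (1 - p_avg))
                 + z_power * sqrt(p_base * (1 - p_base)
                                  + p_new * (1 - p_new))) ** 2
    return ceil(numerator / (p_new - p_base) ** 2)

# e.g. a 3% baseline conversion rate, hoping to detect a 10% relative uplift
n = sample_size_per_variant(0.03, 0.10)
print(f"{n} visitors per variant")
```

Note how quickly the required sample grows as the uplift you want to detect shrinks – which is exactly why a test that looks decisive after two days usually isn’t.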

7. Your A/B/n test winner is actually a joint winner

A/B/n testing is an extension of A/B testing. Rather than testing two versions of an email or web page against each other (A and B), you test more than two versions. ‘n’ simply stands for the number of variations.


For example, you might test several different types of product recommendations to work out which one drives more email click throughs, compared to an email with no product recommendations.

It’s tempting to declare the version with the highest uplift the winner of the test. But even if the difference between the winning version and the control version is significant, the difference between the winner and the runner-up might not be.

The danger here is that you decide to deploy the winning variation, discounting other variations that could have similar – or better – results.

The solution is to view A/B/n tests as multiple variations of content competing against one another, not simply competing against the control version. If the leading two variants don’t have significantly different results, you could treat the test as a tie and look to other considerations to decide which version to implement. For example, you might opt for the dynamic content that fits best with your brand voice, or use social proof or user-generated content on your product category page.
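The joint-winner trap is easy to see in numbers. Below is a sketch (invented results, two-proportion z-test as one possible significance check) where the top variant clearly beats the control, yet is statistically indistinguishable from the runner-up.

```python
from math import sqrt, erf

def p_value(conv_x, n_x, conv_y, n_y):
    """Two-sided p-value for a difference between two conversion rates."""
    p_x, p_y = conv_x / n_x, conv_y / n_y
    pooled = (conv_x + conv_y) / (n_x + n_y)
    se = sqrt(pooled * (1 - pooled) * (1 / n_x + 1 / n_y))
    z = (p_x - p_y) / se
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Hypothetical A/B/n results: (conversions, visitors) per variant
control = (200, 10000)
runner_up = (245, 10000)
winner = (255, 10000)

p_winner_vs_control = p_value(*winner, *control)
p_winner_vs_runner_up = p_value(*winner, *runner_up)
print(f"winner vs control:   p = {p_winner_vs_control:.3f}")
print(f"winner vs runner-up: p = {p_winner_vs_runner_up:.3f}")
```

Here the winner beats the control at the usual 0.05 threshold, but not the runner-up – so the two leading variants should be treated as a tie.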

8. You aren’t using testing to its full potential

If you’re only using A/B tests to choose between different types of email subject lines, your business could be losing out on crucial incremental revenue.

Once you’ve got your goals in place, you can make educated hypotheses about which tactics will help you meet those goals. Think about your ideal customer and what content they need at each stage of the customer journey to help them make a decision. Then use A/B tests to prove your hypotheses with real data.

You can A/B test anything that affects customers’ actions at each stage of the buying journey. For example:

  • Email subject lines, tone of voice, and send times
  • Product recommendations in emails and on site
  • Product images and page layouts
  • Social proof and user-generated content on category and product pages
  • Triggered messages such as cart and browse abandonment emails

9. You aren’t using testing!

There’s no need to use guesswork to optimize your cross-channel marketing efforts.


Regular testing and optimization doesn’t just help you choose the best version of any given marketing content. It also provides insights into your customers’ behavior and interests that can be used to shape marketing across channels.

And marketers today have plenty of options when it comes to testing and optimizing personalized content. A good personalization platform will let you mix and match testing tactics to meet your short- and long-term goals:

  • A/B tests to optimize marketing based on shoppers’ actions.
  • A/B/n tests to find the types of content that resonate with existing customers.
  • Control groups to calculate the removal effect and measure the sales uplift generated by a given tactic.
  • Real-time, revenue-based reporting to make business-driven decisions.

Take testing further with automatic optimization

Fresh Relevance’s automatic optimization functionality is a smart way to take your testing further and save precious time. Instead of analyzing results at the end of a test and manually implementing the winning variation, automatic optimization lets you try out several versions of your content and automatically deploy the best performer against the goal you set (such as increasing conversion rate).

With Fresh Relevance, you can also automatically end tests on a set date or number of impressions and display the variant that generated the most revenue.

Book a demo to learn more about taking your testing and optimization further with Fresh Relevance.


Roheena Chogley

Account Manager

As Account Manager at Fresh Relevance, Roheena works with renowned brands in the eCommerce and travel space to ensure success with their content personalization programs in email and online.