How to Evaluate Metrics and Determine the Outcome of Your Campaign

If your A/B tests work, make sure they’re working for the whole business. Without a strategy and a North Star metric, it’s possible to stray from the ideal product roadmap and optimize for the wrong things, like when Blockbuster became so reliant on late fees that it couldn’t pivot to streaming. Measure carefully, measure often, and always ask why.

Your overall, organization-wide A/B testing strategy isn’t so different from your tactical A/B testing strategy. With both, you validate iteratively and test to the funnel. Think of the overall strategy like a murmuration of starlings—millions of A/B tests form one giant A/B testing strategy.

[Image: A murmuration of starlings side by side with a set of small A/B tests building up to a strategy]

[Photo credit: Daniel Biber]

What is A/B Testing in Analytics?

A/B testing in analytics is about validating your assumptions. A cohort of new users may seem to prefer reading long articles over short ones. But is that always true? And is it true for all users? Before codifying your assumptions into rules that everyone on the team relies upon, use A/B tests to establish facts.

Defining a Strategy

A/B testing is strategic in and of itself. With it, businesses reduce uncertainty in their decision making. Before you start an A/B test, decide what outcome you want. If you want more people to add items to their online cart, think about the ways you could lead them to take that action.

You could change the CTA on the shopping cart, change the button colour, or add prompts that encourage people to add items to their cart. Whenever you define your A/B testing strategy, consider what your ideal outcome is and what you can test to get there.

How to Measure Results For Your Entire Organization

You can measure the organization-wide impact of your A/B testing efforts by looking at quality, quantity, and downstream effects: Are you running good tests? Are you running enough of them? Can you see a net impact on key metrics for the business, like quarterly revenue?

At Airbnb, we are constantly iterating on the user experience and product features. This can include changes to the look and feel of the website or native apps, optimizations for our smart pricing and search ranking algorithms, or even targeting the right content and timing for our email campaigns.

– Jonathan Parks, Data Engineering Manager, Airbnb

Quality of experiments

Good experiments have definitive outcomes. It doesn’t matter if you proved or disproved the hypothesis, so long as the outcome was certain. You can lift conversions, sentiment, and sign-ups by ceasing to do the wrong things just as well as by beginning to do more of the right things.

If too high a percentage of a team’s or the company’s experiments are inconclusive, it might be a sign that the experiments are being set up incorrectly. Perhaps the team is writing ambiguous hypotheses like ‘This change will make the app better’ rather than explicit ones like ‘This change will increase monthly sign-ups by five percent.’

If there are vastly more negative results (disproven hypotheses) than positive results across the organization, it may suggest that your team’s intuition isn’t yet refined enough. Companies don’t arrive at product perfection through endless blind tests. They get there the way a scientist does, with highly educated guesses. Over time, you should see positive results increase. For reference, at Google and Bing, about 10-20% of tests have positive results.
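
One way to keep an eye on these ratios is to tally experiment outcomes over a period. The following is a minimal sketch, not a prescribed method: the outcome labels and the sample list are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter

# Hypothetical outcome labels for one quarter's experiments (illustrative data only).
outcomes = [
    "positive", "negative", "inconclusive", "negative", "positive",
    "negative", "inconclusive", "negative", "negative", "negative",
]

def outcome_shares(results):
    """Return the fraction of experiments that ended with each outcome label."""
    if not results:
        raise ValueError("need at least one experiment result")
    counts = Counter(results)
    total = len(results)
    return {label: counts[label] / total
            for label in ("positive", "negative", "inconclusive")}

shares = outcome_shares(outcomes)
for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

A rising share of inconclusive results points at experiment setup, while the positive share is the figure to compare against the 10-20% reference above.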

The observer’s choice of what he shall look for has an inescapable consequence for what he shall find.

– John Archibald Wheeler

Quantity of experiments

Running too few experiments can be a bad thing, but so can running too many. If leadership mandates top-down product recommendations, the tail is wagging the dog and you probably need to test to uncover and document users’ needs and preferences. But if the team is testing everything, it can grind the entire organization to a halt.

For reference, Airbnb tests a lot: In 2018, it was running 500 experiments and using 2,500 distinct metrics across its platforms at any given time. But it has a sizeable team.

One danger with learning to A/B test is that it can lead to over-reliance, and nobody will want to make decisions without testing. Don’t get lazy. Tests are costly. Intuition and judgement point you to the target, A/B tests simply fire the arrow.

– Josh Decker, UX Researcher

Speed to change

This measures how long it takes the team to implement something they learned from a test. For example, if the test results proved that a “Skip the intro” button on a streaming video site makes users more likely to renew, how long did it take the company to roll it out? Generally, the shorter the time to change, the better: Teams that implement their lessons quickly can run more tests. Their product evolves and improves at a more competitive pace.
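
If you log when each test concluded and when the resulting change shipped, speed to change reduces to a date difference. The sketch below assumes such a log exists. The experiment names and dates are made up for illustration, and using the median as the summary is one reasonable choice rather than a standard.

```python
from datetime import date
from statistics import median

# Hypothetical log: (experiment name, date the test concluded, date the change shipped).
rollouts = [
    ("skip_intro_button", date(2024, 3, 4), date(2024, 3, 18)),
    ("new_checkout_cta", date(2024, 4, 1), date(2024, 4, 8)),
    ("pricing_page_copy", date(2024, 5, 6), date(2024, 6, 3)),
]

def days_to_change(concluded, shipped):
    """Number of days between a test's conclusion and the rollout of its result."""
    return (shipped - concluded).days

durations = [days_to_change(concluded, shipped) for _, concluded, shipped in rollouts]
median_days = median(durations)

print(f"Days to change per experiment: {durations}")
print(f"Median speed to change: {median_days} days")
```

The median is less sensitive than the mean to a single slow rollout, which helps when a handful of large changes would otherwise dominate the figure.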

Turning big ships takes time. Generally, the larger the organization, the slower its speed to change. If you’re a big organization that implements changes quickly, you have a serious competitive advantage.

We know more about our customers, statistically, than anyone else in our market. It also means that we can run more experiments with statistical significance faster than businesses with less user data. It’s one of our most important competitive advantages.

– Wyatt Jenkins, VP of Product, Shutterstock

Overall impact

The overall revenue impact of your A/B testing is what’s known as a lagging indicator: You often can’t measure it until after the fact. For instance, when Microsoft’s Bing made a feature change that resulted in $100 million in additional revenue, that wasn’t clear until the following quarter. Neither was the loss of 100,000 monthly visitors they experienced after making an SEO change. Lagging metrics are, however, the truest measures: They’re the only sure way to know that your A/B tests are having a positive impact.

Hard A/B test result metrics you can measure:

Revenue: Does testing increase purchases? Sign-ups? Upsell or cross-sell? Look at revenue, revenue per user, and average order value.
Support costs: Does testing decrease complaints or questions about how to use the app or site? Look at ticket response times, handling time, resolution time, and sentiment.
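
The revenue figures named above can be computed per variant from simple totals. This is a rough sketch under assumed inputs: the `VariantTotals` class and the numbers are hypothetical, chosen to show that revenue per user and average order value can move in opposite directions.

```python
from dataclasses import dataclass

@dataclass
class VariantTotals:
    """Hypothetical totals collected for one test variant."""
    users: int
    orders: int
    revenue: float

    @property
    def revenue_per_user(self) -> float:
        # Total revenue spread over everyone exposed to the variant.
        return self.revenue / self.users

    @property
    def average_order_value(self) -> float:
        # Total revenue spread over completed orders only.
        return self.revenue / self.orders

# Illustrative numbers only.
control = VariantTotals(users=1000, orders=50, revenue=2500.0)
variant = VariantTotals(users=1000, orders=60, revenue=2700.0)

for name, totals in (("control", control), ("variant", variant)):
    print(f"{name}: revenue/user = {totals.revenue_per_user:.2f}, "
          f"AOV = {totals.average_order_value:.2f}")
```

In this made-up data the variant earns more per user even though its average order is smaller, which is why it helps to look at the metrics together.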

Soft A/B test result metrics you can measure:

Product team feedback: Does the product team have more user insights and data than before? Are their product launches growing more successful?
Satisfaction: How does testing affect your users’ loyalty and satisfaction? Look at NPS, CSAT, referrals, and sharing.

As organizations grow, so do their metrics. Give yours a clear hierarchy: A North Star metric (a bit of a misnomer, since it can be more than one metric) as the primary focus, followed by core metrics, target metrics, and certified metrics, all of which roll up to the North Star.

Conversion Rate

How can you tell if your conversion funnel and marketing strategy are working? The only way to know for sure is to get data from customers directly. A/B testing lets you do just that. Any business can use data from A/B testing to its advantage. With a little research, a business can find out what changes will make its website visitors more successful. By making changes that are proven to prompt users to take action, you can increase conversions.
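
To judge whether a difference in conversion rate is more than noise, one common approach is a two-proportion z-test. The sketch below is an assumption about how you might check a result, not a description of any particular tool: the function name, the sample counts, and the 0.05 threshold are all illustrative.

```python
from math import erfc, sqrt

def two_proportion_z_test(conversions_a, users_a, conversions_b, users_b):
    """Return (z, two-sided p-value) for the conversion-rate difference, B minus A."""
    rate_a = conversions_a / users_a
    rate_b = conversions_b / users_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conversions_a + conversions_b) / (users_a + users_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Illustrative counts: 4% conversion on A versus 5% on B.
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)

ALPHA = 0.05  # assumed significance threshold
verdict = "conclusive" if p_value < ALPHA else "inconclusive"
print(f"z = {z:.2f}, p = {p_value:.4f}, {verdict}")
```

The test relies on a normal approximation, so it suits large samples like these. For small samples or very low conversion rates, an exact test is a safer choice.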

Click-Through Rate

A/B testing can also help you improve your click-through rate. By testing various versions of CTAs, imagery, or copy, you can find the combination that works best for engaging your users. The results of the A/B test will also give you insights into future content on your site.
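
When comparing several variants on click-through rate, it helps to see the uncertainty around each rate and not only the point estimate. The sketch below uses the Wilson score interval, which is one option among several. The variant names and counts are hypothetical.

```python
from math import sqrt

def wilson_interval(clicks, impressions, z=1.96):
    """Approximate 95% Wilson score interval for a click-through rate."""
    p = clicks / impressions
    denom = 1 + z**2 / impressions
    centre = (p + z**2 / (2 * impressions)) / denom
    half_width = z * sqrt(p * (1 - p) / impressions
                          + z**2 / (4 * impressions**2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical variants: (clicks, impressions).
variants = {
    "cta_original": (120, 2000),
    "cta_short_copy": (150, 2000),
    "cta_with_image": (131, 2000),
}

for name, (clicks, impressions) in variants.items():
    low, high = wilson_interval(clicks, impressions)
    print(f"{name}: CTR = {clicks / impressions:.2%}, "
          f"interval = [{low:.2%}, {high:.2%}]")
```

Heavily overlapping intervals suggest the test has not yet separated the variants, so it may need more traffic before you pick a winner.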

Make it a Core Value in Your Company Culture

Once each business unit understands how A/B testing can benefit them, it becomes second nature. But that benefit isn’t always obvious. People are busy, they’re accustomed to how they already do things, and there’s a switching cost. Learning to A/B test takes mental effort and competes with their other priorities.

To ensure that an A/B testing culture takes hold within your organization, make the benefits clear to all:

Marketing benefits to A/B testing:

Gather valuable messaging feedback
Gather detailed user data
Increase conversions and engagement
Increase user trust
Limit negative impact of projects: Avoid “featuritis”, where the product accumulates so many features that it collapses and needs a full redesign
Learn why successful marketing campaigns worked

Product benefits to A/B testing:

Ship better designs faster with less guessing
Quantify impact, measure investment of design and research resources
Limit effort on bad ideas
Validate prototypes

Engineering benefits to A/B testing:

Validate before building
Shorter development cycles
Fewer redesigns
Testing creates data and data settles debates
Work on interesting, new projects

Business / Analyst / Finance team benefits to A/B testing:

Reduce fraud
Increase profit margin
Identify demand for new features and products

Learn More with Taplytics

Taplytics is an A/B testing solution designed to help product, engineering, and marketing teams drive more revenue through any of their client or server-side applications. With Taplytics, teams can test their app to find the most profitable design.

Start A/B testing with a 14-day free trial of Taplytics.
