Incrementality Testing vs A/B Testing: What’s the Difference?

Lukas Reinhardt · 7 min read

Marketers love a good showdown, and “incrementality testing vs A/B testing” is one of the most common questions I get when I sit down with a growth team. The two methods sound similar — both involve controlled experiments, both promise to tell you what actually works. But they answer fundamentally different questions, and using one when you needed the other is a mistake I have watched cost teams real money.

I have run hundreds of experiments for SaaS companies over the past decade, everything from button-color tests to multi-million-dollar channel holdouts. In this guide I will break down exactly how these two approaches differ, when each one wins, and how to avoid the trap of treating them as interchangeable.

The Core Difference in One Sentence

Here is the cleanest way I have found to explain it: A/B testing tells you which version is better. Incrementality testing tells you whether something was worth doing at all.

An A/B test compares two variants of the same thing — headline A versus headline B, landing page A versus landing page B. You split traffic, measure the outcome, and pick the winner. The assumption baked in is that you are going to ship one of the versions no matter what. You are choosing between options.

An incrementality test asks a different question entirely: if I turned this off completely, would I lose conversions? You hold back a randomly chosen audience from seeing a campaign at all, then compare the holdout group to the exposed group. The gap between their conversion rates is the true causal lift: the conversions that would not have happened without the spend.
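To make the arithmetic concrete, here is a minimal sketch of how a holdout gap turns into incremental conversions. The function name and the numbers are invented for illustration, and it assumes the split was randomized so the holdout's conversion rate is a fair baseline for the exposed group.

```python
def incremental_lift(exposed_users, exposed_conversions,
                     holdout_users, holdout_conversions):
    """Estimate incremental conversions and relative lift from a holdout test.

    Assumes users were randomly assigned, so the holdout conversion rate
    approximates what the exposed group would have done without the campaign.
    """
    exposed_rate = exposed_conversions / exposed_users
    holdout_rate = holdout_conversions / holdout_users

    # Conversions the exposed group would likely have produced anyway.
    baseline_conversions = holdout_rate * exposed_users
    incremental_conversions = exposed_conversions - baseline_conversions

    # Relative lift over the baseline rate (undefined if the holdout never converted).
    if holdout_rate == 0:
        relative_lift = float("nan")
    else:
        relative_lift = (exposed_rate - holdout_rate) / holdout_rate
    return incremental_conversions, relative_lift


# Illustrative numbers only: 80,000 exposed users, 20,000 held out.
incremental, lift = incremental_lift(80_000, 2_400, 20_000, 500)
print(f"Incremental conversions: {incremental:.0f}, relative lift: {lift:.1%}")
```

With these made-up inputs the exposed group converts at 3.0% against a 2.5% baseline, so roughly 400 of its 2,400 conversions are incremental. The other 2,000 would probably have happened anyway.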

“Correlation is not causation, but it sure is a hint.”

— Edward Tufte, statistician and data visualization pioneer

Quick Comparison Table

Dimension | A/B Testing | Incrementality Testing
Question answered | Which version performs better? | Did this drive conversions that would not have happened anyway?
Structure | Variant A vs Variant B (both shipped to a slice) | Exposed group vs holdout group (one sees nothing)
Best for | On-site optimization, creative, copy, UX | Channel value, retargeting, brand spend, always-on campaigns
Output | Relative winner (lift between variants) | Absolute causal lift (incremental conversions)
Typical duration | Days to a few weeks | Weeks to a couple of months
Cost of running | Low: no revenue held back | Higher: you withhold spend from a real audience

How A/B Testing Works

A/B testing (also called split testing) is the workhorse of conversion rate optimization. You take a single element, create two or more versions, and randomly assign visitors to each. After enough traffic, you compare conversion rates and declare a statistically significant winner.
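For readers who want to see what "statistically significant" means in practice, here is a rough sketch of a textbook pooled two-proportion z-test using only the Python standard library. It is one common approach, not necessarily what any particular testing tool runs under the hood, and the function name and sample counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided pooled z-test for the difference between two conversion rates."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative numbers only: variant A converts 200 of 10,000, variant B 260 of 10,000.
z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these made-up inputs the p-value lands below 0.01, so under the usual 5% threshold you would call variant B the winner. The normal approximation behind this test assumes a reasonable number of conversions in each variant.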

The strength of A/B testing is precision on a narrow question. When I tested two pricing-page layouts for a B2B SaaS client, the variant with annual billing shown first lifted trial starts by a measurable margin. That is exactly the kind of question A/B testing answers well: same audience, same channel, two flavors of the same experience.

The frameworks here are mature. Tools like VWO and PostHog handle randomization and significance calculations for you. GA4 can report on experiment results, but it relies on a third-party testing tool to assign variants. The methodology is well documented by sources like the Harvard Business Review.

But A/B testing has a blind spot. It assumes you are keeping the activity. It can tell you headline B beats headline A, but it cannot tell you whether running that ad campaign at all generated any net-new revenue. That is where incrementality comes in.

How Incrementality Testing Works

Incrementality testing measures the true causal effect of a marketing activity by withholding it from a randomized control group. Instead of comparing two versions of something, you compare “got the treatment” against “got nothing.”

The classic example is retargeting. A retargeting campaign almost always looks profitable in your ad platform’s reporting, because many of the people it reaches were already going to buy. Run a proper holdout test (show retargeting to a random 80% of your audience and nothing to the other 20%) and the picture often changes dramatically. I have seen retargeting that looked like a 6x return collapse to barely break-even once we measured the genuine incremental lift.
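The gap between platform-reported return and incremental return can be sketched in a few lines. Every figure below is invented to mirror the pattern described above, not taken from a real client, and the function names are hypothetical.

```python
def platform_roas(attributed_revenue, spend):
    """Return on ad spend as an ad platform would typically report it."""
    return attributed_revenue / spend


def incremental_roas(exposed_revenue, exposed_users,
                     holdout_revenue, holdout_users, spend):
    """Return on ad spend counting only revenue the holdout test can confirm."""
    # Revenue per user in the holdout is the no-ads baseline.
    baseline_per_user = holdout_revenue / holdout_users
    incremental_revenue = exposed_revenue - baseline_per_user * exposed_users
    return incremental_revenue / spend


# Hypothetical campaign: 10,000 in spend, 80/20 exposed-to-holdout split.
spend = 10_000
reported = platform_roas(attributed_revenue=60_000, spend=spend)
incremental = incremental_roas(
    exposed_revenue=400_000, exposed_users=80_000,
    holdout_revenue=97_250, holdout_users=20_000,
    spend=spend,
)
print(f"Platform ROAS: {reported:.1f}x, incremental ROAS: {incremental:.1f}x")
```

In this made-up scenario the platform claims 6.0x, while the holdout supports only about 1.1x. The platform is crediting itself with purchases that the held-out users made at nearly the same rate without ever seeing an ad.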

For a deeper walkthrough of the methodology, including the four main test designs and the statistics behind sample sizing, see my full guide on incrementality testing explained. The short version: you need a clean randomized split, a long enough window to capture your sales cycle, and enough volume to reach statistical significance.
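One common way to get a clean, repeatable split is to hash each user ID into a bucket. The sketch below is an assumption about how you might implement it, not a description of any ad platform's mechanism; the function name, experiment label, and 20% default are all hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str, holdout_share: float = 0.20) -> str:
    """Deterministically assign a user to 'holdout' or 'exposed'.

    Hashing the experiment name together with the user ID keeps a user's
    assignment stable across sessions while keeping splits for different
    experiments independent of each other.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "holdout" if bucket < holdout_share else "exposed"


# Illustrative usage with made-up user IDs.
groups = [assign_group(f"user-{i}", "retargeting-holdout") for i in range(10_000)]
print(f"Holdout share: {groups.count('holdout') / len(groups):.1%}")
```

Because the assignment depends only on the ID and the experiment name, the same user always lands in the same group, which matters when your test window spans a long sales cycle.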

When to Use Each Method

This is where teams go wrong. They reach for A/B testing when they should be measuring incrementality, or they try to run an incrementality test on something that calls for a simple split test. Here is how I decide.

Reach for A/B Testing When:

  • You are optimizing something on your own property — landing pages, emails, onboarding flows, pricing displays.
  • You have already decided to keep the activity and just want the best version.
  • You need a fast answer and have steady traffic.
  • The change is reversible and low-risk.

Reach for Incrementality Testing When:

  • You are questioning whether a channel or campaign deserves its budget at all.
  • Platform-reported ROAS looks suspiciously high (retargeting, branded search, audience network).
  • You suspect your attribution model is taking credit for conversions that would have happened anyway.
  • You are about to scale spend and want proof of causal impact first.

The Mistake That Costs the Most

The single most expensive error I see is using A/B test logic to justify channel spend. A team runs a “test” where they turn a campaign on for one month and off the next, then compares the two periods. That is not an A/B test and it is not a clean incrementality test either — it is a before-and-after comparison contaminated by seasonality, other campaigns, and a hundred outside variables.

Tip: Time-based on/off comparisons are not experiments. Without a simultaneous randomized control group, you cannot separate your campaign’s effect from everything else happening that month.

If you want to know whether a channel is pulling its weight, you need a real holdout running at the same time as the exposed group. Anything else is a guess dressed up as data.
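A toy calculation shows how badly an on/off comparison can mislead. The conversion rates below are invented purely to illustrate the bias: the campaign is assumed to add 0.2 percentage points, and the second month is assumed to have a seasonal dip unrelated to marketing.

```python
# Hypothetical baseline conversion rates; month 2 has a seasonal dip.
BASELINE_RATE = {"month_1": 0.020, "month_2": 0.016}
# Assumed true effect of the campaign: +0.2 percentage points while it runs.
TRUE_CAMPAIGN_EFFECT = 0.002


def observed_rate(month, campaign_on):
    """Conversion rate you would observe in a given month."""
    return BASELINE_RATE[month] + (TRUE_CAMPAIGN_EFFECT if campaign_on else 0.0)


def before_after_estimate():
    """Campaign on in month 1, off in month 2: seasonality leaks into the estimate."""
    return observed_rate("month_1", True) - observed_rate("month_2", False)


def concurrent_holdout_estimate(month="month_1"):
    """Exposed and holdout share the same month, so the baseline cancels out."""
    return observed_rate(month, True) - observed_rate(month, False)


print(f"Before/after estimate:  {before_after_estimate() * 100:.1f} pts")
print(f"Concurrent holdout:     {concurrent_holdout_estimate() * 100:.1f} pts")
```

Under these assumptions the before-and-after comparison reports a 0.6 point effect, three times the true 0.2 points, because it silently adds the seasonal dip to the campaign's credit. The simultaneous holdout recovers the real number.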

Can You Use Both Together?

Yes, and the best growth teams do. The two methods sit at different layers of the measurement stack.

Use incrementality testing at the strategic layer to decide which channels and campaigns genuinely move revenue and deserve budget. Then use A/B testing at the tactical layer to squeeze the most performance out of the channels you have validated. Incrementality tells you where to play; A/B testing tells you how to win once you are there.

This pairing also strengthens your broader measurement approach. Incrementality findings are one of the best ways to calibrate an attribution model — when your attribution gives a channel credit your holdout test cannot confirm, you know the model is over-crediting it.
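One simple way to apply that calibration is a per-channel multiplier: incremental conversions from the holdout divided by conversions the attribution model claimed over the same period. This is a simplified sketch of the idea rather than a full calibration method, and the channel names and counts are hypothetical.

```python
def calibration_factor(attributed_conversions, incremental_conversions):
    """Share of attributed conversions that the holdout test could confirm."""
    if attributed_conversions <= 0:
        raise ValueError("attributed_conversions must be positive")
    return incremental_conversions / attributed_conversions


def calibrate(attributed_by_channel, factors):
    """Scale each channel's attributed conversions by its calibration factor.

    Channels without a holdout test keep a factor of 1.0, meaning unverified,
    not confirmed.
    """
    return {
        channel: conversions * factors.get(channel, 1.0)
        for channel, conversions in attributed_by_channel.items()
    }


# Illustrative numbers only.
attributed = {"retargeting": 1_000, "paid_social": 600}
factors = {"retargeting": calibration_factor(1_000, 250)}
print(calibrate(attributed, factors))
```

A factor well below 1.0, like the 0.25 for retargeting here, is the signal that the model is over-crediting the channel.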

Frequently Asked Questions

Is incrementality testing the same as a holdout test?

A holdout test is the most common way to run an incrementality test, so the terms are often used interchangeably. You hold back a randomized group from seeing a campaign, then compare their conversion rate to that of the exposed group. The difference is the incremental lift. Other incrementality designs include geo-based tests and ghost ads.

Why does my A/B test show a winner but revenue did not move?

An A/B test measures relative performance between two variants, not whether the underlying activity created net-new demand. Your winning variant may simply be capturing the same conversions more efficiently rather than generating additional ones. Pair the A/B test with an incrementality test to confirm real causal impact.

How much traffic do I need for each method?

A/B tests generally need enough traffic to detect a small relative difference, often thousands of conversions per variant. Incrementality tests usually require more volume: the holdout is typically a small share of the audience, and the lift you are trying to detect is often small relative to the baseline conversion rate. Low-traffic sites should prioritize A/B testing first.
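To put rough numbers on that, here is a sketch of a standard two-proportion sample-size approximation. It assumes equal group sizes, a two-sided 5% significance level, and 80% power; those defaults, the function name, and the example rates are my assumptions for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate users needed per group to detect a relative lift.

    Uses the normal approximation for comparing two proportions with
    equally sized groups.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)

    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Illustrative: 2% baseline conversion rate, hoping to detect a 10% relative lift.
print(sample_size_per_group(baseline_rate=0.02, relative_lift=0.10))
```

For this example the answer comes out at roughly 80,000 users per group, which at a 2% conversion rate is on the order of 1,600 conversions each. An uneven split such as an 80/20 holdout needs more total volume than an even one to reach the same power, so treat this as a lower bound for holdout tests.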

Can I run incrementality tests without a big budget?

It is harder but possible. Geo-based holdouts let smaller advertisers withhold spend in select regions rather than from individual users, which works with lower volumes. Many ad platforms also offer built-in lift studies. The key constraint is having enough conversions to reach statistical significance.

The Bottom Line

Incrementality testing vs A/B testing is not really a competition — they are different tools for different jobs. A/B testing optimizes the experiences you have already committed to. Incrementality testing tells you whether those commitments were worth making in the first place.

If you only ever run A/B tests, you will get very good at optimizing activities that may not be generating any incremental revenue. If you only run incrementality tests, you will validate your channels but leave easy conversion gains on the table. Use both, at the right layer, and you will spend your budget with far more confidence than the team still arguing over which headline won.

Lukas Reinhardt

Independent analytics consultant · Remote

10+ years building analytics stacks for SaaS companies. Every tool reviewed here is personally tested — no sponsored content, no affiliate bias.
