The 8 Steps of a Rigorous Incremental Measurement

1. Ask a business question

Before any numbers, define exactly what you want to prove. A good question is observable, measurable, and tied to a real customer behaviour.
Examples of good questions:
  • “Did I convert one-timers into repeat buyers?”
  • “Did I increase loyalty programme enrolment among non-members?”
  • “Do people exposed to Reelevant Content have a higher purchase frequency than Non-Exposed people?”
Avoid vague questions like “Does Reelevant work?” — target a specific segment, a specific email, and a specific behaviour.
2. Identify the target population

The scope must be clear. With Reelevant, the observed population is everyone who opens email communications containing a Reelevant block.
| Criterion | Definition | Example |
| --- | --- | --- |
| Observed population | Everyone who opens emails with a Reelevant block | Opt-in customers who opened at least one email with Reelevant Content during the period |
3. Split into Exposed and Non-Exposed Populations

This is the core of the method. Divide customers into two groups: those who see the personalised Reelevant blocks (Exposed Population, 90%) and those who see the version without personalisation (Non-Exposed Population, 10%). This split must be random to be valid.
| Population | Share | What they see |
| --- | --- | --- |
| Exposed Population | 90% | Email with personalised Reelevant blocks |
| Non-Exposed Population | 10% | Same email, without Reelevant blocks (the reference baseline) |

Sample size requirements

Whether a result can reach significance depends heavily on the volume of exposed individuals. Size the test before launch: the smaller the Uplift you want to detect, the larger the populations required.
Example: base rate 4%, target Uplift 1.5 points
→ You need ≈ 15,000 in the Exposed Population
  and ≈ 1,700 in the Non-Exposed Population (90/10 split)
Below these thresholds, an observed difference may be due to chance, so the result is unreliable.
| Term | Meaning |
| --- | --- |
| Base rate | The behaviour rate observed WITHOUT Reelevant (e.g. 4% second-purchase rate) |
| Target Uplift | The minimum gain you want to be able to detect (e.g. +1.5 points) |
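To make the sizing logic concrete, here is a minimal sketch of a standard sample-size calculation for comparing two proportions with an unequal split. The significance level (5%) and power (80%) are assumptions of this sketch, since the guide does not state which values sit behind its thresholds; the function name and parameters are illustrative only.

```python
import math
from statistics import NormalDist


def required_sizes(base_rate, uplift, exposed_share=0.9, alpha=0.05, power=0.8):
    """Return (n_exposed, n_non_exposed) for a two-sided two-proportion z-test.

    alpha and power are assumptions of this sketch, not values from the guide.
    """
    p_non_exposed = base_rate
    p_exposed = base_rate + uplift
    # Number of exposed contacts per non-exposed contact (9 for a 90/10 split)
    ratio = exposed_share / (1 - exposed_share)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    # Variance of the rate difference, expressed per non-exposed contact
    variance = p_non_exposed * (1 - p_non_exposed) + p_exposed * (1 - p_exposed) / ratio
    n_non_exposed = math.ceil((z_alpha + z_power) ** 2 * variance / uplift ** 2)
    return math.ceil(n_non_exposed * ratio), n_non_exposed


n_exposed, n_non_exposed = required_sizes(base_rate=0.04, uplift=0.015)
print(n_exposed, n_non_exposed)
```

Under these assumptions the result lands in the same order of magnitude as the thresholds quoted above, slightly below them. Other formulas (for example one using a pooled variance) or more conservative rounding give somewhat larger figures.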
4. Verify that both populations are comparable

Before drawing conclusions, confirm that both populations have the same starting profile. Otherwise, a difference in results could come from customer profiles, not from Reelevant.
| What to check | How | Alert signal |
| --- | --- | --- |
| Historical average basket | Compare means of both populations | Gap > 5% |
| Past purchase frequency | Compare distributions | Profiles too different |
| Customer tenure | Compare averages | One population significantly “younger” |
| Geography | Check regional distribution | Over-representation of a region |
With Reelevant, the at-open random assignment already makes both populations equivalent on average (see “How Reelevant’s Assignment Works”). This check is an additional safety measure.
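The mean comparison in the table above can be sketched as follows. The basket values are hypothetical, and using the Non-Exposed Population as the reference for the relative gap is an assumption of this sketch; the guide only gives the 5% alert threshold.

```python
from statistics import mean


def relative_gap(exposed_values, non_exposed_values):
    """Relative gap between group means, with Non-Exposed as the reference."""
    reference = mean(non_exposed_values)
    return abs(mean(exposed_values) - reference) / reference


# Hypothetical historical average baskets (in euros), one value per customer
exposed_baskets = [62.0, 58.5, 71.0, 64.5]
non_exposed_baskets = [60.0, 66.0, 63.0]

gap = relative_gap(exposed_baskets, non_exposed_baskets)
alert = gap > 0.05  # 5% alert threshold
print(f"gap = {gap:.1%}, alert = {alert}")
```

The same function can be reused for tenure or past purchase frequency by passing the corresponding per-customer values.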
5. Choose the observation window

How long do you observe behaviours after the send? Too short: you miss late effects. Too long: other marketing actions contaminate the measurement.
| Objective measured | Recommended window | Rationale |
| --- | --- | --- |
| First purchase (prospect) | 7–14 days | Purchase decision is fast for this segment |
| Loyalty programme enrolment | 14–30 days | Decision takes slightly longer |
| Dormant customer reactivation | 30–60 days | Customer needs time to return |
| Purchase frequency | 90–180 days | Requires observing multiple purchase cycles |
| LTV and long-term retention | 6–12 months | Effect is measured over duration |
6. Calculate the Uplift

The Uplift is the behavioural difference between the Exposed Population and the Non-Exposed Population. It is the key figure that says “thanks to Reelevant, X more happened.”
Uplift = Rate(Exposed Population) − Rate(Non-Exposed Population)

Example:
  Second-purchase rate (Exposed)     = 20.7%
  Second-purchase rate (Non-Exposed) = 15.3%
  Uplift = 20.7% − 15.3% = +5.4 points

  Relative Uplift = 5.4 / 15.3 × 100 = +35%
  → Reelevant generated 35% more second purchases
| Term | Meaning |
| --- | --- |
| Absolute Uplift | Difference in percentage points (+5.4 pts here) |
| Relative Uplift | Gain expressed as % relative to the Non-Exposed Population (+35% here) |
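As a quick check of the arithmetic, the two Uplift figures can be computed like this. The function name is illustrative; the rates are the ones from the example above.

```python
def uplift(rate_exposed, rate_non_exposed):
    """Return (absolute uplift in points, relative uplift in %) from rates in %."""
    absolute = rate_exposed - rate_non_exposed
    relative = absolute / rate_non_exposed * 100
    return absolute, relative


absolute_pts, relative_pct = uplift(20.7, 15.3)
print(f"{absolute_pts:+.1f} pts, {relative_pct:+.0f}%")  # +5.4 pts, +35%
```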

How to know if the Uplift is real or just chance?

A statistical test calculates the probability of observing a gap at least this large if Reelevant had no real effect. If this probability is below 5% (p-value < 0.05), the result is considered statistically significant.
Analogy: If you flip a coin 10 times and get 7 heads, it might be chance. If you flip it 10,000 times and get 70% heads, chance is no longer a credible explanation: the coin is biased. The same principle applies here.
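The guide does not name a specific test. One common choice for comparing two rates is the two-proportion z-test, sketched below with the standard library only. The counts are hypothetical, chosen to approximate the 20.7% and 15.3% rates of the example with a 90/10 split.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_exposed, n_exposed, conv_non_exposed, n_non_exposed):
    """Two-sided z-test for the difference between two conversion rates."""
    rate_exposed = conv_exposed / n_exposed
    rate_non_exposed = conv_non_exposed / n_non_exposed
    # Pooled rate under the null hypothesis of "no difference"
    pooled = (conv_exposed + conv_non_exposed) / (n_exposed + n_non_exposed)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_non_exposed))
    z = (rate_exposed - rate_non_exposed) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 7,825 of 37,800 exposed (about 20.7%)
# and 643 of 4,200 non-exposed (about 15.3%)
z, p_value = two_proportion_z_test(7825, 37800, 643, 4200)
print(f"z = {z:.2f}, significant = {p_value < 0.05}")
```

With populations of this size, a 5.4-point gap is far beyond what chance alone would plausibly produce. The same gap measured on a few hundred contacts might not pass the 0.05 threshold.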
7. Translate Uplift into euros

The behavioural Uplift (more clicks, more purchases, less churn) must be converted into financial value. This step makes results meaningful for leadership and the CFO.
Incremental Value =
  Number of incremental behaviours × Unit value of each behaviour

Incremental count = Uplift (as a fraction, e.g. 0.054) × Size of Exposed Population

Example:
  Uplift = +5.4 pts = 0.054
  Exposed Population size = 37,800 customers
  Incremental count = 0.054 × 37,800 = 2,041 additional converted customers
  Value per customer = €72 (incremental customer value of a one-timer
    converted to second purchase, accounting for projected retention rate)
  Incremental Value = 2,041 × €72 = €146,952
| CRM objective | Unit value to use | Example |
| --- | --- | --- |
| Loyalty enrolment | Annual spend(member) − Annual spend(non-member) | €215 − €130 = €85/member |
| Reactivation | Average revenue over 12 months post-return | €180 per reactivated customer |
| Churn reduction | Average annual revenue × estimated remaining lifetime | €150 × 2 years = €300 |
| One-timer → second purchase | Projected CLV × estimated retention rate | €72 per converted customer |
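The conversion from Uplift to euros can be sketched as a small function. Rounding the incremental count to whole customers is an assumption of this sketch, chosen to match the worked example in this step; the function name is illustrative.

```python
def incremental_value(uplift_pts, exposed_size, unit_value):
    """Return (incremental count, incremental value in euros)."""
    # Convert points to a fraction before multiplying by the population size
    count = round(uplift_pts / 100 * exposed_size)
    return count, count * unit_value


count, value = incremental_value(uplift_pts=5.4, exposed_size=37_800, unit_value=72)
print(count, value)  # 2041 146952
```

Swapping `unit_value` for another row of the table (for example 85 for loyalty enrolment) gives the Incremental Value for that objective.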
8. Document, interpret, and reproduce

A good test must be recorded in writing with its conditions, results, and limitations. This is a prerequisite for reproducing it and getting internal validation.

What you CAN affirm

  • Reelevant generated X additional behaviours on this population
  • This represents €Y of Incremental Value over the measured period
  • The effect is statistically significant (unlikely to be due to chance)

What you CANNOT affirm

  • That the effect will be the same on another segment or another period
  • That the effect will persist indefinitely without new personalisations
  • That unit values will remain unchanged over time
Key pitfalls to avoid:
  • Never confuse the Exposed Population with the clicking population. The unit of analysis is the customer who opened the email, not only those who clicked. Analysing only clickers means looking only at the most engaged and overestimates the effect.
  • Do not recalculate a segment’s status (e.g. one-timer, dormant) after the send date. The status must be frozen at the send date and never re-evaluated.
  • Do not draw conclusions from an Uplift with p-value ≥ 0.05. A non-significant result is not a null result; it is an inconclusive result.

Priority Segments

Each segment has its own logic: a question, an observation window, and a unit value for financial valuation.

One-timers

Question: “Did I convert one-timers into repeat buyers?”
Window: 90 days
Unit value: CLV × estimated retention rate (e.g. €72 per converted customer)

Prospects (first purchase)

Question: “Does Reelevant Content increase first-purchase rate on non-buyer contacts?”
Window: 7–14 days
Unit value: Average basket of first purchase

Dormant customers

Question: “Does the reactivation block re-engage customers inactive for 3+ months?”
Window: 30–60 days
Unit value: Average revenue over 12 months post-return (e.g. €180)

Non-members (loyalty)

Question: “Did I enrol non-members into the loyalty programme?”
Window: 14–30 days
Unit value: Annual spend(member) − Annual spend(non-member) (e.g. €215 − €130 = €85)

Reference Thresholds

| Situation | Threshold / Rule |
| --- | --- |
| Minimum Non-Exposed Population size | 1,700 contacts (base rate 4%, target Uplift +1.5 pts) |
| Minimum Exposed Population size | 15,000 contacts under the same conditions |
| Profile gap between populations | Alert if > 5% on average basket, frequency, or tenure |
| Statistical significance | p-value < 0.05 |
| Segment definition snapshot | Always frozen at the send date, never recalculated afterwards |

Three Levels of Measurement

You can analyse Reelevant’s impact at three levels of depth. Each level provides a different reading.
| Level | Name | Example | Pros | Cons |
| --- | --- | --- | --- | --- |
| 1 | Behaviour | “+5.4 pts second-purchase rate” | Quick to measure, easy to understand | Does not yet tell you how much it is worth in euros |
| 2 | Immediate revenue | “+€146,952 incremental revenue over 90 days” | Directly financial, defensible in steering committees | Does not capture effects on customer lifetime |
| 3 | Lifetime value (LTV) | “+€238,797 incremental CLV at 12 months” | Complete picture: 1.6× more value than revenue alone | Requires reliable historical data for projection |

How Reelevant’s Assignment Works

Unlike a classic A/B test where populations are defined before the send, Reelevant assigns at the moment of email open. When the server generates the Content in real time, a random draw determines whether the customer sees personalised blocks (90%) or the standard version (10%). This mechanism is statistically valid because:
  1. Stable assignment: A customer assigned to the Non-Exposed Population for a given send stays in that population for the entire experiment (via a persistent customer identifier).
  2. Independent of behaviour: The draw is independent of past behaviour — opens, profile, and purchase history do not influence the result.
  3. Law of large numbers: The 90/10 split is respected on average across the population. With 5,000+ individuals, both groups converge towards near-identical behavioural profiles, with no adjustment needed.
Analogy: Whether you draw all lottery tickets at once before the event or one by one as people arrive — if the draw is random, the statistical properties are identical.
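One common way to obtain an assignment that is both random-looking and stable is to hash the persistent customer identifier together with a send identifier. The sketch below illustrates that idea only. It is an assumption, not a description of Reelevant's actual implementation, and the function name, identifiers, and use of SHA-256 are all hypothetical.

```python
import hashlib


def assign_population(customer_id: str, send_id: str, exposed_share: float = 0.9) -> str:
    """Deterministically map a customer to a population for a given send."""
    digest = hashlib.sha256(f"{send_id}:{customer_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to [0, 1)
    draw = int.from_bytes(digest[:8], "big") / 2**64
    return "exposed" if draw < exposed_share else "non_exposed"


# The same inputs always give the same population, whenever the email is opened
first_open = assign_population("customer-123", "send-2024-06")
second_open = assign_population("customer-123", "send-2024-06")
assert first_open == second_open

# Across many hypothetical customers, the exposed share approaches 90%
populations = [assign_population(f"customer-{i}", "send-2024-06") for i in range(20_000)]
observed_share = populations.count("exposed") / len(populations)
print(f"{observed_share:.1%}")
```

Because the draw depends only on the identifiers, it is independent of opens, profile, and purchase history, and no assignment table needs to be stored to keep a customer in the same population.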