Consolidation Without Double-Counting

The golden rule: never add metrics that have a hierarchical relationship. A final metric already includes its component metrics.

Metric hierarchy — only add metrics at the same level

| Level | Metrics | Rule |
| --- | --- | --- |
| Level 3 (Final) | Incremental LTV | Synthetic metric: never cumulate with lower levels |
| Level 2 (Intermediate) | Incremental Annual Spend | Includes frequency and basket |
| Level 2 (Intermediate) | Incremental Retention / Lifetime | Includes churn reduction |
| Level 1 (Component) | Incremental purchase frequency | |
| Level 1 (Component) | Incremental average basket | |
| Level 1 (Component) | Incremental retention rate | |
| Level 1 (Component) | Reduced churn rate | |

Frequency and average basket compose Annual Spend. Annual Spend and customer lifetime compose LTV. Always choose the highest available level to avoid double-counting.
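
The hierarchy can be illustrated with a few lines of arithmetic. The sketch below is a minimal illustration, not a Reelevant formula: the function names and every figure are hypothetical, and it assumes the simplest multiplicative form (Annual Spend = frequency × basket, LTV = Annual Spend × lifetime in years).

```python
# Illustrative figures only; none of these numbers come from a real measurement.

def annual_spend(frequency: float, basket: float) -> float:
    """Level 2: purchases per year x average basket (EUR)."""
    return frequency * basket

def ltv(spend_per_year: float, lifetime_years: float) -> float:
    """Level 3: annual spend x expected customer lifetime (years)."""
    return spend_per_year * lifetime_years

# Hypothetical control and exposed groups.
control = {"frequency": 4.0, "basket": 50.0}
exposed = {"frequency": 4.5, "basket": 52.0}

# Level 2 increment: already contains the frequency and basket effects.
incremental_spend = annual_spend(**exposed) - annual_spend(**control)

# Level 1 increments, each valued with the other component held at control level.
from_frequency = (exposed["frequency"] - control["frequency"]) * control["basket"]
from_basket = (exposed["basket"] - control["basket"]) * control["frequency"]

# Wrong: stacking Level 1 values on top of the Level 2 value.
double_counted = incremental_spend + from_frequency + from_basket

# Level 3 increment, assuming a 3-year lifetime for both groups.
incremental_ltv = ltv(annual_spend(**exposed), 3.0) - ltv(annual_spend(**control), 3.0)

print(incremental_spend)  # 34.0
print(double_counted)     # 67.0, almost twice the real increment
print(incremental_ltv)    # 102.0
```

With these numbers the Level 1 increments (25 and 8) are already inside the Level 2 increment of 34, so adding them nearly doubles the reported value.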

Arbitration rules

| Situation | Metrics you CAN use | Metrics to EXCLUDE | Rationale |
| --- | --- | --- | --- |
| You measure incremental Annual Spend | Annual Spend, prospect conversion, loyalty enrolment | Frequency, average basket | Frequency × Basket = Annual Spend → double-counting |
| You measure incremental LTV | LTV by segment (if different segments) | Annual Spend, frequency, basket, retention, churn | All are included in LTV |
| Multiple independent Use Cases (different segments) | Sum of Incremental Values if populations are disjoint | Everything if populations overlap | Only add non-overlapping populations |
| You measure loyalty + retention on the same segment | One or the other, depending on data availability | Both simultaneously | A loyalty member is a retained customer: the metrics are correlated |

The Three Most Common Consolidation Errors

  1. Adding frequency + basket + Annual Spend: Frequency × Basket = Annual Spend, so Annual Spend already contains both components. Adding all three counts the same value more than once.
  2. Consolidating metrics from overlapping populations without verification: A customer may appear in multiple segments.
  3. Projecting annual value from a 30-day measurement without verifying that the effect is stable over the full year.
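
The second error can be caught with a simple overlap check before summing anything. The sketch below is one possible way to do it, assuming each segment is available as a set of customer IDs; the function name, segment names and IDs are hypothetical.

```python
from itertools import combinations

def overlapping_segments(segments: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return, for each pair of segments, the customer IDs present in both."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(segments.items(), 2):
        shared = ids_a & ids_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps

# Hypothetical segment membership, frozen at the send date.
segments = {
    "inactive_90_180d": {"c1", "c2", "c3"},
    "high_risk": {"c3", "c4"},
    "silver_members": {"c5", "c6"},
}

overlaps = overlapping_segments(segments)
addable = not overlaps  # values may be summed only if no pair overlaps

print(overlaps)  # {('inactive_90_180d', 'high_risk'): {'c3'}}
print(addable)   # False
```

Here customer c3 sits in two segments, so the Incremental Values of those two segments should not be added until the overlap is resolved.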

Consolidation Example

This fictitious example shows how to consolidate multiple CRM levers on distinct segments.
| CRM lever | Segment | Exposed pop. | Uplift | Unit value | Incremental Value | Level |
| --- | --- | --- | --- | --- | --- | --- |
| Prospect → Buyer | Email prospects | 20,000 | +1.3 pts | €115 (1st purchase) | €29,900 | N2 |
| Loyalty enrolment | Active customers | 50,000 | +2.5 pts | €30/year | €37,500 | N2 |
| Reactivation | Inactive 90–180d | 15,000 | +3.6 pts | €180 (12m) | €97,200 | N2 |
| Churn reduction | High-risk | 10,000 | −4.5 pts | €300 (residual LTV) | €135,000 | N2 |
| Status upgrade | Silver members | 30,000 | +3.3 pts | €80/year | €79,200 | N2 |
| TOTAL | 5 disjoint segments | 125,000 | | | €378,800 | |

Why these levers are addable: the five levers target distinct, non-overlapping segments (prospects, active customers, inactive customers, high-risk customers, Silver members). The metrics used are all at the same level (Level 2, intermediate, noted N2 in the table), with no double-counting against overall LTV.

| Summary | Value |
| --- | --- |
| Total Incremental Value | €378,800 |
| Average value / exposed customer | €3.03 |
| Scope | Email CRM, single channel (not omnichannel) |
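
The figures in this example can be recomputed as exposed population × uplift × unit value. The sketch below is a minimal check of that arithmetic, assuming the uplift in points is read as a percentage of the exposed population; the variable and function names are illustrative only.

```python
# (lever, exposed population, uplift in percentage points, unit value in EUR)
levers = [
    ("Prospect -> Buyer", 20_000, 1.3, 115),
    ("Loyalty enrolment", 50_000, 2.5, 30),
    ("Reactivation", 15_000, 3.6, 180),
    ("Churn reduction", 10_000, 4.5, 300),  # 4.5-pt churn reduction, entered as a magnitude
    ("Status upgrade", 30_000, 3.3, 80),
]

def incremental_value(population: int, uplift_pts: float, unit_value: float) -> float:
    """Incremental behaviours (population x uplift) valued at the unit value."""
    return population * uplift_pts / 100 * unit_value

# Round to the nearest euro to absorb floating-point noise.
values = {name: round(incremental_value(pop, pts, unit)) for name, pop, pts, unit in levers}

total_value = sum(values.values())
total_population = sum(pop for _, pop, _, _ in levers)
value_per_customer = total_value / total_population

print(values["Reactivation"])        # 97200
print(total_value)                   # 378800
print(round(value_per_customer, 2))  # 3.03
```

The sum is only legitimate because the five segments are disjoint and all levers are valued at the same level.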

What this table does NOT say

This table shows the estimated Incremental Value over a given period and segments. It does not say:
  • Whether this value recurs every year
  • Whether it is net of Reelevant costs
  • Whether it will persist without new exposures

Statistical Interpretation Mistakes

Mistake 1 — Confusing significance with importance

An Uplift of +0.1% can be statistically significant, especially on very large samples, and still be of negligible business interest. Always assess practical relevance alongside statistical significance.
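
To see how a tiny Uplift becomes significant on a large sample, here is a minimal sketch using a standard two-proportion z-test. The sample sizes, conversion counts and the 0.5-point business threshold are all hypothetical assumptions, not Reelevant defaults.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Two-sided z-test for a difference between two conversion rates.

    Returns (difference b - a, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p_b - p_a, p_value

# Hypothetical test: 2 million customers per group, 5.0% vs 5.1% conversion.
uplift, p_value = two_proportion_z_test(100_000, 2_000_000, 102_000, 2_000_000)

# Hypothetical business rule: below 0.5 points the effect is not worth acting on.
MIN_WORTHWHILE_UPLIFT = 0.005

statistically_significant = p_value < 0.05
practically_relevant = uplift >= MIN_WORTHWHILE_UPLIFT

print(statistically_significant)  # True
print(practically_relevant)       # False
```

The two checks answer different questions, which is why both are reported.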

Mistake 2 — Peeking (stopping early)

Stopping the test as soon as a positive result appears artificially inflates the false-positive rate. Define the observation window before starting and stick to it.
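
The inflation caused by peeking can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions: both groups have the same true conversion rate (so every significant result is a false positive), results are checked at five equally spaced interim looks, and a two-sided z-test is used. All parameter values are arbitrary.

```python
import math
import random

def p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value of a two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        return 1.0  # no variation observed, nothing to test
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))

def simulate(n_sims=400, n_per_arm=1000, looks=5, rate=0.10, alpha=0.05, seed=42):
    """Return (false-positive rate with one final look, rate when stopping at any look)."""
    rng = random.Random(seed)
    step = n_per_arm // looks
    fixed_hits = peeking_hits = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        significant_at_any_look = False
        for look in range(1, looks + 1):
            # Both arms share the same true rate: there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            p = p_value(conv_a, look * step, conv_b, look * step)
            if p < alpha:
                significant_at_any_look = True
        fixed_hits += p < alpha          # p is the final-look p-value here
        peeking_hits += significant_at_any_look
    return fixed_hits / n_sims, peeking_hits / n_sims

fixed_rate, peeking_rate = simulate()
print(f"single final look: {fixed_rate:.1%}, stop at first significant look: {peeking_rate:.1%}")
```

The fixed-window rate should stay close to the nominal 5%, while the peeking rate can only be equal or higher, since every test that is significant at the final look is also caught by the peeking rule.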

Mistake 3 — Multiple testing without correction

Running 20 tests at the 5% significance threshold means that, even if no real effect exists, 1 false positive is expected on average (for independent tests, the chance of at least one is about 64%). Apply Bonferroni or Benjamini-Hochberg corrections when testing multiple hypotheses simultaneously.
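
Both corrections are short enough to write out. The sketch below is a plain-Python illustration of the standard Bonferroni and Benjamini-Hochberg procedures; the p-values are made up for the example.

```python
def bonferroni(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Reject a hypothesis only if its p-value is at most alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values: list[float], alpha: float = 0.05) -> list[bool]:
    """Control the false discovery rate: find the largest rank k such that
    the k-th smallest p-value is at most k / m * alpha, then reject ranks 1..k."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected

# Expected false positives and chance of at least one, for 20 independent null tests.
expected_false_positives = 20 * 0.05
prob_at_least_one = 1 - 0.95 ** 20

# Hypothetical p-values from five simultaneous tests.
p_values = [0.041, 0.001, 0.20, 0.025, 0.008]

print(bonferroni(p_values))          # [False, True, False, False, True]
print(benjamini_hochberg(p_values))  # [False, True, False, True, True]
```

Bonferroni is the more conservative of the two; Benjamini-Hochberg accepts a controlled share of false discoveries in exchange for more power.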

Mistake 4 — Ignoring test assumptions

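The t-test assumes that the sample mean is approximately normally distributed, which holds for roughly normal data or for sufficiently large samples. For heavily skewed distributions (e.g. purchase frequency, LTV), especially on smaller samples, consider the Mann-Whitney test instead. Note that it compares the two distributions rather than their means.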
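
For reference, the Mann-Whitney test only uses the ranks of the observations, which is why outliers and skew affect it less. The sketch below is a stdlib-only illustration using the normal approximation with tie correction and no continuity correction; in practice a statistics library would normally be used, and the purchase counts are invented.

```python
import math

def mann_whitney_u(x: list[float], y: list[float]) -> tuple[float, float]:
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, p-value). The approximation is rough on small samples."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2          # average of ranks i+1 .. j
        t = j - i                           # size of this group of ties
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))

# Hypothetical purchase counts per customer, with one heavy buyer in each group.
control = [0, 0, 1, 1, 1, 2, 2, 3, 3, 15]
exposed = [1, 1, 2, 2, 3, 3, 4, 4, 5, 22]

u_stat, p_val = mann_whitney_u(control, exposed)
print(f"U = {u_stat}, p = {p_val:.3f}")
```

Because only ranks matter, the heavy buyers at 15 and 22 purchases weigh no more than any other top-ranked customer.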

Experimental Design Mistakes

Mistake 5 — Selection bias

If the assignment is not random, the groups can differ at baseline. Results then reflect customer profiles, not Reelevant’s effect.
With Reelevant’s at-open random assignment, selection bias is eliminated by design. However, if you retroactively filter populations (e.g. “only customers who clicked”), you reintroduce this bias.

Mistake 6 — Contamination

If Non-Exposed customers see Reelevant Content via another channel (e.g. web, push), the measured Uplift is artificially lowered. Document all channels where Reelevant blocks are active.

Mistake 7 — Confounding with other actions

If a promotion or external event affects one population more than the other during the measurement, the comparison is invalid. Record all concurrent marketing actions.

Mistake 8 — Survivor bias

Measuring only customers who opened the email means looking only at the most engaged. This overestimates the effect. The unit of analysis should include all recipients, not just openers, when possible.

Measurement Window Mistakes

Mistake 9 — Wrong observation window

| Pitfall | Example | Consequence |
| --- | --- | --- |
| Too short | Measuring reactivation over 7 days | Misses late responders; underestimates the effect |
| Too long | Measuring first purchase over 6 months | Other campaigns contaminate the result; overestimates the effect |
| Not defined a priori | Choosing the window after seeing the data | Cherry-picking the best result; inflates false positives |

Mistake 10 — Recalculating segment status post-send

The segment status (one-timer, dormant, Silver member) must be frozen at the send date. Never recalculate it after the send. A one-timer who purchases during the measurement becomes a repeat buyer — but they were a one-timer at the time of exposure.
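
One way to enforce this is to store an immutable snapshot of each customer's segment at send time and to read only that snapshot during analysis. The sketch below is a hypothetical illustration; the class, function and customer IDs are invented for the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)  # frozen: the snapshot cannot be modified after creation
class SegmentSnapshot:
    customer_id: str
    segment: str
    frozen_on: date

def freeze_segments(current_status: dict[str, str], send_date: date) -> dict[str, SegmentSnapshot]:
    """Copy each customer's segment as it stands on the send date."""
    return {cid: SegmentSnapshot(cid, seg, send_date) for cid, seg in current_status.items()}

# Live CRM status on the send date (hypothetical).
live_status = {"c1": "one-timer", "c2": "dormant"}
snapshot = freeze_segments(live_status, date(2024, 3, 1))

# During the measurement window, c1 buys again and the live status changes.
live_status["c1"] = "repeat buyer"

# The analysis still attributes c1 to the segment they belonged to when exposed.
print(snapshot["c1"].segment)  # one-timer
print(live_status["c1"])       # repeat buyer
```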

Valuation Mistakes

Mistake 11 — Using global averages instead of segment-specific values

The unit value used for financial valuation must correspond to the specific segment, not the overall customer base. A reactivated dormant customer does not have the same annual revenue as a loyal VIP.

Mistake 12 — Annualising from a short window

Projecting 30-day results to 12 months assumes the effect is constant. Verify stability over multiple periods before making annual projections.
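
A simple stability check makes the risk concrete. The sketch below compares a naive ×12 projection with what several consecutive 30-day windows actually show; the monthly values and the 20% tolerance are hypothetical assumptions.

```python
from statistics import mean

def is_stable(period_values: list[float], tolerance: float = 0.20) -> bool:
    """True if every period stays within `tolerance` (relative) of the average."""
    avg = mean(period_values)
    return all(abs(v - avg) / avg <= tolerance for v in period_values)

# Hypothetical Incremental Value measured over three consecutive 30-day windows (EUR).
monthly_incremental_value = [12_000, 9_000, 7_000]

naive_annual = monthly_incremental_value[0] * 12   # assumes month 1 repeats all year
observed_total = sum(monthly_incremental_value)    # what was actually measured
stable = is_stable(monthly_incremental_value)

print(naive_annual)    # 144000
print(observed_total)  # 28000
print(stable)          # False: the effect is decaying, so do not annualise month 1
```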

Quick Reference — What You CAN and CANNOT Affirm

What you CAN affirm

  • Reelevant generated X incremental behaviours on population Y during period Z
  • This represents €N of Incremental Value based on unit value V
  • The effect is statistically significant (p < 0.05, with confidence interval)
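
The confidence interval mentioned in the last point can be computed directly from the conversion counts. The sketch below uses the standard Wald interval for a difference of two proportions; the counts are hypothetical and the function name is illustrative.

```python
import math
from statistics import NormalDist

def uplift_confidence_interval(conv_control: int, n_control: int,
                               conv_exposed: int, n_exposed: int,
                               level: float = 0.95) -> tuple[float, float, float]:
    """Return (uplift, lower bound, upper bound) for exposed rate minus control rate."""
    p_c = conv_control / n_control
    p_e = conv_exposed / n_exposed
    se = math.sqrt(p_c * (1 - p_c) / n_control + p_e * (1 - p_e) / n_exposed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    diff = p_e - p_c
    return diff, diff - z * se, diff + z * se

# Hypothetical counts: 5.0% conversion in control, 6.3% in the exposed group.
uplift, lower, upper = uplift_confidence_interval(1_000, 20_000, 1_260, 20_000)

print(f"Uplift: {uplift * 100:.1f} pts, 95% CI: [{lower * 100:.2f}, {upper * 100:.2f}] pts")
```

An interval that excludes zero supports the significance claim, and its width shows how precisely the Uplift is known.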

What you CANNOT affirm

  • That the effect will be identical on a different segment or period
  • That the effect persists indefinitely without continued personalisation
  • That unit values are stable over time (recalculate annually)
  • That a 30-day measurement represents a full year’s impact