Consolidation Without Double-Counting
The golden rule: never add metrics that have a hierarchical relationship. A final metric already includes its component metrics.
Metric hierarchy — only add metrics at the same level
| Level | Metrics | Rule |
|---|---|---|
| Level 3 — Final | Incremental LTV | Synthetic metric — never cumulate with lower levels |
| Level 2 — Intermediate | Incremental Annual Spend | Includes frequency and basket |
| Level 2 — Intermediate | Incremental Retention / Lifetime | Includes churn reduction |
| Level 1 — Component | Incremental purchase frequency | |
| Level 1 — Component | Incremental average basket | |
| Level 1 — Component | Incremental retention rate | |
| Level 1 — Component | Reduced churn rate | |
Frequency and average basket compose Annual Spend. Annual Spend and customer lifetime compose LTV. Always choose the highest available level to avoid double-counting.
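The composition can be checked with a few lines of arithmetic. The sketch below uses hypothetical per-customer figures (4.0 vs 4.5 purchases per year, €50 vs €52 per basket) that are not taken from any real measurement. It shows that the incremental Annual Spend already contains the frequency and basket effects, so adding them on top inflates the result.

```python
# Hypothetical per-customer figures, for illustration only.
baseline = {"frequency": 4.0, "basket": 50.0}  # purchases per year, EUR per purchase
exposed = {"frequency": 4.5, "basket": 52.0}


def annual_spend(metrics):
    """Level 2 metric composed of the two Level 1 components."""
    return metrics["frequency"] * metrics["basket"]


# Correct figure: measure once, at the highest available level.
incremental_spend = annual_spend(exposed) - annual_spend(baseline)

# Each component valued in isolation (the other held at its baseline).
frequency_effect = (exposed["frequency"] - baseline["frequency"]) * baseline["basket"]
basket_effect = (exposed["basket"] - baseline["basket"]) * baseline["frequency"]

# Wrong figure: the components are added on top of the metric that contains them.
double_counted = incremental_spend + frequency_effect + basket_effect

print(f"Incremental Annual Spend: {incremental_spend:.2f}")
print(f"Frequency effect: {frequency_effect:.2f}, basket effect: {basket_effect:.2f}")
print(f"Sum across levels (double-counted): {double_counted:.2f}")
```

With these numbers, the frequency and basket effects (€25 and €8) account for almost all of the €34 incremental Annual Spend, and summing across levels nearly doubles the reported value.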
Arbitration rules
| Situation | Metrics you CAN use | Metrics to EXCLUDE | Rationale |
|---|---|---|---|
| You measure incremental Annual Spend | Annual Spend, prospect conversion, loyalty enrolment | Frequency, average basket | Frequency × Basket = Annual Spend → double-counting |
| You measure incremental LTV | LTV by segment (if different segments) | Annual Spend, frequency, basket, retention, churn | All are included in LTV |
| Multiple independent Use Cases (different segments) | Sum of Incremental Values if populations are disjoint | Any sum across segments if populations overlap | Only add non-overlapping populations |
| You measure loyalty + retention on the same segment | One or the other, depending on data availability | Both simultaneously | A loyalty member is a retained customer — correlated metrics |
The Three Most Common Consolidation Errors
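- Adding frequency + basket + Annual Spend: Frequency × Basket = Annual Spend, so the components are already contained in Annual Spend and the same value would be counted more than once.
- Consolidating metrics from overlapping populations without verification: A customer may appear in multiple segments, in which case their value is counted once per segment.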
- Projecting annual value from a 30-day measurement without verifying that the effect is stable over the full year.
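The overlap check in the second error can be automated before any sum is made. This is a minimal sketch, assuming each segment is available as a set of customer identifiers. The segment names, the IDs and the `find_overlaps` helper are hypothetical.

```python
from itertools import combinations


def find_overlaps(segments):
    """Return {(segment_a, segment_b): shared_ids} for each pair sharing customers."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(segments.items(), 2):
        shared = ids_a & ids_b
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Hypothetical segments: customer "c3" is both inactive and high-risk.
segments = {
    "inactive_90_180d": {"c1", "c2", "c3"},
    "high_risk": {"c3", "c4"},
    "silver_members": {"c5", "c6"},
}

overlaps = find_overlaps(segments)
if overlaps:
    print("Do not sum these segments as-is:", overlaps)
else:
    print("Populations are disjoint, values can be added.")
```

An empty result is the precondition for adding Incremental Values across segments. Any reported pair has to be deduplicated or measured as a single population first.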
Consolidation Example
This fictitious example shows how to consolidate multiple CRM levers on distinct segments.
| CRM lever | Segment | Exposed pop. | Uplift | Unit value | Incremental Value | Level |
|---|---|---|---|---|---|---|
| Prospect → Buyer | Email prospects | 20,000 | +1.3 pts | €115 (1st purchase) | €29,900 | Level 2 |
| Loyalty enrolment | Active customers | 50,000 | +2.5 pts | €30/year | €37,500 | Level 2 |
| Reactivation | Inactive 90–180d | 15,000 | +3.6 pts | €180 (12m) | €97,200 | Level 2 |
| Churn reduction | High-risk | 10,000 | −4.5 pts | €300 (residual LTV) | €135,000 | Level 2 |
| Status upgrade | Silver members | 30,000 | +3.3 pts | €80/year | €79,200 | Level 2 |
| TOTAL | 5 disjoint segments | 125,000 | | | €378,800 | |
Why these levers are addable: The five levers concern distinct, non-overlapping segments (prospects, active customers, inactive, high-risk, Silver members). The metrics used are all at the same level (Level 2, intermediate), so nothing is double-counted against overall LTV.
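Every row of the table is consistent with Incremental Value = exposed population × uplift × unit value. The sketch below recomputes the rows and the total from the table's own figures. `Decimal` is used only to keep the arithmetic exact, and the churn uplift is entered as a magnitude (4.5 points of churn avoided). The `incremental_value` helper is illustrative.

```python
from decimal import Decimal

# (lever, exposed population, uplift in points, unit value in EUR), from the table above.
levers = [
    ("Prospect -> Buyer", 20_000, Decimal("1.3"), 115),
    ("Loyalty enrolment", 50_000, Decimal("2.5"), 30),
    ("Reactivation", 15_000, Decimal("3.6"), 180),
    ("Churn reduction", 10_000, Decimal("4.5"), 300),  # magnitude of the -4.5 pt change
    ("Status upgrade", 30_000, Decimal("3.3"), 80),
]


def incremental_value(population, uplift_pts, unit_value):
    """Exposed population x uplift (converted from points) x unit value."""
    return population * (uplift_pts / 100) * unit_value


values = {name: incremental_value(pop, uplift, unit) for name, pop, uplift, unit in levers}
total = sum(values.values())
exposed_total = sum(pop for _, pop, _, _ in levers)
average_per_exposed = (total / exposed_total).quantize(Decimal("0.01"))

for name, value in values.items():
    print(f"{name}: EUR {value:,.0f}")
print(f"Total: EUR {total:,.0f} over {exposed_total:,} exposed")
print(f"Average per exposed customer: EUR {average_per_exposed}")
```

The sum is only meaningful because the five populations are disjoint. With overlapping segments the same code would run without error and return an inflated total.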
| Summary | Value |
|---|---|
| Total Incremental Value | €378,800 |
| Average value / exposed customer | €3.03 |
| Scope | Email CRM — single channel (not omnichannel) |
What this table does NOT say
This table shows the estimated Incremental Value for a given period and a given set of segments. It does not say:
- Whether this value recurs every year
- Whether it is net of Reelevant costs
- Whether it will persist without new exposures
Statistical Interpretation Mistakes
Mistake 1 — Confusing significance with importance
An Uplift of +0.1% can be statistically significant yet have negligible business value. Always assess practical relevance alongside statistical significance.
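To illustrate, the sketch below runs a standard two-proportion z-test on hypothetical counts (2,000,000 customers per group, 5.1% vs 5.0% conversion) and compares the observed uplift with a minimum relevant uplift. The counts and the `MIN_RELEVANT_UPLIFT` threshold are assumptions chosen for the example, not recommended values.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return rate_a - rate_b, p_value


# Hypothetical counts: exposed group first, non-exposed group second.
uplift, p_value = two_proportion_z_test(102_000, 2_000_000, 100_000, 2_000_000)

MIN_RELEVANT_UPLIFT = 0.005  # hypothetical business threshold: +0.5 point
is_significant = p_value < 0.05
is_relevant = uplift >= MIN_RELEVANT_UPLIFT

print(f"Uplift: {uplift:+.4f}, p-value: {p_value:.2g}")
print(f"Significant: {is_significant}, relevant for the business: {is_relevant}")
```

On samples this large, a +0.1 point difference clears the significance threshold easily while staying well below the relevance threshold, which is the situation this mistake describes.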
Mistake 2 — Peeking (stopping early)
Stopping the test as soon as a positive result appears artificially inflates the false-positive rate. Define the observation window before starting and stick to it.
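The inflation can be seen in a small simulation. The sketch below is a deliberately simplified model, not Reelevant's methodology. There is no true effect, each observation is a standardised exposed-minus-non-exposed difference with known unit variance, and the z-statistic is the running sum divided by the square root of the number of observations. The number of looks, the sample size and the seed are arbitrary assumptions.

```python
import math
import random


def false_positive_rates(n_sims=2000, n_obs=500, look_every=50, z_crit=1.96, seed=0):
    """Compare a single pre-planned analysis with repeated looks, under no true effect."""
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        running_sum = 0.0
        crossed_at_any_look = False
        for i in range(1, n_obs + 1):
            running_sum += rng.gauss(0.0, 1.0)  # pure noise: the null hypothesis is true
            if i % look_every == 0 and abs(running_sum) / math.sqrt(i) > z_crit:
                crossed_at_any_look = True  # a peeker would have stopped and declared a win
        if abs(running_sum) / math.sqrt(n_obs) > z_crit:
            fixed_hits += 1  # significance judged only at the pre-defined end
        if crossed_at_any_look:
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims


fixed_rate, peeking_rate = false_positive_rates()
print(f"False-positive rate, fixed window: {fixed_rate:.3f}")
print(f"False-positive rate, 10 interim looks: {peeking_rate:.3f}")
```

Under these assumptions the fixed-window rate stays near the nominal 5%, while checking at every interim look produces a markedly higher rate even though nothing real is happening.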
Mistake 3 — Multiple testing without correction
Running 20 tests at the 5% significance threshold yields 1 false positive on average even when none of the tested effects is real, and about a 64% chance of at least one if the tests are independent. Apply Bonferroni or Benjamini-Hochberg corrections when testing multiple hypotheses simultaneously.
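Both corrections fit in a few lines. The sketch below implements them in plain Python on a hypothetical list of p-values. Bonferroni controls the probability of any false positive, while Benjamini-Hochberg controls the expected share of false positives among the rejected hypotheses and is therefore less strict.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Reject the k smallest p-values, k being the largest rank with p_(k) <= k/m * alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from eight simultaneous tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]

uncorrected = [p < 0.05 for p in p_values]
print("Uncorrected:       ", sum(uncorrected), "significant")
print("Bonferroni:        ", sum(bonferroni(p_values)), "significant")
print("Benjamini-Hochberg:", sum(benjamini_hochberg(p_values)), "significant")
```

On this list, five results pass the uncorrected 5% threshold, two survive Benjamini-Hochberg and one survives Bonferroni.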
Mistake 4 — Ignoring test assumptions
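The t-test relies on the sample means being approximately normally distributed, which holds on large samples. For strongly skewed distributions (e.g. purchase frequency, LTV), particularly on small samples, use the Mann-Whitney test instead, keeping in mind that it compares ranks rather than means.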
Experimental Design Mistakes
Mistake 5 — Selection bias
If the assignment is not random, the groups can differ at baseline. Results then reflect customer profiles, not Reelevant’s effect.
With Reelevant’s at-open random assignment, selection bias is eliminated by design. However, if you retroactively filter populations (e.g. “only customers who clicked”), you reintroduce this bias.
Mistake 6 — Contamination
If Non-Exposed customers see Reelevant Content via another channel (e.g. web, push), the measured Uplift is artificially lowered. Document all channels where Reelevant blocks are active.
Mistake 7 — Confounding with other actions
If a promotion or external event affects one population more than the other during the measurement, the comparison is invalid. Record all concurrent marketing actions.
Mistake 8 — Survivor bias
Measuring only customers who opened the email means looking only at the most engaged. This overestimates the effect. The unit of analysis should include all recipients, not just openers, when possible.
Measurement Window Mistakes
Mistake 9 — Wrong observation window
| Pitfall | Example | Consequence |
|---|---|---|
| Too short | Measuring reactivation over 7 days | Misses late responders — underestimates the effect |
| Too long | Measuring first purchase over 6 months | Other campaigns contaminate the result — overestimates |
| Not defined a priori | Choosing the window after seeing the data | Cherry-picking the best result — inflates false positives |
Mistake 10 — Recalculating segment status post-send
The segment status (one-timer, dormant, Silver member) must be frozen at the send date. Never recalculate it after the send. A one-timer who purchases during the measurement becomes a repeat buyer — but they were a one-timer at the time of exposure.
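One way to enforce this is to read the status from a dated history as of the send date, and never from the customer's current record. The sketch below assumes such a history exists as a list of (start date, status) pairs. The `status_at` helper, the dates and the statuses are hypothetical.

```python
from datetime import date


def status_at(history, as_of):
    """Return the latest status whose start date is on or before `as_of`."""
    current = None
    for start, status in sorted(history):
        if start <= as_of:
            current = status
    return current


# Hypothetical customer: first purchase in January, second purchase after the send.
history = [
    (date(2024, 1, 10), "one-timer"),
    (date(2024, 3, 20), "repeat buyer"),
]
send_date = date(2024, 3, 1)

frozen_status = status_at(history, send_date)  # status used for the analysis
status_today = status_at(history, date(2024, 4, 1))  # must NOT be used for the analysis

print(f"Segment at send date: {frozen_status}")
print(f"Segment if recalculated later: {status_today}")
```

The customer is analysed as a one-timer, because that was their status at the time of exposure. The later purchase counts as an outcome, not as a reason to move them to another segment.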
Valuation Mistakes
Mistake 11 — Using global averages instead of segment-specific values
The unit value used for financial valuation must come from the specific segment, not from the overall customer base. A reactivated dormant customer does not generate the same annual revenue as a loyal VIP.
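The size of the error is easy to quantify. The sketch below reuses the Reactivation row of the consolidation example (15,000 exposed × 3.6 points = 540 incremental customers, €180 over 12 months) and compares it with a hypothetical global average of €320 per customer, a figure invented for the illustration.

```python
incremental_customers = 540  # 15,000 exposed x 3.6 points, from the Reactivation row
segment_unit_value = 180  # EUR over 12 months for a reactivated customer
global_average_value = 320  # EUR, hypothetical average over the whole customer base

value_with_segment = incremental_customers * segment_unit_value
value_with_global = incremental_customers * global_average_value
overstatement = value_with_global - value_with_segment

print(f"Segment-specific valuation: EUR {value_with_segment:,}")
print(f"Global-average valuation:   EUR {value_with_global:,}")
print(f"Overstatement:              EUR {overstatement:,}")
```

With these assumed figures, using the global average would overstate the lever by about 78% relative to the segment-specific value.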
Mistake 12 — Annualising from a short window
Projecting 30-day results to 12 months assumes the effect is constant. Verify stability over multiple periods before making annual projections.
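A simple guard is to refuse the projection unless several measured periods agree with each other. The sketch below is one possible stability check, assuming Incremental Value has been measured over several 30-day periods. The 20% tolerance, the helper names and the figures are all hypothetical.

```python
from statistics import mean


def is_stable(period_values, tolerance=0.20):
    """True if every period lies within `tolerance` (relative) of the mean."""
    average = mean(period_values)
    return all(abs(v - average) <= tolerance * abs(average) for v in period_values)


def annualise(period_values, periods_per_year=12):
    """Project to a year only when the measured periods are stable."""
    if not is_stable(period_values):
        raise ValueError("Effect is not stable across periods; do not annualise.")
    return mean(period_values) * periods_per_year


# Hypothetical Incremental Value (EUR) measured over three successive 30-day periods.
stable_effect = [10_000, 9_500, 10_500]
decaying_effect = [10_000, 6_000, 3_000]

print("Stable effect, annual projection:", annualise(stable_effect))
print("Decaying effect is stable:", is_stable(decaying_effect))
```

For the decaying series, multiplying the first period by 12 would give €120,000, whereas the three measured periods already show the effect fading. That is why the check comes before the projection.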
Quick Reference — What You CAN and CANNOT Affirm
What you CAN affirm
- Reelevant generated X incremental behaviours on population Y during period Z
- This represents €N of Incremental Value based on unit value V
- The effect is statistically significant (p < 0.05, with confidence interval)
What you CANNOT affirm
- That the effect will be identical on a different segment or period
- That the effect persists indefinitely without continued personalisation
- That unit values are stable over time (recalculate annually)
- That a 30-day measurement represents a full year’s impact