Time-decay Attribution and Causal Analysis#
The problem: Advertisers run campaigns across multiple channels (search, display, video) but struggle to know which ads actually drive sales. The standard approach, last-touch attribution (LTA), credits the final ad a customer saw before purchasing. This can systematically overcredit lower-funnel ads (search) that appear right before conversion, while undercrediting upper-funnel ads (display, video) that built awareness earlier. The result is misallocated budgets and underinvestment in channels with higher true ROI.
Two measurement approaches:
Time Decay Attribution (TDA): Reallocates LTA credit by modeling ad carryover, i.e., how impressions accumulate and decay over time (a minimal sketch of the adstock recursion follows below). If upper-funnel ads have longer carryover, they gain credit under the time-decay approach relative to LTA.
Causal Attribution (CA): Uses DoWhy to decompose YoY sales growth into driver-level contributions, quantifying how much each spend change contributed to the business outcome.
Together, these two approaches tell you which ad types influenced conversions beyond what rule-based attribution credits, and what actually drove growth.
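Both approaches lean on the same geometric adstock ("carryover") idea, so here is a minimal, illustrative sketch of the recursion before the real code. The notebook's own `apply_adstock` below is the same transform; the single-day spend pulse and carryover values here are made up for illustration.

```python
import numpy as np

def geometric_adstock(spend, carryover):
    """effective[t] = spend[t] + carryover * effective[t-1]"""
    effective = np.zeros_like(spend, dtype=float)
    effective[0] = spend[0]
    for t in range(1, len(spend)):
        effective[t] = spend[t] + carryover * effective[t - 1]
    return effective

pulse = np.array([100.0, 0, 0, 0, 0])            # one day of spend, then nothing
print(geometric_adstock(pulse, carryover=0.3))   # fast decay, search-like
print(geometric_adstock(pulse, carryover=0.85))  # slow decay, video-like
```

A channel with carryover 0.85 still carries most of yesterday's exposure into today, which is how upper-funnel ads keep earning credit days after the impression.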
[1]:
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import pandas as pd
from scipy import optimize
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit
from sklearn.metrics import mean_absolute_error
import warnings
warnings.filterwarnings('ignore')
from dowhy import gcm
from dowhy.gcm.util import plot as plot_graph
from dowhy.gcm.falsify import falsify_graph, apply_suggestions
1. Data Generation#
Synthetic daily data with four ad channels (SP, SB, Display, Video):
SP/SB: high LTA share (75% combined), lower iROI ($2.20-2.50), fast decay
Display/Video: low LTA share (25%), higher iROI ($3.80-4.50), slow decay
[2]:
def generate_advertising_data(start_date='2024-01-01', end_date='2025-12-31', seed=0):
np.random.seed(seed)
dates = pd.date_range(start=start_date, end=end_date, freq='D')
n_days = len(dates)
day_of_year = np.array([d.dayofyear for d in dates])
year = np.array([d.year for d in dates])
day_of_week = np.array([d.dayofweek for d in dates])
# Seasonality: ±15% weekly, ±30% yearly (peaks ~June)
weekly_pattern = 1 + 0.15 * np.sin(2 * np.pi * day_of_week / 7)
yearly_pattern = 1 + 0.3 * np.sin(2 * np.pi * (day_of_year - 80) / 365)
yoy_growth = np.where(year == 2025, 1.08, 1.0) # 8% YoY growth
# Events: with year-over-year variation
special_shopping_event = np.zeros(n_days)
other_shopping_event = np.zeros(n_days)
for i, d in enumerate(dates):
# === SPECIAL SHOPPING EVENTS ===
# Prime Day
if d.year == 2024 and d.month == 7 and 16 <= d.day <= 17:
special_shopping_event[i] = 1
if d.year == 2025 and d.month == 7 and 8 <= d.day <= 9:
special_shopping_event[i] = 1
# Black Friday week
if d.year == 2024 and d.month == 11 and 25 <= d.day <= 30: # Thanksgiving Nov 28
special_shopping_event[i] = 1
if d.year == 2025 and d.month == 11 and 24 <= d.day <= 29: # Thanksgiving Nov 27
special_shopping_event[i] = 1
# Holiday shopping season
if d.month == 12 and d.day <= 24:
special_shopping_event[i] = 0.7
# Cyber Monday extended
if d.year == 2024 and d.month == 12 and 1 <= d.day <= 2:
special_shopping_event[i] = 1
if d.year == 2025 and d.month == 11 and d.day == 30:
special_shopping_event[i] = 1
# === OTHER SHOPPING EVENTS ===
# Valentine's Day
if d.month == 2 and 7 <= d.day <= 14:
other_shopping_event[i] = 1
# Easter
if d.year == 2024 and d.month == 3 and 25 <= d.day <= 31: # Easter Mar 31
other_shopping_event[i] = 1
if d.year == 2025 and d.month == 4 and 14 <= d.day <= 20: # Easter Apr 20
other_shopping_event[i] = 1
# Mother's Day week
if d.year == 2024 and d.month == 5 and 6 <= d.day <= 12: # May 12
other_shopping_event[i] = 1
if d.year == 2025 and d.month == 5 and 5 <= d.day <= 11: # May 11
other_shopping_event[i] = 1
# Memorial Day
if d.year == 2024 and d.month == 5 and 24 <= d.day <= 27:
other_shopping_event[i] = 1
if d.year == 2025 and d.month == 5 and 23 <= d.day <= 26:
other_shopping_event[i] = 1
# Father's Day week
if d.year == 2024 and d.month == 6 and 10 <= d.day <= 16: # Jun 16
other_shopping_event[i] = 1
if d.year == 2025 and d.month == 6 and 9 <= d.day <= 15: # Jun 15
other_shopping_event[i] = 1
# Back to school
if d.month == 8 and 1 <= d.day <= 31:
other_shopping_event[i] = 0.7
# Labor Day
if d.year == 2024 and d.month == 9 and 1 <= d.day <= 2:
other_shopping_event[i] = 1
if d.year == 2025 and d.month == 9 and 1 <= d.day <= 1:
other_shopping_event[i] = 1
# Halloween
if d.month == 10 and 20 <= d.day <= 31:
other_shopping_event[i] = 0.7
# Super Bowl week
if d.year == 2024 and d.month == 2 and 5 <= d.day <= 11: # Feb 11
other_shopping_event[i] = 1
if d.year == 2025 and d.month == 2 and 3 <= d.day <= 9: # Feb 9
other_shopping_event[i] = 1
# Spend: SP/SB track demand closely; Display/Video more stable for brand building
base_spend_sp = 15000 * yoy_growth * weekly_pattern * yearly_pattern
base_spend_sb = 8000 * yoy_growth * weekly_pattern * yearly_pattern
base_spend_display = 12000 * yoy_growth * weekly_pattern * (yearly_pattern ** 0.5)
base_spend_video = 6000 * yoy_growth * weekly_pattern * (yearly_pattern ** 0.3)
event_multiplier = 1 + 0.5 * special_shopping_event + 0.2 * other_shopping_event
# Noise: upper-funnel has more variance
sp_spend = base_spend_sp * event_multiplier * (1 + 0.15 * np.random.randn(n_days))
sb_spend = base_spend_sb * event_multiplier * (1 + 0.12 * np.random.randn(n_days))
display_spend = base_spend_display * event_multiplier * (1 + 0.18 * np.random.randn(n_days))
video_spend = base_spend_video * event_multiplier * (1 + 0.20 * np.random.randn(n_days))
sp_spend = np.maximum(sp_spend, 100)
sb_spend = np.maximum(sb_spend, 50)
display_spend = np.maximum(display_spend, 100)
video_spend = np.maximum(video_spend, 50)
# Impressions via CPM: SP=$2.50, SB=$3.00, Display=$1.50, Video=$8.00
sp_impressions = sp_spend / 2.5 * 1000
sb_impressions = sb_spend / 3.0 * 1000
display_impressions = display_spend / 1.5 * 1000
video_impressions = video_spend / 8.0 * 1000
# Ground truth: upper-funnel has higher iROI but gets less LTA credit
iroi_sp, iroi_sb, iroi_display, iroi_video = 2.5, 2.2, 3.8, 4.5
carryover_sp, carryover_sb, carryover_display, carryover_video = 0.3, 0.4, 0.7, 0.85
lta_rate_sp, lta_rate_sb, lta_rate_display, lta_rate_video = 0.45, 0.30, 0.18, 0.07
# Adstock: effective_spend[t] = spend[t] + carryover * effective_spend[t-1]
def apply_adstock(x, carryover):
adstocked = np.zeros_like(x)
adstocked[0] = x[0]
for t in range(1, len(x)):
adstocked[t] = x[t] + carryover * adstocked[t-1]
return adstocked
sp_adstocked = apply_adstock(sp_spend, carryover_sp)
sb_adstocked = apply_adstock(sb_spend, carryover_sb)
display_adstocked = apply_adstock(display_spend, carryover_display)
video_adstocked = apply_adstock(video_spend, carryover_video)
# Total sales = organic + (iROI × adstocked_spend) + event_lift + noise
organic_baseline = 80000 * yoy_growth * weekly_pattern * yearly_pattern
incremental_from_sp = iroi_sp * sp_adstocked
incremental_from_sb = iroi_sb * sb_adstocked
incremental_from_display = iroi_display * display_adstocked
incremental_from_video = iroi_video * video_adstocked
event_lift = 30000 * special_shopping_event + 10000 * other_shopping_event
total_sales = (organic_baseline +
incremental_from_sp + incremental_from_sb +
incremental_from_display + incremental_from_video +
event_lift + 5000 * np.random.randn(n_days))
total_sales = np.maximum(total_sales, 10000)
# LTA: 60% of sales are ad-touched, distributed by LTA contribution rates (independent of true iROI)
ad_touched_sales = 0.6 * total_sales * (1 + 0.1 * np.random.randn(n_days))
lta_sp = np.maximum(lta_rate_sp * ad_touched_sales * (1 + 0.08 * np.random.randn(n_days)), 0)
lta_sb = np.maximum(lta_rate_sb * ad_touched_sales * (1 + 0.10 * np.random.randn(n_days)), 0)
lta_display = np.maximum(lta_rate_display * ad_touched_sales * (1 + 0.12 * np.random.randn(n_days)), 0)
lta_video = np.maximum(lta_rate_video * ad_touched_sales * (1 + 0.15 * np.random.randn(n_days)), 0)
df = pd.DataFrame({
'activity_date': dates,
'year': year,
'quarter': [d.quarter for d in dates],
'month': [d.month for d in dates],
'day_of_week': day_of_week,
'sp_spend': sp_spend, 'sb_spend': sb_spend,
'display_spend': display_spend, 'video_spend': video_spend,
'sp_impressions': sp_impressions, 'sb_impressions': sb_impressions,
'display_impressions': display_impressions, 'video_impressions': video_impressions,
'special_shopping_event': special_shopping_event,
'other_shopping_event': other_shopping_event,
'total_sales': total_sales,
'ad_touched_sales': ad_touched_sales,
'lta_sales_sp': lta_sp, 'lta_sales_sb': lta_sb,
'lta_sales_display': lta_display, 'lta_sales_video': lta_video,
})
ground_truth = {
'iroi': {'SP': iroi_sp, 'SB': iroi_sb, 'Display': iroi_display, 'Video': iroi_video},
'carryover': {'SP': carryover_sp, 'SB': carryover_sb, 'Display': carryover_display, 'Video': carryover_video},
'lta_rate': {'SP': lta_rate_sp, 'SB': lta_rate_sb, 'Display': lta_rate_display, 'Video': lta_rate_video}
}
return df, ground_truth
df, ground_truth = generate_advertising_data()
print(f"Data: {len(df)} days, {df['activity_date'].min().date()} to {df['activity_date'].max().date()}")
print(f"Special event days: {(df['special_shopping_event'] > 0).sum()} ({(df['special_shopping_event'] > 0).mean()*100:.1f}%)")
print(f"Other event days: {(df['other_shopping_event'] > 0).sum()} ({(df['other_shopping_event'] > 0).mean()*100:.1f}%)")
df.head()
Data: 731 days, 2024-01-01 to 2025-12-31
Special event days: 65 (8.9%)
Other event days: 161 (22.0%)
[2]:
| | activity_date | year | quarter | month | day_of_week | sp_spend | sb_spend | display_spend | video_spend | sp_impressions | ... | display_impressions | video_impressions | special_shopping_event | other_shopping_event | total_sales | ad_touched_sales | lta_sales_sp | lta_sales_sb | lta_sales_display | lta_sales_video |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2024-01-01 | 2024 | 1 | 1 | 0 | 13404.441671 | 5099.747577 | 8952.575520 | 3858.257279 | 5.361777e+06 | ... | 5.968384e+06 | 482282.159891 | 0.0 | 0.0 | 155454.198911 | 87939.951886 | 37143.728760 | 23460.738849 | 13858.642324 | 6293.988387 |
| 1 | 2024-01-02 | 2024 | 1 | 1 | 1 | 12573.576417 | 5940.908870 | 10486.819125 | 5283.133227 | 5.029431e+06 | ... | 6.991213e+06 | 660391.653381 | 0.0 | 0.0 | 219463.068045 | 145232.875195 | 52965.494493 | 44159.286689 | 30375.435747 | 14314.282119 |
| 2 | 2024-01-03 | 2024 | 1 | 1 | 2 | 13979.535290 | 5680.663153 | 11816.388289 | 7530.793229 | 5.591814e+06 | ... | 7.877592e+06 | 941349.153615 | 0.0 | 0.0 | 284656.682628 | 172537.129670 | 76023.283438 | 46311.842059 | 27777.802951 | 9154.003839 |
| 3 | 2024-01-04 | 2024 | 1 | 1 | 3 | 15161.875157 | 7865.616003 | 9500.538566 | 5050.795393 | 6.064750e+06 | ... | 6.333692e+06 | 631349.424145 | 0.0 | 0.0 | 313792.139323 | 236249.947675 | 107409.255926 | 85420.610730 | 53827.211856 | 20315.500984 |
| 4 | 2024-01-05 | 2024 | 1 | 1 | 4 | 12775.945666 | 3888.599153 | 9579.377571 | 6822.546123 | 5.110378e+06 | ... | 6.386252e+06 | 852818.265345 | 0.0 | 0.0 | 323851.281693 | 185840.293859 | 75548.541173 | 61245.391132 | 32032.104714 | 8354.883341 |
5 rows × 21 columns
[3]:
print("Ground Truth:")
print(f" iROI: {ground_truth['iroi']}")
print(f" Carryover: {ground_truth['carryover']}")
print(f" LTA rates: {ground_truth['lta_rate']}")
Ground Truth:
iROI: {'SP': 2.5, 'SB': 2.2, 'Display': 3.8, 'Video': 4.5}
Carryover: {'SP': 0.3, 'SB': 0.4, 'Display': 0.7, 'Video': 0.85}
LTA rates: {'SP': 0.45, 'SB': 0.3, 'Display': 0.18, 'Video': 0.07}
[4]:
df_2024 = df[df['year'] == 2024]
df_2025 = df[df['year'] == 2025]
print(f"YoY Change (2025 vs 2024):")
for col in ['total_sales', 'sp_spend', 'sb_spend', 'display_spend', 'video_spend']:
pct = (df_2025[col].mean() - df_2024[col].mean()) / df_2024[col].mean() * 100
print(f" {col}: {pct:+.1f}%")
YoY Change (2025 vs 2024):
total_sales: +8.2%
sp_spend: +5.9%
sb_spend: +8.4%
display_spend: +7.9%
video_spend: +7.1%
2. Time Decay MTA#
Reallocate LTA credit using an adstock transformation. Channels with higher carryover (Display/Video) accumulate more effective impressions over time and therefore gain credit relative to LTA.
[5]:
def apply_adstock(series, carryover):
"""Geometric adstock: result[t] = x[t] + carryover * result[t-1]"""
result = np.zeros_like(series, dtype=float)
result[0] = series[0]
for t in range(1, len(series)):
result[t] = series[t] + carryover * result[t-1]
return result
def build_adstocked_features(X, carryover_map):
"""Apply channel-specific carryover to build adstocked feature matrix."""
X_adstocked = pd.DataFrame(index=X.index)
for col in X.columns:
carry = carryover_map.get(col, 0.5)
X_adstocked[col] = apply_adstock(X[col].values, carry)
return X_adstocked
def time_decay_attribution(X_adstocked, y, lam=10.0):
"""Ridge regression on adstocked features, rescaled to sum to total LTA."""
X_arr = X_adstocked.values
X_mean = X_arr.mean(axis=0)
X_std = X_arr.std(axis=0) + 1e-8
X_scaled = (X_arr - X_mean) / X_std
model = Ridge(alpha=lam, fit_intercept=True)
model.fit(X_scaled, y.values)
coefs = model.coef_ / X_std
y_pred = model.predict(X_scaled)
contributions = {}
for i, col in enumerate(X_adstocked.columns):
contrib = coefs[i] * X_adstocked[col].values
contributions[col] = np.maximum(contrib, 0).sum()
total_contrib = sum(contributions.values())
if total_contrib > 0:
scale = y.sum() / total_contrib
contributions = {k: v * scale for k, v in contributions.items()}
return contributions, model, y_pred
def cv_objective(params, X, y, channels, carryover_bounds, lambda_bounds, n_splits=5):
"""Cross-validation objective: validation MAE + overfitting penalty + regularization on carryover."""
n_channels = len(channels)
carryover_map = {channels[i]: params[i] for i in range(n_channels)}
lam = params[n_channels]
# Boundary check
for carry in carryover_map.values():
if carry < carryover_bounds[0] or carry > carryover_bounds[1]:
return 1e10
if lam < lambda_bounds[0] or lam > lambda_bounds[1]:
return 1e10
X_adstocked = build_adstocked_features(X, carryover_map)
tscv = TimeSeriesSplit(n_splits=n_splits)
train_maes, val_maes = [], []
for train_idx, val_idx in tscv.split(X_adstocked):
X_train, X_val = X_adstocked.iloc[train_idx], X_adstocked.iloc[val_idx]
y_train, y_val = y.iloc[train_idx], y.iloc[val_idx]
# Standardize using train stats only
X_train_arr = X_train.values
X_mean = X_train_arr.mean(axis=0)
X_std = X_train_arr.std(axis=0) + 1e-8
X_train_scaled = (X_train_arr - X_mean) / X_std
model = Ridge(alpha=lam, fit_intercept=True)
model.fit(X_train_scaled, y_train.values)
y_train_pred = model.predict(X_train_scaled)
X_val_scaled = (X_val.values - X_mean) / X_std
y_val_pred = model.predict(X_val_scaled)
train_maes.append(mean_absolute_error(y_train, y_train_pred))
val_maes.append(mean_absolute_error(y_val, y_val_pred))
train_mae = np.mean(train_maes)
val_mae = np.mean(val_maes)
gap_penalty = max(0, val_mae - train_mae) / (train_mae + 1e-8)
carryover_penalty = sum(c**2 for c in carryover_map.values()) / n_channels
return val_mae + 0.1 * gap_penalty * val_mae + 0.05 * carryover_penalty * val_mae
def optimize_carryover(X, y, channels, carryover_bounds=(0.1, 0.95), lambda_bounds=(1, 100), maxiter=50):
"""Find optimal carryover rates via differential evolution."""
n_channels = len(channels)
bounds = [carryover_bounds] * n_channels + [lambda_bounds]
result = optimize.differential_evolution(
cv_objective,
bounds=bounds,
args=(X, y, channels, carryover_bounds, lambda_bounds),
seed=0,
maxiter=maxiter,
tol=1e-6,
disp=False,
workers=1
)
carryover_map = {channels[i]: result.x[i] for i in range(n_channels)}
lam_opt = result.x[n_channels]
return carryover_map, lam_opt
[6]:
channels = ['SP', 'SB', 'Display', 'Video']
X_impressions = df.set_index('activity_date')[['sp_impressions', 'sb_impressions', 'display_impressions', 'video_impressions']].copy()
X_impressions.columns = channels
y_lta_total = df.set_index('activity_date')[['lta_sales_sp', 'lta_sales_sb', 'lta_sales_display', 'lta_sales_video']].sum(axis=1)
print(f"X: {X_impressions.shape}, y: {y_lta_total.shape}")
X: (731, 4), y: (731,)
[7]:
carryover_opt, lam_opt = optimize_carryover(
X_impressions, y_lta_total, channels,
carryover_bounds=(0.1, 0.95),
lambda_bounds=(5, 100),
maxiter=50
)
print("Optimized carryover rates:")
for ch, carry in carryover_opt.items():
half_life = np.log(0.5) / np.log(carry) if carry > 0 else 0
true_carry = ground_truth['carryover'].get(ch, 'N/A')
print(f" {ch}: {carry:.3f} (half-life: {half_life:.1f}d) | actual: {true_carry}")
print(f"Lambda: {lam_opt:.2f}")
Optimized carryover rates:
SP: 0.270 (half-life: 0.5d) | actual: 0.3
SB: 0.163 (half-life: 0.4d) | actual: 0.4
Display: 0.768 (half-life: 2.6d) | actual: 0.7
Video: 0.803 (half-life: 3.2d) | actual: 0.85
Lambda: 31.33
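The half-life printed above is just the geometric carryover rate $c$ converted into days, i.e. the horizon at which the adstock weight falls to one half (this is the same formula used in the cell above):

$$c^{\,t_{1/2}} = \tfrac{1}{2} \quad\Rightarrow\quad t_{1/2} = \frac{\ln(1/2)}{\ln c}$$

For example, the optimized Display rate of 0.768 gives $t_{1/2} = \ln(0.5)/\ln(0.768) \approx 2.6$ days, matching the printout.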
[8]:
X_adstocked = build_adstocked_features(X_impressions, carryover_opt)
mta_contrib, _, _ = time_decay_attribution(X_adstocked, y_lta_total, lam_opt)
lta_totals = {
'SP': df['lta_sales_sp'].sum(),
'SB': df['lta_sales_sb'].sum(),
'Display': df['lta_sales_display'].sum(),
'Video': df['lta_sales_video'].sum()
}
total_lta = sum(lta_totals.values())
total_mta = sum(mta_contrib.values())
[9]:
print("LTA vs MTA Attribution:")
print(f"{'Channel':<10} {'LTA ($M)':<12} {'LTA %':<10} {'MTA ($M)':<12} {'MTA %':<10} {'Shift':<10}")
print("-" * 64)
for ch in channels:
lta_val = lta_totals[ch] / 1e6
lta_pct = lta_totals[ch] / total_lta * 100
mta_val = mta_contrib[ch] / 1e6
mta_pct = mta_contrib[ch] / total_mta * 100
shift = mta_pct - lta_pct
print(f"{ch:<10} {lta_val:<12.1f} {lta_pct:<10.1f} {mta_val:<12.1f} {mta_pct:<10.1f} {shift:+.1f}pp")
LTA vs MTA Attribution:
Channel LTA ($M) LTA % MTA ($M) MTA % Shift
----------------------------------------------------------------
SP 106.9 45.1 30.3 12.8 -32.3pp
SB 71.2 30.0 28.8 12.1 -17.9pp
Display 42.5 17.9 90.7 38.2 +20.3pp
Video 16.6 7.0 87.4 36.8 +29.8pp
[10]:
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
x = np.arange(len(channels))
width = 0.35
ax1 = axes[0]
lta_pcts = [lta_totals[c]/total_lta*100 for c in channels]
mta_pcts = [mta_contrib[c]/total_mta*100 for c in channels]
ax1.bar(x - width/2, lta_pcts, width, label='Last Touch', color='tab:red')
ax1.bar(x + width/2, mta_pcts, width, label='Time Decay MTA', color='tab:blue')
ax1.set_xticks(x)
ax1.set_xticklabels(channels)
ax1.set_ylabel('Share (%)')
ax1.set_title('LTA vs MTA Attribution')
ax1.legend()
ax1.set_ylim(0, 55)
ax2 = axes[1]
shifts = [mta_pcts[i] - lta_pcts[i] for i in range(len(channels))]
colors = ['tab:green' if s > 0 else 'tab:red' for s in shifts]
ax2.bar(channels, shifts, color=colors)
ax2.axhline(y=0, color='black', linewidth=0.5)
ax2.set_ylabel('Credit Shift (pp)')
ax2.set_title('MTA - LTA Shift')
plt.tight_layout()
plt.show()
3. Causal Attribution (DoWhy)#
Decompose YoY sales growth into driver contributions using DoWhy GCM:
Build DAG: spend variables, other drivers → sale
Fit causal model
Attribute mean change via Shapley values
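Roughly, `gcm.distribution_change` assigns each node $j$ a Shapley contribution $\phi_j$ for swapping its causal mechanism from the old period to the new one, and (up to Monte Carlo error, as the output note later points out) the contributions add up to the observed change in the target's mean:

$$\mathbb{E}_{\text{2025 H1}}[\text{sale}] - \mathbb{E}_{\text{2024 H1}}[\text{sale}] \approx \sum_{j} \phi_j$$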
3.1 With Raw Spend#
This is the default approach: fit the model directly on observed spend, without any time-series transformation.
[11]:
df_causal = df[['activity_date', 'year', 'quarter',
'sp_spend', 'sb_spend', 'display_spend', 'video_spend',
'special_shopping_event', 'other_shopping_event',
'total_sales']].copy()
df_causal.columns = ['activity_date', 'year', 'quarter',
'sp_spend', 'sb_spend', 'display_spend', 'video_spend',
'special_shopping_event', 'other_shopping_event', 'sale']
print(f"Causal data: {len(df_causal)} rows")
Causal data: 731 rows
[12]:
# Build graph matching DGP structure
day_of_year = df['activity_date'].dt.dayofyear
df_causal['yearly_seasonality'] = 1 + 0.3 * np.sin(2 * np.pi * (day_of_year - 80) / 365)
df_causal['weekly_seasonality'] = 1 + 0.15 * np.sin(2 * np.pi * df['activity_date'].dt.dayofweek / 7)
edges = []
# Spend → sale
for col in ['sp_spend', 'sb_spend', 'display_spend', 'video_spend']:
edges.append((col, 'sale'))
# Seasonality → sale
edges.append(('yearly_seasonality', 'sale'))
edges.append(('weekly_seasonality', 'sale'))
# Seasonality → all spend
for col in ['sp_spend', 'sb_spend', 'display_spend', 'video_spend']:
edges.append(('yearly_seasonality', col))
edges.append(('weekly_seasonality', col))
# Events → sale
edges.append(('special_shopping_event', 'sale'))
edges.append(('other_shopping_event', 'sale'))
# Events → all spend
for col in ['sp_spend', 'sb_spend', 'display_spend', 'video_spend']:
edges.append(('special_shopping_event', col))
edges.append(('other_shopping_event', col))
causal_graph = nx.DiGraph(edges)
plot_graph(causal_graph)
[13]:
# Graph validation
RUN_GRAPH_VALIDATION = True
if RUN_GRAPH_VALIDATION:
validation_data = df_causal[['sp_spend','sb_spend','display_spend','video_spend',
'yearly_seasonality', 'weekly_seasonality','special_shopping_event', 'other_shopping_event','sale']]
result = falsify_graph(causal_graph, validation_data, suggestions=True, n_permutations=20)
print(result)
causal_graph = apply_suggestions(causal_graph, result)
print(f"\nRevised graph edges: {list(causal_graph.edges())}")
else:
print("Graph validation skipped")
Test permutations of given graph: 100%|██████████| 20/20 [00:20<00:00, 1.03s/it]
+-------------------------------------------------------------------------------------------------------+
| Falsification Summary |
+-------------------------------------------------------------------------------------------------------+
| The given DAG is informative because 0 / 20 of the permutations lie in the Markov |
| equivalence class of the given DAG (p-value: 0.00). |
| The given DAG violates 12/24 LMCs and is better than 100.0% of the permuted DAGs (p-value: 0.00). |
| Based on the provided significance level (0.05) and because the DAG is informative, |
| we do not reject the DAG. |
+-------------------------------------------------------------------------------------------------------+
| Suggestions |
+-------------------------------------------------------------------------------------------------------+
| Causal Minimality | - Remove edge weekly_seasonality --> sale |
+-------------------------------------------------------------------------------------------------------+
Revised graph edges: [('sp_spend', 'sale'), ('sb_spend', 'sale'), ('display_spend', 'sale'), ('video_spend', 'sale'), ('yearly_seasonality', 'sale'), ('yearly_seasonality', 'sp_spend'), ('yearly_seasonality', 'sb_spend'), ('yearly_seasonality', 'display_spend'), ('yearly_seasonality', 'video_spend'), ('weekly_seasonality', 'sp_spend'), ('weekly_seasonality', 'sb_spend'), ('weekly_seasonality', 'display_spend'), ('weekly_seasonality', 'video_spend'), ('special_shopping_event', 'sale'), ('special_shopping_event', 'sp_spend'), ('special_shopping_event', 'sb_spend'), ('special_shopping_event', 'display_spend'), ('special_shopping_event', 'video_spend'), ('other_shopping_event', 'sale'), ('other_shopping_event', 'sp_spend'), ('other_shopping_event', 'sb_spend'), ('other_shopping_event', 'display_spend'), ('other_shopping_event', 'video_spend')]
[14]:
causal_model = gcm.StructuralCausalModel(causal_graph)
gcm.auto.assign_causal_mechanisms(causal_model, df_causal, quality=gcm.auto.AssignmentQuality.BETTER)
gcm.fit(causal_model, df_causal)
Fitting causal mechanism of node other_shopping_event: 100%|██████████| 9/9 [00:00<00:00, 39.20it/s]
[15]:
# Define growth period for comparison
df_new = df_causal[(df_causal['year'] == 2025) & (df_causal['quarter'].isin([1, 2]))].copy()
df_old = df_causal[(df_causal['year'] == 2024) & (df_causal['quarter'].isin([1, 2]))].copy()
mean_old = df_old['sale'].mean()
mean_new = df_new['sale'].mean()
total_change = mean_new - mean_old
print(f"Comparing: 2025 H1 ({len(df_new)}d) vs 2024 H1 ({len(df_old)}d)")
print(f"Mean daily sales: 2024=${mean_old:,.0f}, 2025=${mean_new:,.0f}")
print(f"Daily change: ${total_change:,.0f}")
Comparing: 2025 H1 (181d) vs 2024 H1 (182d)
Mean daily sales: 2024=$512,443, 2025=$558,526
Daily change: $46,083
[16]:
# Decompose mean change via Shapley values
# Can swap np.mean for np.median or np.var to decompose those instead
median_contribs, uncertainty = gcm.confidence_intervals(
lambda: gcm.distribution_change(
causal_model, df_old, df_new, 'sale',
num_samples=2000,
difference_estimation_func=lambda x, y: np.mean(y) - np.mean(x)
),
confidence_level=0.95,
num_bootstrap_resamples=20
)
Evaluating set functions...: 100%|██████████| 154/154 [00:03<00:00, 42.84it/s]
Evaluating set functions...: 100%|██████████| 147/147 [00:03<00:00, 43.82it/s]
Evaluating set functions...: 100%|██████████| 149/149 [00:03<00:00, 44.56it/s]
Evaluating set functions...: 100%|██████████| 147/147 [00:03<00:00, 44.41it/s]
Evaluating set functions...: 100%|██████████| 156/156 [00:03<00:00, 43.23it/s]
Evaluating set functions...: 100%|██████████| 141/141 [00:03<00:00, 44.81it/s]
Evaluating set functions...: 100%|██████████| 148/148 [00:03<00:00, 43.10it/s]
Evaluating set functions...: 100%|██████████| 151/151 [00:03<00:00, 44.70it/s]
Evaluating set functions...: 100%|██████████| 154/154 [00:03<00:00, 43.08it/s]
Evaluating set functions...: 100%|██████████| 144/144 [00:03<00:00, 42.51it/s]
Evaluating set functions...: 100%|██████████| 148/148 [00:03<00:00, 43.79it/s]
Evaluating set functions...: 100%|██████████| 152/152 [00:03<00:00, 42.78it/s]
Evaluating set functions...: 100%|██████████| 143/143 [00:03<00:00, 45.63it/s]
Evaluating set functions...: 100%|██████████| 146/146 [00:03<00:00, 43.83it/s]
Evaluating set functions...: 100%|██████████| 145/145 [00:03<00:00, 41.30it/s]
Evaluating set functions...: 100%|██████████| 151/151 [00:03<00:00, 42.98it/s]
Evaluating set functions...: 100%|██████████| 148/148 [00:03<00:00, 42.94it/s]
Evaluating set functions...: 100%|██████████| 143/143 [00:03<00:00, 44.90it/s]
Evaluating set functions...: 100%|██████████| 146/146 [00:03<00:00, 43.09it/s]
Evaluating set functions...: 100%|██████████| 144/144 [00:03<00:00, 36.62it/s]
Estimating bootstrap interval...: 100%|██████████| 20/20 [01:30<00:00, 4.54s/it]
[17]:
growth_rate = (mean_new - mean_old) / mean_old * 100
print("Causal Attribution to YoY Sales Change:")
print(f"{'Driver':<20} {'Contrib $':<12} {'% of Total':<12} {'% of Growth':<12} {'iROAS':<10} {'95% CI':<22}")
print("-" * 98)
sum_contribs = 0
total_abs_contrib = sum(abs(v) for v in median_contribs.values())
spend_changes = {}
for col in ['sp_spend', 'sb_spend', 'display_spend', 'video_spend']:
spend_changes[col] = df_new[col].mean() - df_old[col].mean()
for node in sorted(median_contribs.keys(), key=lambda x: -abs(median_contribs[x])):
val = median_contribs[node]
sum_contribs += val
pct_total = (val / total_abs_contrib * 100) if total_abs_contrib > 0 else 0
pct_growth = (val / mean_old * 100)
lb, ub = uncertainty[node]
sig = "*" if (lb > 0 or ub < 0) else ""
if node in spend_changes:
spend_chg = spend_changes[node]
iroas = val / spend_chg if spend_chg != 0 else 0
print(f"{node:<20} ${val:>10,.0f} {pct_total:>10.1f}% {pct_growth:>10.2f}% ${iroas:>7.2f} {sig:<2} [{lb:>8,.0f}, {ub:>8,.0f}]")
else:
print(f"{node:<20} ${val:>10,.0f} {pct_total:>10.1f}% {pct_growth:>10.2f}% {'N/A':>8} {sig:<2} [{lb:>8,.0f}, {ub:>8,.0f}]")
print("-" * 98)
print(f"{'Sum of contributions':<20} ${sum_contribs:>10,.0f} {'':>11} {sum_contribs/mean_old*100:>10.2f}%")
print(f"{'Actual mean change':<20} ${total_change:>10,.0f} {'':>11} {growth_rate:>10.2f}%")
print(f"\nKPI summary (daily sales):")
print(f" 2024 H1: mean=${mean_old:,.0f}, median=${df_old['sale'].median():,.0f}")
print(f" 2025 H1: mean=${mean_new:,.0f}, median=${df_new['sale'].median():,.0f}")
print(f" YoY growth: {growth_rate:.2f}% (mean)")
print(f"\n* = CI excludes 0")
print(f"% of Growth = contribution / baseline mean")
print(f"iROAS = sales contribution / spend change")
print(f"'sale' row = change attributed to the sale node's own mechanism (organic growth, noise)")
print(f"Note: Sum ≠ actual due to Monte Carlo sampling in Shapley estimation. Increase num_samples/num_bootstrap_resamples to reduce gap.")
Causal Attribution to YoY Sales Change:
Driver Contrib $ % of Total % of Growth iROAS 95% CI
--------------------------------------------------------------------------------------------------
sale $ 33,194 73.3% 6.48% N/A * [ 32,110, 34,160]
sb_spend $ 5,374 11.9% 1.05% $ 5.89 * [ 4,410, 6,080]
display_spend $ 2,849 6.3% 0.56% $ 3.09 * [ 1,738, 3,867]
sp_spend $ 2,319 5.1% 0.45% $ 2.86 * [ 1,367, 3,318]
video_spend $ 1,398 3.1% 0.27% $ 2.85 * [ 673, 2,151]
weekly_seasonality $ 64 0.1% 0.01% N/A [ -890, 1,274]
yearly_seasonality $ 58 0.1% 0.01% N/A [ -1,183, 950]
special_shopping_event $ 24 0.1% 0.00% N/A [ -1,078, 1,310]
other_shopping_event $ -15 -0.0% -0.00% N/A [ -1,135, 608]
--------------------------------------------------------------------------------------------------
Sum of contributions $ 45,266 8.83%
Actual mean change $ 46,083 8.99%
KPI summary (daily sales):
2024 H1: mean=$512,443, median=$526,496
2025 H1: mean=$558,526, median=$556,330
YoY growth: 8.99% (mean)
* = CI excludes 0
% of Growth = contribution / baseline mean
iROAS = sales contribution / spend change
'sale' row = change attributed to the sale node's own mechanism (organic growth, noise)
Note: Sum ≠ actual due to Monte Carlo sampling in Shapley estimation. Increase num_samples/num_bootstrap_resamples to reduce gap.
[18]:
fig, ax = plt.subplots(figsize=(10, 6))
drivers = list(median_contribs.keys())
contribs = [median_contribs[d] for d in drivers]
sorted_idx = np.argsort(contribs)[::-1]
drivers = [drivers[i] for i in sorted_idx]
contribs = [contribs[i] for i in sorted_idx]
colors = ['tab:green' if c > 0 else 'tab:red' for c in contribs]
ax.barh(drivers, contribs, color=colors)
ax.axvline(x=0, color='black', linewidth=0.5)
ax.set_xlabel('Contribution to Daily Sales Change ($)')
ax.set_title('Causal Attribution: H1 2025 vs H1 2024')
ax.invert_yaxis()
plt.tight_layout()
plt.show()
[19]:
print("iROI Estimation (via intervention):")
print(f"{'Channel':<15} {'Est. iROI':<15} {'Actual iROI':<15}")
print("-" * 45)
for spend_var, ch in [('sp_spend', 'SP'), ('sb_spend', 'SB'),
('display_spend', 'Display'), ('video_spend', 'Video')]:
step = 1000
effect = gcm.average_causal_effect(
causal_model=causal_model,
target_node='sale',
interventions_alternative={spend_var: lambda x, s=step: x + s},
interventions_reference={spend_var: lambda x: x},
num_samples_to_draw=10000
)
iroi = effect / step
actual_iroi = ground_truth['iroi'][ch]
print(f"{ch:<15} ${iroi:<14.2f} ${actual_iroi:<14.2f}")
iROI Estimation (via intervention):
Channel Est. iROI Actual iROI
---------------------------------------------
SP $2.09 $2.50
SB $8.42 $2.20
Display $3.96 $3.80
Video $4.14 $4.50
The iROI estimates differ from ground truth. Possible reasons:
The DAG uses raw spend, but the DGP applies adstock before computing sales
The auto-assigned mechanisms may not capture the adstock transform
Spend is highly correlated across channels (they all track demand seasonality and event spikes), which makes individual effects hard to separate; in real data, unmeasured confounders (competitor activity, macro factors, etc.) would add to this
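One quick, hypothetical diagnostic for the collinearity point (not part of the original analysis): check the pairwise correlations of the raw spend series. Values close to 1 mean the model has little leverage to separate per-channel effects.

```python
# Pairwise correlation of the raw daily spend series (diagnostic sketch).
spend_cols = ['sp_spend', 'sb_spend', 'display_spend', 'video_spend']
print(df_causal[spend_cols].corr().round(2))
```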
3.2 With Adstocked Spend#
Section 3.1 used raw spend as inputs, but our DGP generates sales from adstocked spend, where each day’s effect accumulates from prior days’ spending. This mismatch could explain the iROI gaps above.
To test this, we transform spend using ground truth carryover rates and refit:
[20]:
# Create adstocked spend columns using ground truth carryover rates
def apply_adstock(x, carryover):
adstocked = np.zeros_like(x)
adstocked[0] = x[0]
for t in range(1, len(x)):
adstocked[t] = x[t] + carryover * adstocked[t-1]
return adstocked
df_causal_adstocked = df_causal.copy()
df_causal_adstocked['sp_spend'] = apply_adstock(df_causal['sp_spend'].values, 0.3)
df_causal_adstocked['sb_spend'] = apply_adstock(df_causal['sb_spend'].values, 0.4)
df_causal_adstocked['display_spend'] = apply_adstock(df_causal['display_spend'].values, 0.7)
df_causal_adstocked['video_spend'] = apply_adstock(df_causal['video_spend'].values, 0.85)
# Rebuild causal model with adstocked data
causal_model_adstocked = gcm.StructuralCausalModel(causal_graph)
gcm.auto.assign_causal_mechanisms(causal_model_adstocked, df_causal_adstocked, quality=gcm.auto.AssignmentQuality.BETTER)
gcm.fit(causal_model_adstocked, df_causal_adstocked)
# Test iROI
print("iROI with adstocked spend:")
print(f"{'Channel':<15} {'Est. iROI':<15} {'Actual iROI':<15}")
print("-" * 45)
for spend_var, ch in [('sp_spend', 'SP'), ('sb_spend', 'SB'),
('display_spend', 'Display'), ('video_spend', 'Video')]:
step = 1000
effect = gcm.average_causal_effect(
causal_model=causal_model_adstocked,
target_node='sale',
interventions_alternative={spend_var: lambda x, s=step: x + s},
interventions_reference={spend_var: lambda x: x},
num_samples_to_draw=10000
)
iroi = effect / step
actual_iroi = ground_truth['iroi'][ch]
print(f"{ch:<15} ${iroi:<14.2f} ${actual_iroi:<14.2f}")
Fitting causal mechanism of node other_shopping_event: 100%|██████████| 9/9 [00:00<00:00, 40.08it/s]
iROI with adstocked spend:
Channel Est. iROI Actual iROI
---------------------------------------------
SP $2.58 $2.50
SB $5.62 $2.20
Display $3.95 $3.80
Video $3.40 $4.50
SP and SB improved the most (SB is still inflated, but much closer than before) because these are the channels with highly correlated spend in the DGP: both track demand seasonality and spike during events. We generated the data that way to mimic real-world investment patterns.
When inputs are highly correlated, the model struggles to separate individual effects; that is why SB showed the largest gap before: it was absorbing variance from the other channels. Applying different carryover rates transforms each channel differently, which reduces the correlation and lets the model distinguish their contributions.
In contrast, Display and Video were already close because their spend is more stable and less correlated with the other channels (their seasonality is dampened in the DGP), so the model worked reasonably well for them even without the transform.
Real data doesn’t come with known carryover rates, so this is more of a sanity check on the DGP than a practical approach.
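One practical variant (a sketch under that caveat, not run here) is to reuse the carryover rates estimated by the time-decay optimization in Section 2 (`carryover_opt`) instead of ground truth when building the adstocked causal inputs:

```python
# Sketch: adstock the causal inputs with the Section 2 estimates rather than ground truth.
# Assumes `carryover_opt` (from optimize_carryover) and `df_causal` are still in scope.
estimated_carryover = {
    'sp_spend': carryover_opt['SP'],
    'sb_spend': carryover_opt['SB'],
    'display_spend': carryover_opt['Display'],
    'video_spend': carryover_opt['Video'],
}
df_causal_est = df_causal.copy()
for col, carry in estimated_carryover.items():
    df_causal_est[col] = apply_adstock(df_causal[col].values, carry)
# df_causal_est can then go through the same GCM fitting and intervention steps as above.
```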
4. Summary#
For reporting: LTA reflects what most attribution systems currently credit, MTA shows what the credit split would look like with carryover modeled, and causal attribution estimates actual business impact. When presenting to stakeholders, you can show all three: the gap between LTA and MTA/causal highlights where budget-reallocation opportunities exist.
[21]:
print("=" * 70)
print("Summary: LTA vs MTA vs Incrementality")
print("=" * 70)
print(f"{'Channel':<10} {'LTA %':<12} {'MTA %':<12} {'Shift':<12} {'Actual iROI':<12}")
print("-" * 58)
for ch in channels:
lta_pct = lta_totals[ch] / total_lta * 100
mta_pct = mta_contrib[ch] / total_mta * 100
shift = mta_pct - lta_pct
iroi = ground_truth['iroi'][ch]
print(f"{ch:<10} {lta_pct:<12.1f} {mta_pct:<12.1f} {shift:+11.1f}pp ${iroi:<11.2f}")
print("-" * 58)
print(f"\nSP+SB LTA share: {(lta_totals['SP']+lta_totals['SB'])/total_lta*100:.0f}%")
print(f"SP+SB MTA share: {(mta_contrib['SP']+mta_contrib['SB'])/total_mta*100:.0f}%")
print(f"Display+Video gain: {(mta_contrib['Display']+mta_contrib['Video'])/total_mta*100 - (lta_totals['Display']+lta_totals['Video'])/total_lta*100:+.0f}pp")
======================================================================
Summary: LTA vs MTA vs Incrementality
======================================================================
Channel LTA % MTA % Shift Actual iROI
----------------------------------------------------------
SP 45.1 12.8 -32.3pp $2.50
SB 30.0 12.1 -17.9pp $2.20
Display 17.9 38.2 +20.3pp $3.80
Video 7.0 36.8 +29.8pp $4.50
----------------------------------------------------------
SP+SB LTA share: 75%
SP+SB MTA share: 25%
Display+Video gain: +50pp