Working paper · HTE Analysis · 2026-05-28

For whom does scientific-method training work — and why?

Heterogeneous Treatment Effects · GenericML · BLP + GATES + CLAN · Multi-site RCT (N=1,187)

The story we are building

Average treatment effect on performance is statistically indistinguishable from zero (β₁ not significant for any outcome)
↓ but
Massive heterogeneity: β₂ = 0.79–0.89*** across all performance outcomes
↓ so who gains?
The "experienced practitioner" type: serial entrepreneur, +6yr work exp, older, less formally educated
↓ and why?
Same type absorbs more SI — suggesting a mechanism: theory works when the founder can implement it
↓ is it robust?
Must survive multiple GATES specs + pre-attrition sample + OLS triangulation — then it enters the paper
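
For reference, the β₁ and β₂ quoted above are the two coefficients of the best-linear-predictor (BLP) regression in the generic-ML inference framework that GenericML implements. The form below is the simplest textbook version, written under the assumption that 07 and 08 use a default-style specification (this has not been checked against the scripts). S(Z) is the ML proxy for the conditional treatment effect, B(Z) the ML proxy for the untreated baseline, and p(Z) the propensity score.

```latex
Y = \alpha_0 + \alpha_1 B(Z)
  + \beta_1 \,\bigl(D - p(Z)\bigr)
  + \beta_2 \,\bigl(D - p(Z)\bigr)\bigl(S(Z) - \mathbb{E}_N[S(Z)]\bigr)
  + \varepsilon,
\qquad
w(Z) = \frac{1}{p(Z)\,\bigl(1 - p(Z)\bigr)}
```

Estimated by weighted least squares with weights w(Z), β₁ recovers the ATE and β₂ = Cov(s₀(Z), S(Z)) / Var(S(Z)), where s₀(Z) is the true conditional effect. β₂ is zero when there is no heterogeneity or when the proxy carries no information about it, which is why a significant β₂ alongside a null β₁ is read as "no average effect, but real heterogeneity".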

Full analysis map (mindmap)

HTE Paper: For whom does SDM work?
  ✅ BASE: Confirmed results
    Performance track
      ATE null: β₁ ns, all outcomes
      HTE massive: β₂ 0.79–0.89***
      GATES: G1=-0.83 · G2=-0.04 · G3=+1.35
      CLAN top-5, all p<0.001
        startup_founded +68pp
        work_exp_year +6.3yr
        managerial_exp +3.8yr
        age +5.8yr
        education -25pp
    SI absorption track
      ATE +1.70*** (si_emp vs C)
      HTE significant: β₂=0.53–0.67***
      CLAN vs C: experienced + has product
      CLAN vs E: educated + first venture
    Comparison
      09_clan_comparison.R done
      5 vars SAME direction
      education OPPOSITE: interesting tension
  🔁 AXES TO VARY
    A · Sample
      P1–P5 ✅ main
      P1–P2 PENDING (pre-attrition)
      P1 only PENDING (cross-section)
    B · Outcome Y
      log_sales ✅
      any_sales ✅
      log_emp ✅
      log_sales_per_emp ✅
      tfp ✅
      si_theory_stock ✅
      si_emp_stock ✅
      si_thhp_stock PENDING
    C · Comparison
      TE vs Control ✅
      TE vs EB ✅
    D · GATES quantile spec
      2-groups ✅
      3-groups ✅ main
      4-groups ✅
      Top20 vs Bot20 PENDING
  🔲 COMPLEMENTARY
    OLS parametric triangulation
      D × startup_founded PENDING
      D × work_exp_year PENDING
      D × college PENDING
      D × managerial_exp PENDING
      D × practitioner_type PENDING
    Types analysis
      Define practitioner type
      ATE by type
      Type distribution by site
    Site HTE (exploratory)
      Already in Z_i: site dummies ✅
      ATE by site PENDING
      Composition by site PENDING
  📋 PERSISTENCE CRITERION
    Survives ≥3 GATES specs
    Significant in ≥2 outcomes
    Holds in P1–P2 sample
    Confirmed by OLS
    Then it enters the paper
  📄 PAPER OUTPUT
    §3 BLP table (main spec)
    §3 GATES figure (G1-G2-G3)
    §3 CLAN portrait of type
    §4 Mechanism via SI
    §5 Robustness tables
    Appendix: site analysis

The 4 axes of variation — what we can turn

All core analyses run the same GenericML pipeline. A finding becomes a paper result only if it persists when we change these 4 knobs.

A · Sample (which periods?)
  P1–P5 (full panel): done
  P1–P2 (pre-attrition): pending
  P1 only (cross-section): pending
  D-003: attrition differential starts P3 (Control 73% vs TE/EB 37%)

B · Outcome Y
  log_sales, any_sales, log_emp, log_sales_per_emp, tfp: done
  si_theory_stock, si_emp_stock: done
  si_thhp_stock: pending
  log_sales winsorized robustness

C · Comparison arm
  TE vs Control: done
  TE vs EB: done
  Both comparisons embedded in 07 and 08 scripts

D · GATES quantile spec
  2-groups (median split): done
  3-groups (terciles), main spec: done
  4-groups (quartiles): done
  Top-20% vs Bottom-20%: pending
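
To make axis D concrete, here is a minimal sketch of what changing the quantile spec means: units are sorted by a heterogeneity score and the treated-minus-control gap is computed within each group. This is a plain difference in means on simulated data, not the weighted GATES regression with sample splitting that GenericML estimates; the function names, the simulated score, and the effect sizes are all illustrative assumptions.

```python
import random
from statistics import mean


def diff_in_means(y, d, idx):
    """Treated-minus-control mean outcome among the units in idx."""
    treated = [y[i] for i in idx if d[i] == 1]
    control = [y[i] for i in idx if d[i] == 0]
    return mean(treated) - mean(control)


def group_effects(y, d, score, k):
    """Effects in k equal-sized groups, ordered from lowest to highest score."""
    order = sorted(range(len(y)), key=lambda i: score[i])
    n = len(order)
    return [diff_in_means(y, d, order[g * n // k:(g + 1) * n // k])
            for g in range(k)]


def top_vs_bottom(y, d, score, share=0.20):
    """Effects in the bottom and top `share` of the score distribution."""
    order = sorted(range(len(y)), key=lambda i: score[i])
    m = int(len(order) * share)
    return diff_in_means(y, d, order[:m]), diff_in_means(y, d, order[-m:])


# Simulated data (made up): the true effect of D equals the score itself,
# so the average effect is zero while the heterogeneity is large.
random.seed(1)
n = 6000
score = [random.gauss(0, 1) for _ in range(n)]
d = [int(random.random() < 0.5) for _ in range(n)]
y = [s * di + random.gauss(0, 1) for s, di in zip(score, d)]

for k in (2, 3, 4):  # median split, terciles, quartiles
    print(k, "groups:", [round(g, 2) for g in group_effects(y, d, score, k)])
print("bottom/top 20%:", [round(g, 2) for g in top_vs_bottom(y, d, score)])
```

A pattern that is real should keep its sign and ordering across all four cuts; one that appears only under a single grouping is more likely an artifact of where the cut points fall.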

Complementary analyses (beyond GenericML)

Status legend: done = completed · pending = next to run · explore = exploratory / appendix

ID | Analysis | What it answers | Status | Priority

OLS Parametric Triangulation
P-01 | Y ~ D × startup_founded + i.rct + i.period | Confirms CLAN pattern with interpretable coefficient | pending | High
P-02 | Y ~ D × work_exp_year + controls | Confirms experience pattern (continuous) | pending | High
P-03 | Y ~ D × college + controls | Confirms education paradox | pending | High
P-04 | Y ~ D × managerial_exp + controls | Managerial vs general experience | pending | Medium
P-05 | Y ~ D × practitioner_type + controls | Composite index: does the "type" have a single effect? | pending | High

Types Analysis: "Reason about types" (Arnaldo)
T-01 | Define practitioner type: startup_founded=1 AND work_exp > median AND college=0 | Is there a coherent "type" in the data? | pending | High
T-02 | Simple ATE by type: E[Y|D=1,type=1] − E[Y|D=0,type=1] | How big is the gain for the practitioner type? | pending | High
T-03 | Distribution of types by RCT site | Does site composition explain country-level differences? | pending | Medium

Site-Level HTE: exploratory (CGJ 2025 shows significant site variation)
S-01 | Site dummies already in Z_i; check if rct appears in top CLAN variables | Does site membership predict benefit group? | implicit | Check outputs
S-02 | ATE by site: Y ~ D + D×i.rct + controls | Is the ATE null uniform across sites? (CGJ: Colombia > India) | explore | Medium
S-03 | Type distribution by site: does practitioner type cluster by country? | Compositional explanation for site-level differences | explore | After T-01

Mechanism
M-01 | CLAN comparison: performance vs SI absorption (09_clan_comparison.R) | Do the same people benefit in both tracks? | done |
M-04 | Narrative mechanism: same type in both tracks → same story | Informal mediation without IV: theory works when implementable | pending | High (paper §4)
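
Rows T-01 and T-02 can be sketched in a few lines. The snippet below builds the type indicator from the rule in T-01 and computes the difference in means from T-02 for both the type and the non-type. The eight firms are invented toy data chosen so the arithmetic is easy to follow, the function names are hypothetical, and the real 12_types_analysis.R may define the median cutoff or the controls differently.

```python
from statistics import mean, median


def practitioner_type(startup_founded, work_exp, college):
    """T-01 rule: founded a startup AND work experience above the sample median AND no college."""
    cutoff = median(work_exp)
    return [int(s == 1 and w > cutoff and c == 0)
            for s, w, c in zip(startup_founded, work_exp, college)]


def ate_by_type(y, d, ptype):
    """T-02: E[Y | D=1, type=t] - E[Y | D=0, type=t] for t in {0, 1}."""
    out = {}
    for t in (0, 1):
        treated = [yi for yi, di, ti in zip(y, d, ptype) if ti == t and di == 1]
        control = [yi for yi, di, ti in zip(y, d, ptype) if ti == t and di == 0]
        out[t] = mean(treated) - mean(control)
    return out


# Toy data for eight firms (made up for illustration only).
startup_founded = [1, 1, 1, 1, 0, 0, 1, 0]
work_exp = [12, 15, 10, 14, 3, 2, 4, 5]   # median is 7.5
college = [0, 0, 0, 0, 1, 1, 1, 0]
d = [1, 0, 1, 0, 1, 0, 1, 0]              # treatment indicator
y = [3.0, 1.0, 4.0, 2.0, 1.0, 1.0, 2.0, 2.0]

ptype = practitioner_type(startup_founded, work_exp, college)
effects = ate_by_type(y, d, ptype)
print("type indicator:", ptype)
print("gain for the type:", effects[1], " gain for the non-type:", effects[0])
```

Because the median cutoff is computed on whatever sample is passed in, the type definition shifts when the sample is restricted (for example to P1–P2); fixing the cutoff on the baseline sample would be one way to keep the type stable across robustness runs.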

Persistence criterion — what enters the paper (D-016)

A finding enters the paper if it passes all 4 filters:

1. GATES robustness: appears in ≥3 quantile specs (2g / 3g / 4g)
2. Outcome breadth: significant in ≥2 outcomes of the same track
3. Sample robustness: survives when restricted to P1–P2 (pre-attrition)
4. Parametric confirmation: confirmed with OLS interaction (interpretable β)
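
The four filters amount to a single yes/no rule per candidate pattern. A minimal sketch of that rule follows; the `Evidence` record and its field names are hypothetical bookkeeping, not objects produced by any of the scripts, and the example values are placeholders rather than results.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical tally for one candidate pattern (e.g. one CLAN variable)."""
    gates_specs_with_pattern: int   # quantile specs (2g / 3g / 4g) where it appears
    significant_outcomes: int       # significant outcomes within the same track
    holds_p1_p2: bool               # survives the pre-attrition sample
    ols_confirmed: bool             # confirmed by an OLS interaction


def enters_paper(e: Evidence) -> bool:
    """Apply all four persistence filters; every one must pass."""
    return (e.gates_specs_with_pattern >= 3
            and e.significant_outcomes >= 2
            and e.holds_p1_p2
            and e.ols_confirmed)


# Placeholder values only: a pattern that passes the first two filters
# stays out until the P1-P2 and OLS checks have actually been run.
candidate = Evidence(gates_specs_with_pattern=3, significant_outcomes=4,
                     holds_p1_p2=False, ols_confirmed=False)
print(enters_paper(candidate))
```

Writing the rule down this way makes it explicit that pending checks count as failures until they are run, so no candidate can enter the paper on GATES and outcome breadth alone.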

Target: 2–3 robust patterns. Not a boilerplate list of effects.

Current candidates: startup_founded · work_experience_year · managerial_exp · age · education/college (all *** in ≥4 performance outcomes)

What each analysis feeds into the paper

Section | Output | Script or content | What it shows
§3 Main Results | BLP table | 07_performance_hte.R + 08_si_hte.R | β₁ (ATE) and β₂ (HTE test), all outcomes × comparisons
§3 Main Results | GATES figure (3g) | G1 / G2 / G3 bars for log_sales | Shows who gains and who doesn't
§3 Main Results | CLAN: portrait of the type | Top 3 persistent CLAN vars | The "experienced practitioner" profile
§3 Robustness in-text | OLS parametric table | 11_ols_interactions.do | D × var_CLAN coefficients, interpretable magnitudes
§3 Types | ATE by practitioner type | 12_types_analysis.R | "Type gains X log_sales; non-type gains 0"
§4 Mechanism | SI mechanism narrative | 09_clan_comparison.R | Same type absorbs more SI → can implement theory
§5 Robustness | P1–P2 robustness | 10_robustness_p1p2.R | CLAN patterns survive pre-attrition sample
Appendix | Site-level HTE | 13_site_hte.R (if warranted) | Composition of types by country

Execution order

# | Task | Script | Block
1 | Verify i.rct in 05_ols_interaction_robustness.do | quick check | A · robustness
2 | Rerun 07+08 with P1–P2 only | 10_robustness_p1p2.R | A · robustness
3 | Add top-20%/bottom-20% to 07 and 08 | edit 07_performance_hte.R + 08_si_hte.R | A · robustness
4 | OLS interaction table (top-5 CLAN vars, all outcomes, with i.rct) | 11_ols_interactions.do | A · robustness
5 | Define practitioner type + ATE by type | 12_types_analysis.R | B · types
6 | Site-level ATE + type distribution by site | 12_types_analysis.R (add section) | B · types
7 | Add si_thhp_stock to 08_si_hte.R | edit 08_si_hte.R | C · completeness
8 | Causal Forest (optional, pre-submission) | new script | D · final robustness

Note on rct / site controls

In GenericML (07 and 08): site dummies are in Z_i (lines 38–44 of 07_performance_hte.R). Y is residualized on period FE only — not site FE. This is deliberate: absorbing site variation in Y would prevent studying cross-site heterogeneity (CGJ 2025, p.66 explicit note). Site effects are absorbed within the ML nuisance models.
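
A small illustration of why the choice of fixed effects matters here: removing period means from Y leaves between-site differences intact, so they remain available for the heterogeneity analysis. The sketch uses simple period demeaning on four invented observations; the residualization in 07_performance_hte.R may be implemented differently, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def residualize_on_period(y, period):
    """Subtract the period-specific mean from each outcome (period FE only)."""
    by_period = defaultdict(list)
    for yi, p in zip(y, period):
        by_period[p].append(yi)
    period_mean = {p: mean(values) for p, values in by_period.items()}
    return [yi - period_mean[p] for yi, p in zip(y, period)]


# Toy panel (made up): two sites observed in two periods.
period = [1, 1, 2, 2]
site = ["A", "B", "A", "B"]
y = [1.0, 3.0, 2.0, 4.0]

resid = residualize_on_period(y, period)
site_mean = {s: mean(r for r, si in zip(resid, site) if si == s) for s in ("A", "B")}
print("residuals:", resid)
print("site means after period demeaning:", site_mean)
```

The common period trend is gone, but site B still sits above site A by the same margin as in the raw data. Demeaning by site as well would set both site means to zero and erase exactly the variation that the site dummies in Z_i are meant to pick up.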

In OLS triangulation (Stata): i.rct must be included in all interaction specs. Verify in 05_ols_interaction_robustness.do before running 11_ols_interactions.do.
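
As a language-neutral illustration of the interaction spec with site dummies (the analogue of including i.rct), the sketch below fits Y on D, a binary moderator, their product, and site indicators. It is written in Python with the standard library only, not Stata, and runs on simulated data; the small `ols` solver, the variable names, the site intercepts, and the effect sizes are all assumptions made for the example. Period dummies and other controls are left out for brevity.

```python
import random


def ols(X, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        pivot_row = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[pivot_row] = A[pivot_row], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def design_row(d, x, site, n_sites):
    """[1, D, x, D*x, site dummies], with the first site as the omitted reference."""
    dummies = [1.0 if site == s else 0.0 for s in range(1, n_sites)]
    return [1.0, float(d), float(x), float(d * x)] + dummies


random.seed(0)
n, n_sites = 4000, 4
site_effect = [0.0, 0.5, -0.5, 1.0]     # made-up site intercepts
founder_share = [0.2, 0.4, 0.6, 0.4]    # made-up: composition differs by site

X, y = [], []
for _ in range(n):
    site = random.randrange(n_sites)
    d = int(random.random() < 0.5)                   # randomized treatment
    x = int(random.random() < founder_share[site])   # moderator, e.g. a founder flag
    # Simulated truth: no effect for non-founders, +1.0 for founders.
    X.append(design_row(d, x, site, n_sites))
    y.append(site_effect[site] + 0.5 * x + 1.0 * d * x + random.gauss(0, 1))

beta = ols(X, y)
print("D:", round(beta[1], 2), " D x moderator:", round(beta[3], 2))
```

In this layout the coefficient on D is the effect for units with the moderator equal to zero, and the interaction coefficient is the additional effect for the moderator group. Because the moderator's prevalence varies by site in the simulation, dropping the site dummies would let site-level differences load onto the moderator's main effect, which is the reason to confirm they are present before running the full table.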

CGJ 2025 reference: finds significant site heterogeneity (p=0.001 in CLAN). Colombia and Italy show highest effects; India and China lowest. Check if this aligns with the practitioner-type composition by site once T-01 is done.