Heterogeneous Effects — Analysis Log

What this paper is about

We use a pooled dataset of six randomized controlled trials (Italy, Colombia, Netherlands, Spain, UK) to ask: which founders benefit most from theory-based entrepreneurship training?

The program randomly assigned founders to one of three arms: Pure Control, Theory-and-Evidence (T&E), and Evidence-based. Founders in all three arms attended sessions; only the content differed. The key comparison is therefore content type, not program vs. no program.

The outcome is Scientific Intensity (SI): a composite score (1–5) from a structured baseline interview, measuring how scientifically the founder approaches decisions — testable hypotheses, structured experiments, evidence-based pivots.

Variables

Green — in dataset, used in analysis

Red — in dataset but zero RF predictive power

Gray — requested, not yet received

Founder — Demographics

Who the founder is.

gender, birth_year, education, education_field, full_time, country_origin

Founder — Experience & History

Entrepreneurial and professional track record.

work_experience_year, managerial_experience_year, startup_founded, training_course_num, number_firms_worked, number_profesional_pos, number_industries_worked

Firm — Characteristics

Venture-level baseline characteristics.

sector_startup, employees, product_dummy, hours_worked, idea_aspiration_1, idea_aspiration_2, int_startup_phase, founding_team_size, planning_future, sales_total, motivation, growth_aspiration, expenditures

Beliefs & Cognition

How the founder thinks and what they believe about uncertainty, control, and opportunity.

illusion_of_control, future_ev_influence, future_ev_orient, prob_unexpectedevents_6M, prob_unexpectedevents_2Y, effect_unexpected_events, new_ideas, new_ideas_considered, new_ideas_worth_exploring, idea_breadth_h, idea_distant, minimalism, risk_aversion, unc_aversion, endowment_1, endowment_2, signal, signal_type, distribution_prob_1..5

Team — Structure & Transactive Memory

Transactive Memory System scale — team cognition and coordination. Pending from Diego.

tms_1, tms_2, tms_3, tms_4, tms_5, tms_6

Learning Orientation & Competitiveness

Goal orientation and competitiveness. Pending from Diego.

lo_mastery_1, lo_mastery_2, lo_perf_approach_1, lo_perf_approach_2, lo_perf_avoid_1, lo_perf_avoid_2, competitiveness_1, competitiveness_2, competitiveness_3, competitiveness_4, competitiveness_5, competitiveness_6, competitiveness_7

Scientific Intensity — Outcome (from baseline interview)

RA-coded from structured interview. 23 items across Theory, Hypotheses, Evidence, Evaluation, Decision → composite si_overall (1–5).

si_overall, theory_observation, theory_clear, theory_blocks, theory_why, theory_hierarchy, hp_explicit, hp_coherent, hp_measure, hp_hierarchy, hp_falsifiable, evidence_dummy, evidence_coherent, evidence_appropriate, evidence_sample, evidence_bias, evidence_mechanism, evaluation_data, evaluation_theory, evaluation_rigorous, evaluation_prior, evaluation_update, decision_data, decision_process

Analysis pipeline

Script 01: Pooled RF on SI level

Y = si_overall (level, all periods pooled) · Z = all variables including dosage

Purpose: first pass — which variables have any signal at all?

Limitation: pools all periods and arms; mixes level with change; dosage is post-assignment; not for HTE discovery.

Script 02: RF on ΔSI by arm and period

Y = ΔSI = si_overall_t − si_overall_0 · Z = baseline only, no dosage

Structure: 14 separate models — 3 arms × 5 periods (Pure Control p5 skipped, N=44)

Purpose: find where importance diverges across arms → candidate HTE moderators for DML.
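The Script 02 setup can be sketched in a few lines of plain Python. This is an illustration only, not the project's actual script: the long-format row layout, the `founder_id` field, the helper names, and the toy data are all assumptions, and the minimum cell size of 50 is inferred from the "N<50" note on the divergence figure. It shows how ΔSI is formed against the period-0 score and why 3 arms × 5 periods leaves 14 models once the small Pure Control period-5 cell is dropped.

```python
from collections import defaultdict

MIN_N = 50  # assumed cutoff, inferred from the "N<50" note; not confirmed by the scripts


def delta_si(rows):
    """Return one record per founder and follow-up period with
    delta_si = si_overall at period t minus si_overall at period 0."""
    baseline = {r["founder_id"]: r["si_overall"] for r in rows if r["period"] == 0}
    out = []
    for r in rows:
        if r["period"] == 0 or r["founder_id"] not in baseline:
            continue  # skip baseline rows and founders with no baseline score
        out.append({
            "founder_id": r["founder_id"],
            "arm": r["arm"],
            "period": r["period"],
            "delta_si": r["si_overall"] - baseline[r["founder_id"]],
        })
    return out


def model_cells(deltas, min_n=MIN_N):
    """Group ΔSI records by (arm, period); keep cells with at least min_n founders."""
    cells = defaultdict(list)
    for d in deltas:
        cells[(d["arm"], d["period"])].append(d)
    return {key: recs for key, recs in cells.items() if len(recs) >= min_n}


# Toy data (hypothetical): 60 founders per arm, periods 0..5,
# with only 44 Pure Control founders observed in period 5.
ARMS = ["Pure Control", "T&E", "Evidence-based"]
rows = []
for a, arm in enumerate(ARMS):
    for i in range(60):
        for period in range(6):
            if arm == "Pure Control" and period == 5 and i >= 44:
                continue
            rows.append({"founder_id": f"{a}-{i}", "arm": arm,
                         "period": period, "si_overall": 2.0 + 0.1 * period})

cells = model_cells(delta_si(rows))
print(len(cells))  # 14: one RF model would be fitted per remaining cell
```

Each surviving `(arm, period)` cell would then get its own random forest with baseline covariates only as predictors.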

Script 01 — What predicts SI in general

Figure: RF variable importance, pooled (Script 01) · RF importance (max=100) · Y = si_overall · all periods pooled · N = 3,758

Top predictors: founding_team_size, int_startup_phase, hours_worked, employees, prob_unexpectedevents_2Y/6M, illusion_of_control. RMSE = 0.86 on a 1–5 scale.

What this does not mean: high importance ≠ the treatment works better for those founders. This is prediction, not causal HTE.
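For readers unfamiliar with the "max=100" scale, a minimal sketch of the rescaling follows. It assumes the raw importances are non-negative numbers from the RF and that the plotted value is simply each variable's share of the top score; the variable names and raw values are placeholders, not results from this dataset. The second helper lists variables with exactly zero importance, which is the criterion the "Red" status in the variable legend describes.

```python
def scale_importance(raw):
    """Rescale raw importances so the most important variable scores 100."""
    top = max(raw.values())
    if top <= 0:
        raise ValueError("no variable has positive importance")
    return {name: 100.0 * value / top for name, value in raw.items()}


def zero_importance(raw):
    """Names of variables the forest never found useful (importance exactly 0)."""
    return sorted(name for name, value in raw.items() if value == 0)


# Placeholder raw importances (hypothetical values and names)
raw = {"x1": 8.0, "x2": 4.0, "x3": 0.0}
scaled = scale_importance(raw)
print(scaled)                 # {'x1': 100.0, 'x2': 50.0, 'x3': 0.0}
print(zero_importance(raw))   # ['x3']
```

Because the scale is relative to the top variable within each model, scores are comparable within a model but only loosely comparable across models.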

Script 02 — How importance evolves over time

Figure: RF ΔSI importance by period, top 6 (Script 02) · Top 6 variables by avg importance across arms · Y = ΔSI · averaged across the 3 arms

Averaged across arms, prob_unexpectedevents_6M and idea_aspiration_1/2 are consistently the strongest predictors of improvement. But the average hides the key signal — see below.

The key finding — divergence by arm

For each of 6 candidate variables, the figure shows how strongly the variable predicts who improves their SI within each arm, period by period. Lines that diverge across arms indicate an arm-specific predictor, i.e. a candidate HTE moderator.

Figure: RF divergence by arm (Script 02) · Blue = Evidence-based · Orange = Pure Control · Red = T&E · Pure Control has no p5 (N<50)

illusion_of_control — Red (T&E) consistently high across all 5 periods. Founders who overestimate control improve more specifically in T&E — consistent with the theory that T&E content is designed to calibrate that bias. Primary DML candidate.

new_ideas — Red spikes to ~94 in period 3 (vs. ~47–52 in other arms), then converges. Openness to new ideas at baseline predicts T&E improvement mid-program. Likely transitory — test DML at period 3 specifically.

future_ev_influence — Orange (Control) starts at 100 in period 1; red (T&E) rises to ~81 by period 5. The belief that external factors shape the business becomes a T&E-specific predictor as the program progresses.

prob_unexpectedevents_6M — All three lines track closely. High everywhere, does not diverge → control variable in DML, not a primary moderator.

Caveat: these are predictive patterns, not causal estimates. Divergence suggests where to look in DML — it does not prove treatment effect heterogeneity.
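The visual "diverging lines" check above could be made mechanical with a simple spread rule. The sketch below is one possible formalization, not the procedure used in Script 02: the nested-dict layout, the function names, and the threshold of 25 points are assumptions. The numbers are placeholders that loosely echo the approximate values quoted for new_ideas; which of the other two arms sits near 47 and which near 52 is not stated, so that assignment is arbitrary.

```python
def arm_spread(importance_by_arm):
    """Max minus min scaled importance across arms for one variable and period."""
    values = list(importance_by_arm.values())
    return max(values) - min(values)


def flag_divergent(importance, threshold):
    """importance[variable][period][arm] -> scaled importance (0-100).
    Return {variable: [periods where the spread across arms exceeds threshold]}."""
    flagged = {}
    for variable, by_period in importance.items():
        periods = [p for p, by_arm in sorted(by_period.items())
                   if arm_spread(by_arm) > threshold]
        if periods:
            flagged[variable] = periods
    return flagged


# Placeholder importances (hypothetical; see the lead-in above)
importance = {
    "new_ideas": {
        2: {"T&E": 50, "Evidence-based": 48, "Pure Control": 51},
        3: {"T&E": 94, "Evidence-based": 47, "Pure Control": 52},
    },
    "prob_unexpectedevents_6M": {
        2: {"T&E": 90, "Evidence-based": 88, "Pure Control": 92},
        3: {"T&E": 91, "Evidence-based": 89, "Pure Control": 90},
    },
}

flagged = flag_divergent(importance, threshold=25)
print(flagged)  # {'new_ideas': [3]}
```

A rule like this only screens for candidates; like the figure, it says nothing about the sign or causal nature of the difference.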

Next steps

DML — primary: illusion_of_control

Chernozhukov et al. framework (BLP / GATES / CLAN) in Stata. Binary treatment: T&E vs. Pure Control.
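To make the target of this step concrete, here is a deliberately naive Python sketch of a group-level treatment effect. It is not DML and not the planned Stata implementation: it skips cross-fitting, the ML proxy score, and inference, and it forms groups by a median split on the moderator itself rather than by a predicted effect, as GATES would. Under random assignment, the difference in mean ΔSI between T&E and Pure Control within each group is still an unbiased group effect. The function name, the row layout, the moderator scale, and the toy data are all hypothetical.

```python
from statistics import mean, median


def group_effects(rows, moderator, outcome="delta_si",
                  treated_arm="T&E", control_arm="Pure Control"):
    """Difference in mean outcome (treated minus control) for founders at or
    below the moderator median ("low") and above it ("high").
    Raises statistics.StatisticsError if a group has no treated or control rows."""
    rows = [r for r in rows if r["arm"] in (treated_arm, control_arm)]
    cut = median(r[moderator] for r in rows)
    effects = {}
    for label, keep in (("low", lambda v: v <= cut), ("high", lambda v: v > cut)):
        treated = [r[outcome] for r in rows
                   if keep(r[moderator]) and r["arm"] == treated_arm]
        control = [r[outcome] for r in rows
                   if keep(r[moderator]) and r["arm"] == control_arm]
        effects[label] = mean(treated) - mean(control)
    return effects


# Toy data (hypothetical): T&E adds 0.5 to ΔSI only when illusion_of_control is high
rows = []
for arm in ("T&E", "Pure Control"):
    for ioc in (1, 2, 4, 5):
        gain = 0.5 if (arm == "T&E" and ioc >= 4) else 0.0
        rows.append({"arm": arm, "illusion_of_control": ioc, "delta_si": 0.2 + gain})

effects = group_effects(rows, "illusion_of_control")
print(effects)  # effect near 0 in the "low" group, near 0.5 in the "high" group
```

A gap between the "high" and "low" effects is the kind of heterogeneity that the BLP and GATES estimates are meant to test formally.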

DML — secondary: idea_aspiration_1, new_ideas

For new_ideas, run period-3-specific DML given the pattern is concentrated there.

Add pending variables

TMS, LO scales, competitiveness, risk/ambiguity aversion — re-run RF and update shortlist before final DML.

Pre-registration

Lock primary moderators and DML specification before looking at causal estimates. All results here are exploratory.