What this paper is about
We use a pooled dataset of six randomized controlled trials (Italy, Colombia, Netherlands, Spain, UK) to ask: which founders benefit most from theory-based entrepreneurship training?
The program randomly assigned founders to one of three arms: Pure Control, Theory-and-Evidence (T&E), and Evidence-based. Founders in every arm attended sessions; only the content differed. The key comparison is therefore content type, not program vs. no program.
The outcome is Scientific Intensity (SI): a composite score (1–5) from a structured interview, measuring how scientifically the founder approaches decisions (testable hypotheses, structured experiments, evidence-based pivots).
Variables
Who the founder is.
Entrepreneurial and professional track record.
Venture-level baseline characteristics.
How the founder thinks and what they believe about uncertainty, control, and opportunity.
Transactive Memory System scale — team cognition and coordination. Pending from Diego.
Goal orientation and competitiveness. Pending from Diego.
RA-coded from structured interview. 23 items across Theory, Hypotheses, Evidence, Evaluation, Decision → composite si_overall (1–5).
Analysis pipeline
Y = si_overall (level, all periods pooled) · Z = all variables including dosage
Purpose: first pass — which variables have any signal at all?
Limitation: pools all periods and arms; mixes level with change; dosage is post-assignment; not for HTE discovery.
Y = ΔSI = si_overall_t − si_overall_0 · Z = baseline only, no dosage
Structure: 14 separate models, one per arm × period cell (3 arms × 5 periods = 15, minus Pure Control p5, skipped at N=44)
Purpose: find where importance diverges across arms → candidate HTE moderators for DML.
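The setup for this step can be sketched in a few lines of Python. This is only an illustration, not the project's actual script. It assumes that period 0 is the baseline and that follow-up periods are numbered 1 to 5. The row keys `founder_id`, `period`, and `si_overall` and the helper names are hypothetical.

```python
from itertools import product

ARMS = ["Pure Control", "Theory-and-Evidence", "Evidence-based"]
PERIODS = [1, 2, 3, 4, 5]        # assumed numbering; period 0 is the baseline
SKIPPED = {("Pure Control", 5)}  # skipped in the source (N=44)


def model_cells():
    """List the (arm, period) cells that each get a separate model."""
    return [(arm, p) for arm, p in product(ARMS, PERIODS)
            if (arm, p) not in SKIPPED]


def delta_si(rows):
    """Return {(founder_id, period): si_overall_t - si_overall_0}.

    rows: iterable of dicts with hypothetical keys
          'founder_id', 'period', 'si_overall'.
    Founders with no period-0 score are dropped, since the change
    from baseline is undefined for them.
    """
    rows = list(rows)  # allow generators; we pass over the data twice
    baseline = {r["founder_id"]: r["si_overall"]
                for r in rows if r["period"] == 0}
    changes = {}
    for r in rows:
        fid, t = r["founder_id"], r["period"]
        if t > 0 and fid in baseline:
            changes[(fid, t)] = r["si_overall"] - baseline[fid]
    return changes
```

Each cell returned by `model_cells()` would then be fitted on the baseline-only variables for founders in that arm, with the matching ΔSI values as the target.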
Script 01 — What predicts SI in general
Top predictors: founding_team_size, int_startup_phase, hours_worked, employees, prob_unexpectedevents_2Y/6M, illusion_of_control. RMSE = 0.86 on a 1–5 scale.
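An RMSE of 0.86 on a 1–5 scale is easier to read next to a naive benchmark, such as predicting the sample mean for every founder. The sketch below shows that comparison. It is an illustration with hypothetical helper names, not the evaluation code behind the number reported above.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mean_baseline_rmse(y_true):
    """RMSE of a model that predicts the sample mean for everyone."""
    mean = sum(y_true) / len(y_true)
    return rmse(y_true, [mean] * len(y_true))
```

If the model's RMSE is close to `mean_baseline_rmse` computed on the same held-out scores, the predictors add little beyond the average SI level.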
Script 02 — How importance evolves over time
Averaged across arms, prob_unexpectedevents_6M and idea_aspiration_1/2 are consistently the strongest predictors of improvement. But the average hides the key signal — see below.
The key finding — divergence by arm
For each of 6 candidate variables: how strongly does it predict who improves their SI within each arm, across periods. Diverging lines = arm-specific predictor = candidate HTE moderator.
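One simple way to put a number on "diverging lines" is the gap between the highest and lowest importance across arms in each period, averaged over periods. The sketch below assumes importances are stored as a nested dict `importance[arm][period][variable]`. That layout and the function name are hypothetical, and this spread measure is one possible summary rather than the method used in the analysis.

```python
from collections import defaultdict


def arm_divergence(importance):
    """Average cross-arm spread of importance for each variable.

    importance: {arm: {period: {variable: importance_score}}}
    For every period covered by at least two arms, take
    max - min of each variable's importance across those arms,
    then average the spreads over periods.
    """
    spreads = defaultdict(list)
    periods = sorted({p for by_period in importance.values()
                      for p in by_period})
    for p in periods:
        # Arms missing this period (e.g. a skipped cell) are left out.
        cells = [by_period[p] for by_period in importance.values()
                 if p in by_period]
        if len(cells) < 2:
            continue
        shared_vars = set.intersection(*(set(c) for c in cells))
        for var in shared_vars:
            values = [c[var] for c in cells]
            spreads[var].append(max(values) - min(values))
    return {var: sum(s) / len(s) for var, s in spreads.items()}
```

Sorting the result in descending order gives a rough ranking of candidate moderators. A variable with a large average spread is important in some arms and not in others, which is the pattern the plot highlights.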