SDM Validation Roadmap

Two traditions are represented: formative-instrument papers (Chandler, Cacciotti, Spieth21) and the composite/index tradition (Bloom & Van Reenen, Furman et al.).

The Boiling Water Problem

We built a thermometer. To validate it we do not need another thermometer; we need boiling water, which tells us whether ours really reads 100 degrees.

Criterion	Chandler JBV '11	Cacciotti JBV '20	Spieth21 JMS '21	B&VR QJE '07	Furman RP '02	SDM
TIER 1 — MINIMUM
1. Theoretical non-interchangeability	●	●	●	●	●	● Theory grounded
2. Content validity	●	●	●	●	◑	● 40-item pool from literature + 12 manager interviews + LLM
3. VIF / non-redundancy	○	◑	●	○	○	● ★ VIF 1.02–1.23 across all 24 items
4. CFA / model diagnostic	●	●	●	○	○	● 3-factor reflective CFA: CFI=.418, TLI=.355, RMSEA=.054. Framed correctly as expected misspecification
TIER 2 — GOOD MANAGEMENT PAPER
5. Experimental sensitivity	○	○	○	○	○	● ★★ η²=.43, p<.001
6. Robustness alt weights	○	○	○	●	●	● App A5.2: equal item weights, equal dimension weights, minimum operator
7. Within-study criterion	●	●	●	○	○	◑ Vignette r=.19***, N=2,745, but same team and same occasion. Within-step r=.052 > between-step r=.023 (structural coherence)
8. Nomological validity	●	●	●	●	●	◑ AL Tier 1 b=0.16*** with firm FE, wild bootstrap, Holm correction. Not the only possible explanation
TIER 3 — TOP JOURNALS
9. External criterion	○	○	○	●	●	○
10. Convergent w/ est. scales	○	●	○	○	○	○
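
Criterion 6 in the table above names three weighting schemes (equal item weights, equal dimension weights, minimum operator). The sketch below shows one way such composites could be computed for a single respondent. The item names, the item-to-dimension mapping, and the reading of "minimum operator" as the minimum over dimension means are all assumptions made for illustration; they are not taken from Appendix A5.2.

```python
from statistics import mean

# Hypothetical item-to-dimension mapping (NOT the real SDM assignment).
# Dimensions have unequal sizes so the first two schemes can differ.
DIMENSIONS = {
    "dim_a": ["a1", "a2", "a3", "a4"],
    "dim_b": ["b1", "b2"],
    "dim_c": ["c1", "c2", "c3"],
}


def dimension_means(resp):
    """Mean item score within each dimension."""
    return {d: mean(resp[i] for i in items) for d, items in DIMENSIONS.items()}


def score_equal_items(resp):
    """Every item carries the same weight, regardless of dimension."""
    all_items = [i for items in DIMENSIONS.values() for i in items]
    return mean(resp[i] for i in all_items)


def score_equal_dimensions(resp):
    """Every dimension carries the same weight, regardless of item count."""
    return mean(dimension_means(resp).values())


def score_minimum(resp):
    """Assumed reading of the minimum operator: the weakest dimension sets the score."""
    return min(dimension_means(resp).values())


# Made-up respondent: strong on dim_a, weak on dim_b, middling on dim_c.
demo = {"a1": 5, "a2": 5, "a3": 5, "a4": 5, "b1": 2, "b2": 2, "c1": 3, "c2": 3, "c3": 3}

for scorer in (score_equal_items, score_equal_dimensions, score_minimum):
    print(scorer.__name__, round(scorer(demo), 3))
```

A robustness check in this spirit would compute all three scores for every respondent and confirm that the substantive results do not depend on which one is used.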

At this stage, the data currently available do not give us a complete boiling-water solution. This is defensible for an RP Research Note, since no benchmark paper in the formative-instrument tradition has boiling water either.

What we CAN do with Prolific / students (ranked by water temperature)

Note: "Prolific" here also covers a small new data collection with students. If students are used, the monetary cost may drop to $0; timing would depend on access and logistics.

Panel 2: Decision Tree

Option	Temperature	Cost (rough approx.)	Time (rough approx.)	What it adds	Feasibility
1. Camuffo RCT assessment protocol on Prolific / students	Hot	$0-300	3-6 weeks	Criterion validity: independent protocol designed by different team scores same construct	doable
2. Prolific / students + external decision task	Warm-hot	$0-300	4-5 weeks	Quasi-criterion: someone external to the team designs another vignette-like measure	doable, but the scenarios must NOT be designed by our team. Could adapt from Bazerman & Moore (2013) or Kahneman-style decision cases
3. Prolific / students AOT/NFC only	Warm	$0-250	3-4 weeks	Measures established constructs close to SDM, such as AOT and NFC	doable
4. All three in one Prolific / student session	Warm-to-hot	$0-350	4-5 weeks	Convergent measures + best available criterion (Camuffo protocol OR external decision task)	Best - one small data gathering exercise, ~20 min, covers Row 9 + Row 10

SDM — Current State

✓ VIF 1.02–1.23 (best-in-class)
✓ Priming RCT η²=.43 (unique)
✓ Content validity 3 stages
✓ Robustness alt weights (A5.2)
◑ Vignette r=.19 (same-team)
◑ Hierarchy gradient (3 alt mech)
✗ External criterion validity
✗ Convergent validity

▼

Path 0 — Writing fixes

$0 · 2 days

Cite B&VR '07 as tradition
Within-step r → §4 main text
Response-style bias → §4 main text
Explicit composite/index framing

Path A — Small new data gathering

$0-350 · 3-5 weeks

SDM + AOT + NFC + best available criterion*
Gives: Row 10 + best Row 9 available

Path B — External innovation/performance data

$0-500 · 4-8 weeks

Collect external variables: innovation or performance, e.g., patents or databases such as Orbis
Gives: SUBSTANTIVE FINDING (not Row 9)
Step 1: check data availability and match rate

Path C — A + B combined

$0-600 · 2-3 months

Convergent + best available criterion + substantive finding
Step 1: check external data availability and match rate

▼

Path 0 targets
RP Research Note (IF ~9)
R&D Management (IF ~4)

Path A targets
ORM** (IF ~8.5)
BJM (IF ~4)

Path B targets
RP Full Article (IF ~9)
JPIM (IF ~6.5)

Path C targets
JMS (IF ~7.5)
RP Full Article (IF ~9)

* "Best available criterion" = the Camuffo RCT assessment protocol or an external vignette-like decision task designed by a non-team member. The Prolific sample can be replaced by a small student sample. This is NOT boiling water, but it is the hottest water any comparable instrument paper has achieved.

** ORM also requires a response to Edwards (2011). True boiling water (behavioral traces) is unrealistic for any first instrument paper. ORM may accept hot water + honest limitations.
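
Step 1 of Paths B and C (checking external data availability and match rate) could start with a rough check like the sketch below. It assumes firms are matched on cleaned names only; the function names, the suffix list, and the example firms are hypothetical, and a real match against a database such as Orbis would more likely rely on firm identifiers or fuzzy matching.

```python
import re

# Illustrative list of legal suffixes to ignore; extend as needed.
LEGAL_SUFFIXES = {"inc", "ltd", "llc", "gmbh", "spa", "srl", "corp"}


def normalize(name):
    """Lower-case, strip punctuation, and drop common legal suffixes."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def match_rate(sample_firms, external_firms):
    """Share of sample firms whose normalized name appears in the external list."""
    external = {normalize(n) for n in external_firms}
    matched = [n for n in sample_firms if normalize(n) in external]
    return len(matched) / len(sample_firms), matched


# Made-up firm names, for illustration only.
sample = ["Acme S.p.A.", "Beta GmbH", "Gamma Ltd."]
external = ["ACME SPA", "Gamma Limited", "Beta"]

rate, matched = match_rate(sample, external)
print(f"match rate: {rate:.0%}", matched)
```

Exact matching after normalization is deliberately conservative: "Gamma Ltd." and "Gamma Limited" do not match here, so the figure it returns is best read as a lower bound on the achievable match rate.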

SDM Journal Submission Options — Discussion Note