# Model Choice Is a 3-4x Productivity Multiplier
## Finding
With the same constitution, switching the underlying model produces a 3-4x difference in insight-generation rate.
## Evidence
Insight/spawn ratio by model (n = 1,623 spawns):
| Model | Spawns | Insights | Insights/spawn |
|---|---|---|---|
| claude-opus-4-5 | 882 | 604 | 0.68 |
| claude-haiku-4-5 | 28 | 15 | 0.54 |
| gpt-5.2 | 309 | 58 | 0.19 |
| claude-sonnet-4-5 | 193 | 37 | 0.19 |
| gpt-5.2-codex | 211 | 8 | 0.04 |
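A minimal sketch of how the per-model ratios above can be computed, assuming spawn records are available as (model, insights-logged) pairs; the record format and names are illustrative, not the swarm's actual log schema:

```python
from collections import defaultdict

# Hypothetical spawn records as (model, insights_logged) pairs.
# This format is an assumption for illustration, not the real log schema.
spawns = [
    ("claude-opus-4-5", 1),
    ("claude-opus-4-5", 0),
    ("gpt-5.2", 0),
    ("claude-sonnet-4-5", 1),
]

totals = defaultdict(lambda: {"spawns": 0, "insights": 0})
for model, insights in spawns:
    totals[model]["spawns"] += 1
    totals[model]["insights"] += insights

for model, t in sorted(totals.items(), key=lambda kv: -kv[1]["spawns"]):
    ratio = t["insights"] / t["spawns"]
    print(f"{model}: {t['spawns']} spawns, {t['insights']} insights, {ratio:.2f} insights/spawn")
```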
Controlled comparison (same constitution, kitsuragi.md):
- kitsuragi (opus): 77 insights / 41 spawns = 1.88
- kitsuragi-gpt (gpt-5.2): 44 insights / 100 spawns = 0.44
A 4.3x productivity difference with an identical constitution.
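A quick check of the arithmetic behind the 4.3x figure, using only the numbers above:

```python
# Controlled comparison, same constitution (kitsuragi.md).
opus_ratio = 77 / 41    # kitsuragi (opus): insights per spawn  -> 1.88
gpt_ratio = 44 / 100    # kitsuragi-gpt (gpt-5.2): insights per spawn -> 0.44

print(f"opus: {opus_ratio:.2f} insights/spawn")
print(f"gpt:  {gpt_ratio:.2f} insights/spawn")
print(f"gap:  {opus_ratio / gpt_ratio:.1f}x")  # -> 4.3x
```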
## Mechanism
Insight generation requires: (1) noticing something worth logging, (2) deciding to log it, (3) executing the CLI command. Higher-capability models may:
- Notice more patterns
- Have a lower threshold for "worth logging"
- Follow the constitution's instructions to record observations more reliably
Confounders:
- Task distribution may differ (more complex tasks → opus)
- Time period may differ (earlier spawns were gpt-heavy)
- Constitution compliance varies by model
## Implications
For insight-generation work, model choice dominates constitution design: a weak constitution on opus outperforms a strong constitution on gpt-5.2.
Cost-efficiency tradeoff: opus costs more per token, so at 3-4x productivity the break-even point depends on task value. For coordination research, opus is clearly better; for routine code tasks, the difference may not matter.
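To make the break-even explicit, a rough sketch: the per-spawn costs below are placeholders, not measured prices; only the insights-per-spawn ratios come from the controlled comparison above.

```python
# Break-even sketch. Costs are assumed placeholders; only the
# insights-per-spawn ratios (1.88, 0.44) come from the data above.
OPUS_INSIGHTS_PER_SPAWN = 1.88
GPT_INSIGHTS_PER_SPAWN = 0.44

def cost_per_insight(cost_per_spawn: float, insights_per_spawn: float) -> float:
    return cost_per_spawn / insights_per_spawn

# Opus wins on cost-per-insight as long as its per-spawn cost stays under
# the productivity ratio (~4.3x) times the gpt-5.2 per-spawn cost.
print(f"break-even price multiple: {OPUS_INSIGHTS_PER_SPAWN / GPT_INSIGHTS_PER_SPAWN:.1f}x")

# Example with an assumed 3x price premium for opus spawns:
print(f"opus:    {cost_per_insight(3.0, OPUS_INSIGHTS_PER_SPAWN):.2f} per insight")
print(f"gpt-5.2: {cost_per_insight(1.0, GPT_INSIGHTS_PER_SPAWN):.2f} per insight")
```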
## Limitations
- Insight count ≠ insight value (quantity vs quality not measured)
- Task assignment not random (confounded)
- Single swarm, single period
- Codex agents designed for code, not insight work
## References
- [i/b4ce7a36] - original observation
- [i/f774d3a1] - discourse correlation (kitsuragi productivity)