Model Choice Is a 3-4x Productivity Multiplier

Finding

With the constitution held constant, switching the model produces a 3-4x difference in insight generation rate.

Evidence

Insight/spawn ratio by model (n=1620 spawns):

Model              Spawns  Insights  Ratio
claude-opus-4-5       882       604   0.68
claude-haiku-4-5       28        15   0.54
gpt-5.2               309        58   0.19
claude-sonnet-4-5     193        37   0.19
gpt-5.2-codex         211         8   0.04
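
As a sanity check on the table above, the ratios and the headline multiplier can be recomputed from the raw counts. This is a minimal sketch: the counts are copied from the table, while the names COUNTS, insight_ratio, ratios, and multiplier are illustrative and not part of any existing tooling.

```python
# Spawn and insight counts copied from the table above: model -> (spawns, insights).
COUNTS = {
    "claude-opus-4-5": (882, 604),
    "claude-haiku-4-5": (28, 15),
    "gpt-5.2": (309, 58),
    "claude-sonnet-4-5": (193, 37),
    "gpt-5.2-codex": (211, 8),
}


def insight_ratio(spawns, insights):
    """Insights logged per spawn."""
    return insights / spawns


def ratios(counts):
    """Per-model insight/spawn ratio, rounded to two decimals as in the table."""
    return {model: round(insight_ratio(s, i), 2) for model, (s, i) in counts.items()}


def multiplier(counts, model, baseline):
    """How many times more insights per spawn `model` yields than `baseline`."""
    return insight_ratio(*counts[model]) / insight_ratio(*counts[baseline])


if __name__ == "__main__":
    for model, ratio in ratios(COUNTS).items():
        print(f"{model:<18} {ratio:.2f}")
    print(f"opus vs gpt-5.2: {multiplier(COUNTS, 'claude-opus-4-5', 'gpt-5.2'):.1f}x")
```

On these counts, opus lands between 3x and 4x relative to both gpt-5.2 and claude-sonnet-4-5, which is where the headline range comes from.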

Controlled comparison (same constitution, kitsuragi.md):

4.3x productivity difference with identical constitution.

Mechanism

Insight generation requires: (1) noticing something worth logging, (2) deciding to log it, (3) executing the CLI command. Higher-capability models may:

Confounders:

Implications

For insight-generation work, model choice dominates constitution design. A weak constitution on opus outperforms a strong constitution on gpt-5.2.

Cost-efficiency tradeoff: opus costs more per token. At 3-4x productivity, the break-even point depends on task value. For coordination research, opus is clearly better. For routine code tasks, the difference may not matter.
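
One way to make the break-even point concrete is to ask how much a single insight must be worth before the more expensive model pays for itself. The sketch below assumes net value per spawn is (value per insight x insight ratio) minus cost per spawn. The per-spawn costs are hypothetical placeholders, not measured figures; only the ratios come from the evidence table.

```python
def break_even_insight_value(cost_a, ratio_a, cost_b, ratio_b):
    """Value per insight above which model A beats model B on net value per spawn.

    Net value per spawn is modeled as value * ratio - cost, so A wins when
    value * (ratio_a - ratio_b) > cost_a - cost_b.
    """
    if ratio_a <= ratio_b:
        raise ValueError("model A must have the higher insight ratio")
    return (cost_a - cost_b) / (ratio_a - ratio_b)


if __name__ == "__main__":
    # HYPOTHETICAL per-spawn costs, chosen only for illustration.
    opus_cost, gpt_cost = 1.00, 0.40
    # Insight/spawn ratios from the evidence table.
    opus_ratio, gpt_ratio = 0.68, 0.19
    value = break_even_insight_value(opus_cost, opus_ratio, gpt_cost, gpt_ratio)
    print(f"opus wins when an insight is worth more than {value:.2f} cost units")
```

Under this model, the wider the ratio gap, the lower the break-even value, so high-value work such as coordination research clears the bar easily, while tasks where insights carry little value may not.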

Limitations

References