AI can describe how people think, but not predict whether interventions actually work

What happened

Researchers tested how well large language models predict whether climate interventions change people's behavior, using data from 59,508 people across 62 countries. The models described observed attitudes accurately but failed to predict which interventions actually move people. The fit looked good on paper, but the causal predictions were wrong, especially for interventions that rely on emotional engagement rather than simple information.

Why it matters

Researchers and policymakers increasingly use AI to simulate how populations will respond to interventions before running expensive trials. The trap: an AI can look right on descriptive metrics while being fundamentally wrong about causation, and checking the fit alone will not reveal the difference. The second problem is worse. AI simulations appeared equally accurate across countries with different wealth levels, but when checked against actual causal effects, accuracy varied widely by country. The simulations masked the very disparities that matter most for fairness.
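To make the distinction concrete, here is a minimal sketch of the two checks. The numbers are invented for illustration and are not taken from the study; the country labels, intervention names, and the 0-100 attitude scale are all assumptions. Descriptive fit compares simulated and observed group means, while causal accuracy compares simulated and observed effects (treated minus control).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data, NOT from the study.
# (country, intervention) -> (control mean, treated mean), 0-100 attitude scale.
observed = {
    ("A", "info"): (30.0, 32.0), ("A", "emotion"): (30.0, 35.0),
    ("B", "info"): (50.0, 51.0), ("B", "emotion"): (50.0, 56.0),
    ("C", "info"): (70.0, 72.0), ("C", "emotion"): (70.0, 73.0),
}
simulated = {
    ("A", "info"): (31.0, 34.0), ("A", "emotion"): (31.0, 32.0),
    ("B", "info"): (49.0, 52.0), ("B", "emotion"): (49.0, 50.0),
    ("C", "info"): (69.0, 72.0), ("C", "emotion"): (69.0, 70.0),
}
cells = sorted(observed)


def effect(table, cell):
    """Treated mean minus control mean for one (country, intervention) cell."""
    control, treated = table[cell]
    return treated - control


# Descriptive fit: do simulated group means track observed group means?
obs_levels = [v for c in cells for v in observed[c]]
sim_levels = [v for c in cells for v in simulated[c]]
descriptive_r = pearson(obs_levels, sim_levels)

# Causal accuracy: do simulated effects track observed effects?
obs_effects = [effect(observed, c) for c in cells]
sim_effects = [effect(simulated, c) for c in cells]
causal_r = pearson(obs_effects, sim_effects)

# Per-country mean absolute error on the effects, to surface disparities.
effect_mae_by_country = {}
for country in sorted({c[0] for c in cells}):
    errs = [abs(effect(simulated, c) - effect(observed, c))
            for c in cells if c[0] == country]
    effect_mae_by_country[country] = sum(errs) / len(errs)

print(f"descriptive fit r = {descriptive_r:.2f}")
print(f"causal accuracy r = {causal_r:.2f}")
print("effect error by country:", effect_mae_by_country)
```

In this toy setup the group means are dominated by differences between countries, so the simulation scores a high descriptive correlation even though its effect estimates run in the opposite direction from the observed ones. Breaking the effect error out by country is one simple way to check whether a pooled accuracy figure is hiding uneven performance.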

The signal

Watch whether intervention studies start explicitly testing causal prediction accuracy (not just descriptive fit) before deploying simulations at scale, or whether institutions find out that simulated effects were wrong only after running real-world pilots.