Why neural networks learn simple shortcuts before complex patterns

What happened

Researchers show that deep learning systems follow a mathematical principle of compression: they choose the simplest explanation that fits the data, and adopt a more complex one only when there is enough data to justify the added complexity. In practice, this means AI systems naturally learn spurious shortcuts first (such as 'if the background is this color, predict this class') and move to reliable patterns only when given more examples. This explains why AI can be brittle on new data and why having less training data sometimes makes models more robust.
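The trade-off between a cheap shortcut and a costlier reliable rule can be illustrated with a toy two-part description length calculation. This is a minimal sketch, not the researchers' method: the bit costs and error rates below are hypothetical numbers chosen for illustration, and the "learner" is an idealized one that simply picks whichever rule gives the shorter total code.

```python
import math


def binary_entropy(p: float) -> float:
    """Bits per label needed to flag which predictions are wrong,
    when a fraction p of them are wrong."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def description_length(model_bits: float, error_rate: float, n_examples: int) -> float:
    """Two-part code: bits to state the rule plus bits to correct its mistakes."""
    return model_bits + n_examples * binary_entropy(error_rate)


# Hypothetical values (assumptions for illustration, not from the research).
SHORTCUT = {"model_bits": 10.0, "error_rate": 0.10}   # cheap to state, fits 90% of labels
RELIABLE = {"model_bits": 200.0, "error_rate": 0.01}  # costly to state, fits 99% of labels


def preferred(n_examples: int) -> str:
    """Return the rule with the shorter total description for n_examples labels."""
    shortcut_cost = description_length(n_examples=n_examples, **SHORTCUT)
    reliable_cost = description_length(n_examples=n_examples, **RELIABLE)
    return "shortcut" if shortcut_cost <= reliable_cost else "reliable"


def crossover() -> int:
    """Smallest dataset size at which the reliable rule becomes the shorter code."""
    n = 1
    while preferred(n) == "shortcut":
        n += 1
    return n


if __name__ == "__main__":
    for n in (100, 1000):
        print(n, preferred(n))
    print("crossover at", crossover(), "examples")
```

With these assumed numbers, the shortcut gives the shorter code for small datasets, because the reliable rule's extra cost is fixed while its savings grow with every additional example. Past the crossover point the reliable rule wins. Changing the assumed bit costs or error rates moves the crossover, which is the sense in which a compression view could predict how much data a task needs.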

Why it matters

This formalizes a long-observed quirk of neural networks using information theory. It gives researchers a tool for predicting when and why AI systems will learn reliable features rather than exploitable shortcuts, which is useful for understanding AI robustness and for deciding how much data a given task actually needs.