Language models can't learn what matters — they copy patterns but miss the rules

What happened

Researchers trained AI language models on synthetic data in which shape consistently defined object categories, then tested whether the models would learn this abstract rule the way children do. The models memorized every training example perfectly but could not apply the rule to new words: accuracy on those words stayed at chance, indicating that the models were matching surface patterns rather than extracting the underlying structure.
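
To make the distinction concrete, here is a toy sketch of that kind of test. It is not the researchers' setup: the word lists, the shape and color features, the two-option forced choice, and both "learners" are hypothetical stand-ins chosen for illustration. One learner applies the shape rule; the other only looks up word-object pairs it saw in training and guesses otherwise.

```python
import random

# Hypothetical feature vocabularies; the real study's features are not specified here.
SHAPES = ["round", "square", "star", "ring"]
COLORS = ["red", "blue", "green", "gold"]

def make_trials(words, n, rng):
    """Build n forced-choice trials. Each word names one shape; color is irrelevant."""
    word_shape = {w: rng.choice(SHAPES) for w in words}
    trials = []
    for _ in range(n):
        word = rng.choice(words)
        shape = word_shape[word]
        other_shape = rng.choice([s for s in SHAPES if s != shape])
        color, other_color = rng.sample(COLORS, 2)
        exemplar = (shape, color)
        target = (shape, other_color)      # same shape, different color: correct
        distractor = (other_shape, color)  # same color, different shape: incorrect
        options = [target, distractor]
        rng.shuffle(options)               # so position carries no information
        trials.append((word, exemplar, options, target))
    return trials

def rule_learner(word, exemplar, options, memory, rng):
    """Applies the abstract rule: pick the option that shares the exemplar's shape."""
    return next(o for o in options if o[0] == exemplar[0])

def memorizer(word, exemplar, options, memory, rng):
    """Recalls (word, object) pairs seen in training; guesses when nothing matches."""
    for o in options:
        if (word, o) in memory:
            return o
    return rng.choice(options)

def accuracy(chooser, trials, memory, rng):
    hits = sum(chooser(w, ex, opts, memory, rng) == target
               for w, ex, opts, target in trials)
    return hits / len(trials)

rng = random.Random(0)
train = make_trials(["dax", "wug", "blick"], 2000, rng)  # words seen in training
novel = make_trials(["zorp", "fep", "toma"], 2000, rng)  # held-out words
memory = {(w, target) for w, _, _, target in train}      # what the memorizer stores

print("memorizer, trained words:", accuracy(memorizer, train, memory, rng))
print("memorizer, novel words:  ", accuracy(memorizer, novel, memory, rng))
print("rule learner, novel words:", accuracy(rule_learner, novel, memory, rng))
```

In this sketch the memorizer is perfect on the trained words and lands near 50% on the novel ones, while the rule learner stays perfect on both. That is the pattern the summary describes, though real models fail for reasons a lookup table only loosely approximates.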

Why it matters

This is a clean measurement of a real limitation in how these models work. Children learn abstract principles (shape is the kind of feature that matters for categories); language models learn associations between tokens in sequences. The result points to the learning mechanism itself, rather than model size or training time, as the source of the gap. When AI systems are deployed on tasks that require this kind of generalization (medical diagnosis, legal reasoning, scientific inference), the gap becomes a practical failure mode, not a theoretical one.

The signal

Watch whether larger or differently trained models can solve this task, and whether the result holds across other synthetic domains. Those outcomes would help show whether this is a fundamental limit of the architecture or a problem with the training regime.