The world is being quietly rearranged by people who write very long documents.


The title they went with The Illusion of Latent Generalization: Bi-directionality and the Reversal Curse Noisy translates that to

Language models store facts backward and forward as separate memories — not unified understanding


Researchers tested whether language models that learn facts in both directions (A implies B, B implies A) actually develop a single unified understanding or just memorize both routes separately. It turns out they memorize both routes as distinct entries with different internal structures, rather than developing a direction-agnostic concept of the relationship.
This matters because it shows a fundamental gap between how language models appear to work and how they actually work internally. A model that says it understands "Paris is the capital of France" may just have stored that phrase as a lookup table, not as a genuine concept. The implication is uncomfortable: training techniques that make models produce correct answers don't necessarily mean the model understands anything — it just means the model learned to retrieve the right output for the right input. This breaks a common assumption in AI development: that better performance on test benchmarks corresponds to deeper, more flexible internal reasoning.
Watch whether subsequent training methods designed to fix this (by explicitly forcing unified representations) actually make models more reliable on novel tasks they haven't seen before, or whether they still rely on memorization.

If you insist
Read the original →