Large language models fail to learn from mistakes: they double down instead
What happened
Researchers tested three major LLMs on a reversal-learning task, where the right answer keeps changing, and found that the models learn less from losses than from wins and stay stubborn after reversals. In practice, this suggests LLMs adapt poorly when the world changes: they tend to keep repeating failed strategies instead of pivoting the way humans do.
Why it matters
LLMs are increasingly deployed in real environments where conditions shift: chatbots that need to adjust to user feedback, AI systems managing supply chains or trading, autonomous agents making sequential decisions. This paper points to a basic limitation: these models learn well from positive feedback (winning) but poorly from negative feedback (losing). They perseverate, sticking with a losing strategy longer than humans would. The implication is blunt: if you build a system where an LLM has to adapt in real time when its assumptions break, it will be slower and more rigid than a human or a simpler algorithm. That raises deployment questions nobody is asking yet.
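To make the idea of perseveration concrete, here is a minimal sketch of a two-option reversal-learning task with a toy learner. It is not the paper's setup or model: the task parameters, the epsilon-greedy learner, and the names simulate and mean_stale are all assumptions chosen for illustration. The learner updates its estimates with one learning rate after better-than-expected outcomes (alpha_pos) and another after worse-than-expected outcomes (alpha_neg), so shrinking alpha_neg mimics an agent that learns from wins but barely from losses.

```python
import random


def simulate(alpha_pos, alpha_neg, n_trials=200, reversal_at=100,
             p_good=0.8, epsilon=0.1, seed=0):
    """Count post-reversal trials spent on the arm that used to be best.

    Hypothetical toy task: two arms, one pays off with probability p_good
    and the other with 1 - p_good; the roles swap at trial `reversal_at`.
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]        # value estimate for each arm
    good_arm = 0          # arm 0 is the better choice until the reversal
    stale_choices = 0
    for t in range(n_trials):
        if t == reversal_at:
            good_arm = 1  # contingencies flip without warning
        # Epsilon-greedy choice: mostly exploit, occasionally explore.
        if rng.random() < epsilon:
            arm = rng.randrange(2)
        else:
            arm = 0 if q[0] >= q[1] else 1
        p_win = p_good if arm == good_arm else 1 - p_good
        reward = 1.0 if rng.random() < p_win else 0.0
        # Asymmetric update: separate rates for good and bad surprises.
        error = reward - q[arm]
        alpha = alpha_pos if error >= 0 else alpha_neg
        q[arm] += alpha * error
        if t >= reversal_at and arm != good_arm:
            stale_choices += 1
    return stale_choices


def mean_stale(alpha_pos, alpha_neg, n_seeds=200):
    """Average the stale-choice count over several random seeds."""
    total = sum(simulate(alpha_pos, alpha_neg, seed=s) for s in range(n_seeds))
    return total / n_seeds


if __name__ == "__main__":
    print("balanced learner:   ", mean_stale(0.3, 0.3))
    print("loss-blind learner: ", mean_stale(0.3, 0.02))
```

In this sketch, the learner with a small alpha_neg should spend noticeably more post-reversal trials on the old winner than the balanced learner, which is the pattern the summary above describes. A reversal-learning benchmark for an LLM would replace the toy learner with model calls, but the scoring idea (count choices of the formerly correct option after the switch) carries over.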
The signal
Watch whether researchers start building reversal-learning tests into standard LLM benchmarks, or whether real-world deployments (trading systems, supply-chain AI, content moderation) start documenting cases where LLMs fail to adapt when the task structure shifts.