LLMs keep changing their minds on multiple-choice tests when wrong answers look plausible

What happened

Researchers found that large language models flip between correct and incorrect answers on multiple-choice questions when plausible wrong answers are present. They built a method called Inclusion-of-Thoughts that filters out the distracting options to stabilize the model's reasoning. The technique makes the model more reliable on tests without requiring more computing power.

Why it matters

The finding points to a real problem in deployed LLMs: their answers are not as stable as a confident human test-taker's would be. The instability suggests the model isn't reasoning through the problem so much as pattern-matching across all the options at once, which lets a plausible distractor pull it off the right answer. This matters because every standardized test, certification exam, and evaluation system that uses multiple-choice questions now has to account for a hidden variable: the model's answer depends partly on which plausible wrong answers you include, something that shouldn't matter if the model actually understood the question.
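One way to make that hidden variable concrete is to ask the same question with different distractor sets and check whether correctness changes. The sketch below is illustrative only and is not the researchers' evaluation code. `ask` is a hypothetical callable standing in for a model call, `toy_model` is a made-up stand-in that is fooled by one particular distractor, and the question and options are invented for the example.

```python
from typing import Callable, List, Sequence

# Hypothetical interface (an assumption, not a real API): takes a question
# and its options, returns the text of the option the model picked.
AskFn = Callable[[str, Sequence[str]], str]


def outcomes(ask: AskFn, question: str, correct: str,
             distractor_sets: Sequence[Sequence[str]]) -> List[bool]:
    """Return one True/False per distractor set: did the model answer correctly?"""
    results = []
    for distractors in distractor_sets:
        # Sort the options so the correct answer's position is not fixed.
        options = sorted([correct, *distractors])
        results.append(ask(question, options) == correct)
    return results


def is_unstable(ask: AskFn, question: str, correct: str,
                distractor_sets: Sequence[Sequence[str]]) -> bool:
    """True if the model is right with some distractor sets and wrong with others."""
    results = outcomes(ask, question, correct, distractor_sets)
    return any(results) and not all(results)


def toy_model(question: str, options: Sequence[str]) -> str:
    """Made-up stand-in for an LLM: picks 'Sydney' whenever it is offered."""
    if "Sydney" in options:
        return "Sydney"
    return "Canberra" if "Canberra" in options else options[0]


question = "What is the capital of Australia?"
distractor_sets = [
    ["Sydney", "Melbourne", "Perth"],  # plausible wrong answers
    ["Lima", "Oslo", "Cairo"],         # implausible wrong answers
]
per_set = outcomes(toy_model, question, "Canberra", distractor_sets)
unstable = is_unstable(toy_model, question, "Canberra", distractor_sets)
```

With the toy model, the question counts as unstable because the answer is wrong only when the plausible distractor is present. A real check would replace `toy_model` with an actual model call and average `is_unstable` over many questions.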

The signal

Watch whether commercial LLM providers adopt filtering methods like this one before deploying models on high-stakes testing or assessment tools, or whether they keep shipping models that flip answers based on distractor design.