Language models can now signal uncertainty about multiple correct answers

What happened

Researchers found that language models hide their true confidence when multiple valid answers exist, because they use an activation function designed to pick only one winner. They built a new layer that allows the model to express that multiple outputs could be right simultaneously, making its confidence scores actually useful for detecting when it's working with genuinely ambiguous problems.

Why it matters

Right now, when you ask a language model something with multiple correct answers—like a translation, a summary, or a creative task—the model's reported confidence is worthless as a signal of actual quality because its architecture forces it to pretend there is only one right answer. This means downstream systems that rely on model confidence (like automated quality checks or routing systems) are flying blind on the most interesting and important cases. This fix lets you actually see when the model is uncertain because the problem is genuinely hard, not because the model is confused.