AI explanations make people trust AI more, but often less accurately

What happened

Researchers tested whether natural-language explanations from large language models actually improve human-AI team performance. The finding: explanations increase user confidence but often decrease accuracy, especially on visual reasoning tasks, where people become worse at catching AI errors.

Why it matters

For years, companies and researchers have treated AI explanations as a fix for the trust problem: show people why the AI decided something, and they will trust it more wisely. This study undercuts that assumption. Paradoxically, fluent explanations can make people worse at detecting errors. On visual tasks, interfaces that showed the AI's actual uncertainty (as percentages) or deferred borderline cases to humans outperformed explanation-based interfaces, while explanations did help on language-based reasoning. The right fix therefore depends on what kind of work the human-AI team is doing.

The signal

Watch whether companies deploying AI assistance in visual, design, or diagnostic work start shifting from narrative explanations toward uncertainty scores or selective automation, or whether they keep using explanations and simply accept lower error-recovery rates.