AI code assistants trust documentation over actual code — and miss bugs when they disagree
What happened
Researchers tested how well AI programming assistants spot problems when code, documentation, and tests conflict. It turns out they're much better at catching bugs in written documentation than at detecting when code quietly drifts away from what the documentation says it does.
Why it matters
AI code assistants are already in widespread use — they're built into most major development environments and used by millions of programmers daily. This finding shows a specific failure mode: when documentation looks correct but the actual code is subtly broken, the AI trusts the documentation and misses the bug. For critical systems (banking, medical devices, infrastructure), that's a real problem. Programmers using these tools need to know when the assistant is reliable and when it's confidently wrong.
The signal
Watch whether the makers of major code assistants (GitHub Copilot, Claude, etc.) add explicit consistency checking between documentation and implementation before offering suggestions on correctness-critical code.
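To make "consistency checking between documentation and implementation" concrete, here is a minimal Python sketch. It is not taken from the research or from any vendor's product. It assumes the documentation includes runnable examples in the docstring, and it uses Python's standard doctest module to run them against the code. The names clamp_drifted, clamp_fixed, and docstring_examples_pass are hypothetical, invented for illustration.

```python
import doctest


def clamp_drifted(value, low, high):
    """Clamp value to the inclusive range [low, high].

    >>> clamp_drifted(5, 0, 10)
    5
    >>> clamp_drifted(15, 0, 10)
    10
    """
    if value < low:
        return low
    if value >= high:
        # Drift: the code treats `high` as exclusive, the docstring says inclusive.
        return high - 1
    return value


def clamp_fixed(value, low, high):
    """Clamp value to the inclusive range [low, high].

    >>> clamp_fixed(5, 0, 10)
    5
    >>> clamp_fixed(15, 0, 10)
    10
    """
    if value < low:
        return low
    if value > high:
        return high
    return value


def docstring_examples_pass(func) -> bool:
    """Run the examples in func's docstring against func itself.

    Returns True only if at least one example exists and none fail.
    (Hypothetical helper: one possible form of a docs-vs-code check.)
    """
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__ or "",          # documentation under test
        {func.__name__: func},       # names visible to the examples
        func.__name__,
        None,
        0,
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the pass/fail counts are needed here.
    results = runner.run(test, out=lambda _msg: None)
    return results.attempted > 0 and results.failed == 0


if __name__ == "__main__":
    for func in (clamp_drifted, clamp_fixed):
        status = "consistent" if docstring_examples_pass(func) else "DRIFT"
        print(f"{func.__name__}: {status}")
```

Both docstrings read as correct, which is exactly the situation described above. Only executing the documented examples against the code shows that clamp_drifted no longer does what it claims. A check like this only covers behavior the documentation states as runnable examples; prose-only claims would need a different approach.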