LLMs invent library features in 8-40% of responses. Static analysis catches 16-70% of them.

What happened

When large language models write code that uses external libraries, 8 to 40 percent of their responses call features that do not exist in those libraries. Static analysis tools (automated code inspection that does not run the program) catch 16 to 70 percent of these hallucinated-library errors, which leaves a remainder that static inspection does not catch.
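To make the idea of such a check concrete, here is a minimal Python sketch that flags references to module attributes the installed library does not define. It is an illustration only, not the detector evaluated in the paper. The function name `find_missing_attributes` is hypothetical, and the sketch only handles plain `import x` and `import x as y` statements.

```python
import ast
import importlib


def find_missing_attributes(source: str) -> list[str]:
    """Return names such as 'math.fast_sqrt' that the imported module lacks.

    Hypothetical helper for illustration; it covers `import x` and
    `import x as y` only, and ignores `from x import y`.
    """
    tree = ast.parse(source)

    # Map each local name to the module it refers to.
    aliases: dict[str, str] = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    # `import a.b as c` binds c to a.b
                    aliases[alias.asname] = alias.name
                else:
                    # `import a.b` binds only the top-level name a
                    top_level = alias.name.split(".")[0]
                    aliases[top_level] = top_level

    missing: list[str] = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id in aliases
        ):
            module_name = aliases[node.value.id]
            try:
                # Assumption: importing the library in the checker's
                # environment is safe. The generated code is never executed.
                module = importlib.import_module(module_name)
            except ImportError:
                missing.append(module_name)  # the library itself is absent
                continue
            if not hasattr(module, node.attr):
                missing.append(f"{module_name}.{node.attr}")
    return missing


generated = "import math\nprint(math.sqrt(4), math.fast_sqrt(4))\n"
print(find_missing_attributes(generated))  # ['math.fast_sqrt']
```

A check like this only sees names that can be resolved without running the generated code. It says nothing about, for example, a method called on an object whose type is only known at runtime, which is one way an invented feature can escape inspection.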

Why it matters

This is the gap between what automation can and cannot solve. LLMs hallucinate library features routinely, and the paper puts the upper bound on what static analysis can catch at around 77 percent. Even a perfect static detector would therefore miss about 23 percent of hallucinated code, because those errors are invisible to static analysis. That clarifies the real cost of using LLMs to write code: you cannot automate your way out of the problem with code inspection alone. Someone still has to read and verify the generated code.
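As a rough back-of-the-envelope illustration, the 77 percent ceiling can be combined with the 8 to 40 percent hallucination rate to estimate how many responses would carry a hallucination that static analysis cannot see. This assumes the ceiling applies uniformly across that range, which the summary above does not confirm. The function name `undetectable_share` is hypothetical.

```python
def undetectable_share(hallucination_rate: float,
                       detection_ceiling: float = 0.77) -> float:
    """Share of all responses with a hallucination that static analysis misses.

    Assumption: the ceiling is the same at every hallucination rate.
    """
    return hallucination_rate * (1.0 - detection_ceiling)


for rate in (0.08, 0.40):
    share = undetectable_share(rate)
    print(f"{rate:.0%} hallucination rate -> {share:.1%} of responses")
# 8% hallucination rate -> 1.8% of responses
# 40% hallucination rate -> 9.2% of responses
```

Under that assumption, somewhere between roughly 2 and 9 percent of responses would still need a human reviewer even with a perfect static detector in place.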

The signal

Track whether production code-generation systems (GitHub Copilot, Claude, etc.) add mandatory static analysis checks to their output before showing it to users, and whether those checks actually reduce hallucinated-library code in real deployed codebases.