Researchers used AI to generate 615 vulnerable code examples linked to known attack patterns

What happened

Computer scientists used GPT-4, Llama, and Claude to automatically generate code samples containing specific security vulnerabilities described in standard attack and weakness catalogs. The resulting dataset gives security researchers 615 labeled examples of vulnerable code that they can use to train vulnerability detection tools. Examples like these have historically been scarce and expensive to create by hand.

Why it matters

Security researchers have long struggled to find large collections of real vulnerable code paired with clear descriptions of what went wrong and why. This dataset attempts to fill that gap by using language models to generate the examples automatically, which could accelerate research on tools that catch security bugs before they ship. The open question is whether AI-generated vulnerable code teaches detection systems anything useful about real vulnerabilities, or whether the models produce plausible-looking but hollow examples that don't match how actual attacks work.

The signal

Track whether vulnerability detection tools trained on this AI-generated dataset catch real bugs at the same rate as tools trained on human-written vulnerable code from production systems.
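
One way to make that comparison concrete is to run both detectors against the same held-out set of real bugs and compare recall. The Python sketch below is only an illustration: the bug identifiers, the flagged sets, and the helper `recall_on_real_bugs` are all hypothetical and do not come from the dataset or from any published evaluation.

```python
from dataclasses import dataclass


@dataclass
class RecallResult:
    detector: str
    caught: int   # real bugs the detector flagged
    total: int    # real bugs in the benchmark

    @property
    def recall(self) -> float:
        # Share of real bugs caught; 0.0 if the benchmark is empty.
        return self.caught / self.total if self.total else 0.0


def recall_on_real_bugs(detector, flagged_ids, real_bug_ids):
    """Count how many known real bugs appear among a detector's flags."""
    real = set(real_bug_ids)
    caught = len(real & set(flagged_ids))
    return RecallResult(detector, caught, len(real))


# Hypothetical benchmark of real bugs and hypothetical detector output.
real_bugs = {"bug-1", "bug-2", "bug-3", "bug-4", "bug-5"}
flagged_by_synthetic_trained = {"bug-1", "bug-3", "other-9"}
flagged_by_human_trained = {"bug-1", "bug-2", "bug-3", "bug-5"}

synthetic = recall_on_real_bugs(
    "trained on AI-generated code", flagged_by_synthetic_trained, real_bugs
)
human = recall_on_real_bugs(
    "trained on human-written code", flagged_by_human_trained, real_bugs
)

# A gap near zero would suggest the synthetic examples transfer to real bugs.
recall_gap = human.recall - synthetic.recall

for result in (synthetic, human):
    print(f"{result.detector}: {result.caught}/{result.total} = {result.recall:.0%}")
print(f"recall gap: {recall_gap:.0%}")
```

Recall alone ignores false alarms, so a fuller comparison would also report precision on the same benchmark.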