AI chart-building fails on real data: new benchmark exposes the gap

What happened

Researchers built a benchmark showing that leading AI vision models struggle far more with real-world chart generation than existing benchmarks suggest, especially when working from messy real datasets and building complex multi-panel visualizations. The results indicate that AI systems trained to generate code from images perform well in controlled lab settings but often fail at a task humans do routinely: turning raw data into useful charts.

Why it matters

For the first time, there is concrete evidence that state-of-the-art AI models billed as understanding charts often cannot reproduce them from real data. That exposes a gap between lab performance and what these systems can do in production, and it suggests companies should not yet assume AI can automate data visualization at scale.