Do tools actually help AI web agents? A careful retest suggests earlier claims were overblown

What happened

Researchers ran a large, controlled experiment to test whether giving AI agents access to tools (such as clicking buttons, filling in forms, and reading pages) actually makes them better at navigating the web. Earlier studies claimed tools helped a lot, but they used small samples and differing test setups, which made their results hard to compare. The new study tests the same tools across many AI models and benchmarks and finds that the benefits are much smaller and more uneven than the earlier hype suggested.

Why it matters

Web agents are supposed to become general-purpose assistants that can do tasks across the internet: book flights, file forms, research competitors. Most recent research on building these agents has assumed that giving them access to tools is obviously better. But if tools don't actually help much, or help only unpredictably, the whole research direction needs to recalibrate. This is the kind of negative, deflating result that rarely gets published but can matter more than a positive one, because it suggests that hundreds of papers may rest on a shaky foundation.

The signal

Watch whether future web agent papers keep assuming that tool use is beneficial, or start qualifying their claims with actual numbers showing which tools help which models, and by how much.