AI lab creates first benchmark for robots that ask humans for help finding objects

What happened

Researchers built a standardized test for a new type of robot task: embodied agents (robots with cameras) navigate physical spaces while asking humans clarifying questions to find a specific object among similar-looking ones. The benchmark includes 28,000 training examples and measures navigation skill separately from dialogue skill, something that could not previously be measured independently. That separation matters because it lets researchers see whether robots are learning to ask useful questions or just getting lucky with navigation.

Why it matters

For years, researchers have built robots that navigate and ask questions, but they've had no agreed-upon way to measure whether the questioning part works. This benchmark fills that gap. Now someone can build a robot that's smaller and faster than competitors (as the authors did: 3x smaller, 70x faster) and prove it's better at the task, not just cheaper. That's the difference between academic demos and technology you could deploy.

The signal

Track whether this benchmark gets adopted by other labs working on embodied AI. Usage in follow-up papers and open-source implementations would signal that it solved a real measurement problem rather than being a one-off contribution.