What happened
Researchers created StreamGaze, the first dataset and benchmark that measures whether multimodal AI models can understand where a person is looking in streaming video and predict what that person intends to do next. This matters because AR glasses and similar devices need AI that can follow your eye movements in real time: not just understand what's on screen, but anticipate your next action based on where you're looking.
Why it matters
Until now, no one had measured whether multimodal large language models could use gaze signals for real-time reasoning over video; StreamGaze is the first benchmark to do so. The gap between AI and human performance on these tasks is substantial, which means work toward functional gaze-aware AR systems is still at an early stage.