What happened
Researchers built a benchmark that tests whether AI can watch someone use software like Photoshop and understand not just what they're clicking, but why they're doing it and when they need help. Current AI models fail badly at this, identifying the user's intent correctly only 44% of the time, but performance jumps dramatically when the AI is given context about what the user is trying to accomplish.
Why it matters
Every AI assistant that tries to help you with creative software (design tools, video editors, spreadsheets) can currently only react to your clicks; it cannot understand your actual goal. This benchmark measures whether AI can shift from automation to genuine collaboration, which requires solving a fundamentally harder problem than just predicting the next action.