Classroom surveillance software claims to read student attention without storing video

What happened

Researchers built a system that analyzes student behavior in real time: it extracts skeletal poses and eye gaze from classroom video, deletes the footage immediately, and passes only the geometric data to a large language model for interpretation. In practice, schools could get automated, attendance-style reports on who is paying attention without keeping recordings, though the system still struggles with spatial reasoning, such as understanding how a classroom is laid out.
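
To make the data flow concrete, here is a minimal Python sketch of the extract-then-discard idea. It is an illustration under stated assumptions, not the researchers' code: `PoseRecord`, `extract_geometry`, `build_prompt`, and `process_frame` are hypothetical names, the pose and gaze estimator is a stub that returns fixed values, and no call to a language model is made.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PoseRecord:
    """Geometry kept for one person in one frame (hypothetical schema)."""
    student_id: str    # anonymous track label, not a real identity
    keypoints: dict    # joint name -> normalized (x, y) position
    gaze: tuple        # approximate gaze direction in the image plane


def extract_geometry(frame: bytes) -> list:
    """Stub standing in for a real pose/gaze estimator (assumption).

    A real system would run a vision model on the pixels; this returns
    fixed values so the sketch runs on its own.
    """
    return [
        PoseRecord(
            student_id="s1",
            keypoints={"nose": (0.42, 0.31), "left_wrist": (0.40, 0.62)},
            gaze=(0.0, -1.0),
        )
    ]


def build_prompt(records: list) -> str:
    """Serialize geometry only; no pixel data reaches the prompt."""
    payload = [asdict(r) for r in records]
    return (
        "Describe each student's likely attention state from this pose "
        "and gaze data:\n" + json.dumps(payload)
    )


def process_frame(frame: bytearray) -> str:
    """Extract geometry, then clear the frame buffer before returning."""
    records = extract_geometry(bytes(frame))
    frame.clear()  # discard the pixels held in this buffer
    return build_prompt(records)
```

Calling `process_frame` on a `bytearray` leaves that buffer empty and returns a text prompt holding only joint coordinates and gaze vectors. The `bytes(frame)` copy handed to the stub is a simplification: a real pipeline would also have to make sure no other copy of the frame survives.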

Why it matters

This is a real-world test of whether large language models can do useful work on multimodal data outside controlled research settings. The honest finding is that they mostly cannot yet: the system handles basic pose extraction but fails at spatial reasoning, which is the part that would matter most to educators. What's worth watching is whether this shapes how schools think about classroom surveillance. If the technology barely works, privacy compliance may matter more than utility, which inverts the usual playbook in which utility justifies the privacy trade-off.

The signal

Watch whether any school district actually deploys this and, if one does, whether teachers find the attention summaries useful enough to change how they teach.