AI model learns to watch hours of video without forgetting key details

What happened

A new AI system can understand and answer questions about videos that run for hours or even days. Previous systems could handle only short clips because they lost important visual information as the video went on. This matters because video understanding is used in security monitoring, medical imaging analysis, autonomous vehicles, and content moderation, and extending it to long-form content removes a major practical limitation.

Why it matters

For years, video AI models have been limited to short clips for two reasons: their 'memory' is too small, and they compress visual details into text summaries that lose crucial information. This system demonstrates that keeping multiple types of memory (raw visual snapshots, factual events, and high-level concepts) lets the AI reason over hours of video much as humans watch and understand long events. That could unlock real applications in surveillance, medical diagnostics, and content analysis that require understanding what happened over extended periods.
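
To make the idea of several memory types concrete, here is a minimal Python sketch. It is an illustration only, not the system's actual design: the class names, the tier names (`visual`, `events`, `concepts`), and the keyword-based lookup are all assumptions made for this example. A real system would presumably store image features and use a learned retrieval step, which this summary does not describe.

```python
from dataclasses import dataclass, field

# Hypothetical tier names mirroring the three memory types described above.
TIERS = ("visual", "events", "concepts")


@dataclass
class MemoryEntry:
    timestamp_s: float  # seconds from the start of the video
    content: str        # stand-in for a frame reference, an event, or a concept


@dataclass
class MultiTierMemory:
    visual: list = field(default_factory=list)    # raw visual snapshots
    events: list = field(default_factory=list)    # factual events
    concepts: list = field(default_factory=list)  # high-level concepts

    def add(self, tier: str, timestamp_s: float, content: str) -> None:
        """Store one entry in the named tier."""
        if tier not in TIERS:
            raise ValueError(f"unknown tier: {tier}")
        getattr(self, tier).append(MemoryEntry(timestamp_s, content))

    def recall(self, keyword: str) -> dict:
        """Return entries from every tier whose text mentions the keyword.

        Plain substring matching is an assumed placeholder for retrieval.
        """
        key = keyword.lower()
        return {
            tier: [e for e in getattr(self, tier) if key in e.content.lower()]
            for tier in TIERS
        }


memory = MultiTierMemory()
memory.add("visual", 125.0, "frame_003750.jpg: red car at the gate")
memory.add("events", 125.0, "A red car arrives at the gate")
memory.add("concepts", 0.0, "The video shows a parking lot over one day")

# A question about the red car draws on both the snapshot and the event record.
hits = memory.recall("red car")
```

The point of the sketch is that a question can be answered from the detailed snapshot and the event record together, rather than from a single text summary that may have dropped the detail.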