What happened
Researchers found a way to compress the image data that flows through vision-language AI models, cutting it to a third of its original size while keeping the same accuracy. This matters because these models are expensive to run: images reach the model as tokens, so using fewer of them means cheaper inference, faster responses, and lower power consumption on every query.