Multi-agent AI models shrink 28-fold while keeping 90% of their performance: a first test of whether edge AI actually works

What happened

Researchers built a technique to compress multi-agent AI systems so they run on small devices instead of big servers, while preserving the coordination behavior the compressed agents learned from their full-size expert versions. In practice, this means swarms of robots or other autonomous systems could operate on embedded hardware without a constant cloud connection. It is the first real evidence that you don't lose the important parts when you squeeze the model down.

Why it matters

For years, deploying multi-agent AI systems in the field has been largely theoretical, because they need so much compute that they only run on servers. This paper shows you can shrink them 28-fold while keeping 90% of performance, which means edge deployment (actual robots on actual hardware with limited power) stops being aspirational and starts being engineering. The structural change is simple: you can now judge whether a system is worth deploying by what it costs to run, not by whether you can afford to run it at all.

The signal

Watch whether robotics labs and autonomous vehicle teams actually adopt this technique in the next 18 months, or whether the benchmarks fail to translate to messier real-world coordination problems where the 10% performance loss matters.